2H 2015

07.12.2015
Comparative Performance: BLAST* BLASTn v.30

Configurations:
• 2S Intel® Xeon® processor E5-2697 v2 (BLASTn v.30 baseline)
• 2S Xeon E5-2697 v2 + Intel® Xeon Phi coprocessor 7120A
• 2S Xeon E5-2697 v2 + Xeon Phi 7120A (OFS parallelized)
• 2S Intel® Xeon® processor E5-2697 v3 (BLASTn v.30 baseline)
• 2S Xeon E5-2697 v3 + Intel® Xeon Phi coprocessor 7120A
• 2S Xeon E5-2697 v3 + Xeon Phi 7120A (OFS parallelized)

“Xeon E5-2697 v2/v3” = Intel® Xeon® processor E5-2697 v2/v3
“Xeon Phi 7120A” = Intel® Xeon Phi coprocessor 7120A
For configuration details, go here.

BLASTn* v.30 Speed Up (bar labels shown on the chart): 1, 1.26X, 1.41X, 1.22X, 1.49X, 1.52X

SOURCE: INTEL MEASURED RESULTS AS OF MARCH, 2015

Application: Basic Local Alignment Search Tool (BLASTn) v.30.
Description: Searching for alignments of nucleotide query sequences against a known nucleotide db volume set. National Center for Biotechnology Information (NCBI*). More at http://blast.ncbi.nlm.nih.gov/.
Availability:
• Code: Available here.
• Recipe: Available here.
Usage Model: #4 (multiple queries, multiple db). 100 NCBI queries (concatenated) against db refseq_rna.00-02 are distributed between the Intel® Xeon® processor and the Intel® Xeon Phi coprocessor to find the sweet spot of maximum speedup. The experiment was repeated 20 times with the pick of queries randomized, with sweet spot splits of 80/20 and 59/23/18.
Highlights: Throughput for this load sharing model has a small sweet spot for a sufficiently large query set.
Results: Compared to the baseline, the speedup with the Intel® Xeon® processor E5-2697 v3 and Intel® Xeon Phi coprocessor 7120A heterogeneous model is 1.52X. Performance is also improved on the CPU due to Output Formatting Section (OFS) parallelization.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
*Other names and brands may be claimed as the property of others.

1 NODE. NEW. APPROVED FOR PUBLIC PRESENTATION.
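The usage model above distributes a randomized pick of 100 queries by fixed counts (80/20 and 59/23/18). A minimal Python sketch of that kind of partitioning follows. The function name `split_queries`, the placeholder query IDs, and the seed are assumptions for illustration; the source does not state which device receives which share, and this is not Intel's recipe.

```python
import random


def split_queries(queries, counts, seed=None):
    """Shuffle the queries, then cut them into consecutive groups of the given sizes."""
    if sum(counts) != len(queries):
        raise ValueError("counts must add up to the number of queries")
    rng = random.Random(seed)      # seeded so a run can be repeated
    shuffled = list(queries)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    groups, start = [], 0
    for n in counts:
        groups.append(shuffled[start:start + n])
        start += n
    return groups


# Hypothetical query IDs standing in for the 100 concatenated NCBI queries.
queries = [f"query_{i:03d}" for i in range(100)]

two_way = split_queries(queries, [80, 20], seed=1)
three_way = split_queries(queries, [59, 23, 18], seed=1)

print([len(g) for g in two_way])    # [80, 20]
print([len(g) for g in three_way])  # [59, 23, 18]
```

Repeating the call with a different seed on each of the 20 runs would mimic the randomized pick described in the usage model.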

Comparative Performance: BLAST* BLASTp v.30

Configurations:
• 2S Intel® Xeon® processor E5-2697 v2 (BLASTp v.30 baseline)
• 2S Xeon E5-2697 v2 + Intel® Xeon Phi coprocessor 7120A
• 2S Xeon E5-2697 v2 + Xeon Phi 7120A (OFS parallelized)
• 2S Intel® Xeon® processor E5-2697 v3 (BLASTp v.30 baseline)
• 2S Xeon E5-2697 v3 + Intel® Xeon Phi coprocessor 7120A
• 2S Xeon E5-2697 v3 + Xeon Phi 7120A (OFS parallelized)

For configuration details, go here.

BLASTp* v.30 Speed Up (bar labels shown on the chart): 1, 1.15X, 1.3X, 1.21X, 1.39X, 1.41X

SOURCE: INTEL MEASURED RESULTS AS OF MARCH, 2015

Application: Basic Local Alignment Search Tool (BLASTp) v.30.
Description: Searching for alignments of protein query sequences against a known protein db volume set. More at http://blast.ncbi.nlm.nih.gov/.
Availability:
• Code: Available here.
• Recipe: Available here.
Usage Model: #4 (multiple queries, multiple db). 40 NCBI queries (concatenated) against db nr_sorted.00-02 are distributed between the Intel® Xeon® processor and the Intel® Xeon Phi coprocessor to find the sweet spot of maximum speedup. The experiment was repeated 20 times with the pick of queries randomized, with sweet spot splits of 33/7 and 28/5/7.
Highlights: Throughput for this offload model has a small sweet spot for a sufficiently large query set. Throughput is limited because the GAT stage is not parallelized.
Results: Compared to the baseline, the speedup with the Intel® Xeon® processor E5-2697 v3 and Intel® Xeon Phi coprocessor 7120A heterogeneous model is 1.41X. Performance is also improved on the CPU due to Output Formatting Section (OFS) parallelization.

1 NODE. NEW.
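The highlight above notes that throughput is limited because one stage is not parallelized. Amdahl's law is the standard way to reason about such a limit; the sketch below applies it with made-up fractions. The 60% parallelizable share and the 2x factor are hypothetical values chosen for illustration, not figures measured or reported by Intel.

```python
def amdahl_speedup(parallel_fraction, parallel_speedup):
    """Overall speedup when only part of the runtime is accelerated (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


# Hypothetical workload: 60% of the runtime can be parallelized, 40% cannot.
print(round(amdahl_speedup(0.6, 2.0), 2))   # 1.43

# Even with an arbitrarily fast parallel part, the serial stage caps the gain
# at 1 / serial_fraction, which is 2.5 for this hypothetical split.
print(round(amdahl_speedup(0.6, 1e9), 2))   # 2.5
```

The second call shows why a stage that stays serial bounds the achievable speedup regardless of how much hardware is added to the rest.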



intel-xeon-phi-sw-ecosystem-guide-2h-2015-public3
