Instruction Throughput - GPU Technology Conference
Summary
• Determining what limits your kernel most:
  • Arithmetic, memory bandwidth (Oct 11th), latency, register spilling (Oct 4th)
• Address the bottlenecks in the order of importance
  • Analyze for inefficient use of hardware
  • Estimate the impact on overall performance
  • Optimize to use hardware most efficiently
• More resources:
  • Profiler documentation! (Especially on derived profiler values)
  • Prior CUDA tutorials at Supercomputing
    - http://gpgpu.org/{sc2007,sc2008,sc2009,sc2010}
  • GTC2010 talks: http://www.nvidia.com/gtc2010-content
  • CUDA Programming Guide, CUDA Best Practices Guide
© NVIDIA Corporation 2011
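
The first two summary points (find the limiter, then estimate the payoff before optimizing) can be sketched as simple arithmetic. The following is a minimal illustration in Python, not the method from the slides: the names `KernelCounts`, `classify_limiter` and `estimated_speedup`, the 25% margin, and the hardware balance figure of 4.0 instructions per byte are all hypothetical placeholders. The speedup estimate uses an Amdahl's-law-style model, chosen here for illustration. Real instruction and byte counts would come from profiler counters, and the balance figure from the specifications of the GPU in question.

```python
from dataclasses import dataclass


@dataclass
class KernelCounts:
    """Per-launch totals for one kernel (hypothetical container, illustrative values)."""
    instructions: float    # instructions issued, in the same units as the hardware figure
    bytes_accessed: float  # bytes moved to and from global memory


def classify_limiter(kernel, hw_instr_per_byte, margin=0.25):
    """Compare the kernel's instruction:byte ratio with the hardware balance point.

    A ratio well above the hardware figure suggests an instruction-limited kernel,
    a ratio well below it suggests a bandwidth-limited one. The margin is an
    arbitrary assumption that leaves a "balanced" band around the balance point.
    """
    ratio = kernel.instructions / kernel.bytes_accessed
    if ratio > hw_instr_per_byte * (1.0 + margin):
        return "instruction"
    if ratio < hw_instr_per_byte * (1.0 - margin):
        return "bandwidth"
    return "balanced"


def estimated_speedup(fraction_of_runtime, local_speedup):
    """Amdahl's-law-style estimate of the overall gain from fixing one bottleneck.

    fraction_of_runtime: share of total time attributed to the bottleneck (0..1).
    local_speedup: how much faster that share is expected to become.
    """
    return 1.0 / ((1.0 - fraction_of_runtime) + fraction_of_runtime / local_speedup)


# Hypothetical hardware balance point; substitute the figure for the actual GPU.
HW_INSTR_PER_BYTE = 4.0

kernel = KernelCounts(instructions=8.0e9, bytes_accessed=1.0e9)
print(classify_limiter(kernel, HW_INSTR_PER_BYTE))
print(round(estimated_speedup(fraction_of_runtime=0.5, local_speedup=2.0), 3))
```

A classification like this is only a first pass. Latency-limited kernels can sit below both limits, so the result is best cross-checked against profiler measurements before any optimization work is prioritized.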