10.02.2013

Instruction Throughput - Nvidia


Simplified View of Latency and Syncs

© NVIDIA Corporation 2011

Kernel where most math cannot be executed until all data is loaded by the threadblock.

[Figure: timing diagram, horizontal axis: time. Bars shown: Memory-only time; Math-only time; Full-kernel time, one large threadblock per SM; Full-kernel time, two threadblocks per SM (each half the size of one large one).]
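The comparison in the diagram can be illustrated with a toy timing model. This is only a sketch under assumptions that are not stated on the slide: the memory system serves one threadblock's loads at a time, math in a block starts only after that block's loads finish (the sync), and the math pipeline runs one block at a time. The function names and the sample numbers are hypothetical and chosen purely for illustration.

```python
def one_large_block(mem_time, math_time):
    """Full-kernel time with one large threadblock per SM.

    Assumption: the block-wide sync serializes the two phases, so no math
    overlaps with any load.
    """
    return mem_time + math_time


def two_half_blocks(mem_time, math_time):
    """Full-kernel time with two threadblocks per SM, each half the size.

    Assumptions (toy model): block 0's loads are served first, then
    block 1's; each block's math waits only for its own loads.
    """
    half_mem = mem_time / 2
    half_math = math_time / 2

    load_done_0 = half_mem              # block 0 has its data
    load_done_1 = mem_time              # block 1's loads finish after block 0's
    math_done_0 = load_done_0 + half_math

    # Block 1's math needs its own data and a free math pipeline.
    math_start_1 = max(load_done_1, math_done_0)
    return math_start_1 + half_math


if __name__ == "__main__":
    mem_only, math_only = 10.0, 10.0    # arbitrary time units, illustrative only
    print("one large block :", one_large_block(mem_only, math_only))
    print("two half blocks :", two_half_blocks(mem_only, math_only))
```

In this model the two-block case is shorter because block 0's math overlaps block 1's loads, while the single large block cannot overlap anything across its sync. Real hardware interleaves memory requests and instructions at a much finer grain, so the numbers show the trend only.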
