4 Instruction tables - Agner Fog
Nehalem<br />
Latency: This is the delay that the instruction generates in a dependency chain. The numbers are minimum values. Cache misses, misalignment, and exceptions may increase the clock counts considerably. Floating point operands are presumed to be normal numbers. Denormal numbers, NaNs, and infinity increase the delays substantially, except in XMM move, shuffle, and Boolean instructions. Floating point overflow, underflow, denormal, or NaN results give a similar delay. The time unit used is core clock cycles, not the reference clock cycles given by the time stamp counter.<br />
Reciprocal throughput: The average number of core clock cycles per instruction for a series of independent instructions of the same kind in the same thread.<br />
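The difference between the two measures can be illustrated with a small model. The following is a minimal sketch, not a measurement tool: the dictionary `NEHALEM` and the helper functions `chain_cycles`, `independent_cycles`, and `tsc_to_core_cycles` are hypothetical names introduced here for illustration. The figures are copied from the MOV, CMOVcc, and XCHG rows of the table in this section. The results are lower bounds under the assumptions stated in the definitions above (no cache misses, no exceptions, nothing else competing for the same execution ports).

```python
# Illustrative model only. Values are (latency, reciprocal throughput)
# in core clock cycles, copied from the Nehalem integer table rows.
NEHALEM = {
    "MOV r,r/i": (1, 0.33),
    "CMOVcc r,r": (2, 1),
    "XCHG r,r": (2, 2),
}


def chain_cycles(instr, n):
    """Lower bound for n instructions where each depends on the previous one.

    In a dependency chain the latency is what limits progress.
    """
    latency, _ = NEHALEM[instr]
    return n * latency


def independent_cycles(instr, n):
    """Lower bound for n mutually independent instructions of the same kind.

    Without dependencies the reciprocal throughput is what limits progress.
    """
    _, reciprocal_throughput = NEHALEM[instr]
    return n * reciprocal_throughput


def tsc_to_core_cycles(tsc_ticks, core_hz, tsc_hz):
    """Convert time stamp counter ticks to core clock cycles.

    Assumes (hypothetically) that both frequencies are known and stayed
    constant for the whole measurement.
    """
    return tsc_ticks * core_hz / tsc_hz
```

For example, under this model 100 dependent `CMOVcc r,r` instructions take at least 200 core clock cycles, while 100 independent ones take at least 100. The conversion helper reflects the note that the time stamp counter counts reference clock cycles, so a raw counter difference has to be rescaled before it is compared with the table values.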
Integer instructions<br />
<strong>Instruction</strong> | Operands | μops fused domain | μops unfused domain: p015, p0, p1, p5, p2, p3, p4 | Domain | Latency | Reciprocal throughput<br />
Move instructions<br />
MOV r,r/i 1 1 x x x int 1 0.33<br />
MOV a) r,m 1 1 int 2 1<br />
MOV a) m,r 1 1 1 int 3 1<br />
MOV m,i 1 1 1 int 3 1<br />
MOV r,sr 1 1 int 1<br />
MOV m,sr 2 1 1 1 int 1<br />
MOV sr,r 6 3 x x x 3 int 13<br />
MOV sr,m 6 2 x x 4 int 14<br />
MOVNTI m,r 2 1 1 int ~270 1<br />
MOVSX MOVZX MOVSXD r,r 1 1 x x x int 1 0.33<br />
MOVSX MOVZX MOVSXD r,m 1 1 int 1<br />
CMOVcc r,r 2 2 x x x int 2 1<br />
CMOVcc r,m 2 2 x x x 1 int<br />
XCHG r,r 3 3 x x x int 2 2<br />
XCHG r,m 7 1 1 1 int 20 b)<br />
XLAT 2 1 1 int 5 1<br />
PUSH r 1 1 1 int 3 1<br />
PUSH i 1 1 1 int 1<br />
PUSH m 2 1 1 1 int 1<br />
PUSH sr 2 1 1 1 int 1<br />
PUSHF(D/Q) 3 2 x x x 1 1 int 1<br />
PUSHA(D) i) 18 2 x 1 x 8 8 int 8<br />
POP r 1 1 int 2 1<br />
POP (E/R)SP 3 2 x 1 x 1 int 5<br />
POP m 2 1 1 1 int 1<br />
POP sr 7 2 5 int 15<br />
POPF(D/Q) 8 7 x x x 1 int 14<br />
POPA(D) i) 10 2 8 int 8<br />
LAHF SAHF 1 1 x x x int 1 0.33<br />
SALC i) 2 2 x x x int 4 1<br />
LEA a) r,m 1 1 1 int 1 1<br />
BSWAP r32 1 1 1 int 1 1<br />