4 Instruction tables - Agner Fog
AMD K10
List of instruction timings and macro-operation breakdown
Explanation of column headings:
Instruction: Instruction name. cc means any condition code. For example, Jcc can be JB, JNE, etc.
Operands: i = immediate constant, r = any register, r32 = 32-bit register, etc., mm = 64-bit mmx register, xmm = 128-bit xmm register, sr = segment register, m = any memory operand including indirect operands, m64 = 64-bit memory operand, etc.
Ops: Number of macro-operations issued from the instruction decoder to the schedulers. Instructions with more than 2 macro-operations use microcode.
Latency: This is the delay that the instruction generates in a dependency chain. The numbers are minimum values. Cache misses, misalignment, and exceptions may increase the clock counts considerably. Floating point operands are presumed to be normal numbers. Denormal numbers, NaNs, infinity and exceptions increase the delays. The latency listed does not include the memory operand where the operand is listed as register or memory (r/m).
Reciprocal throughput: This is also called issue latency. This value indicates the average number of clock cycles from when the execution of an instruction begins until a subsequent independent instruction of the same kind can begin to execute. A value of 1/3 indicates that the execution units can handle 3 instructions per clock cycle in one thread. However, the throughput may be limited by other bottlenecks in the pipeline.
Execution unit: Indicates which execution unit is used for the macro-operations. ALU means any of the three integer ALUs. ALU0_1 means that ALU0 and ALU1 are both used. AGU means any of the three integer address generation units. FADD means floating point adder unit. FMUL means floating point multiplier unit. FMISC means floating point store and miscellaneous unit. FA/M means FADD or FMUL is used. FANY means any of the three floating point units can be used. Two macro-operations can execute simultaneously if they go to different execution units.
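The difference between latency and reciprocal throughput can be illustrated with a rough back-of-the-envelope model. The sketch below is not part of the original tables: the function names `chain_cycles` and `independent_cycles` are hypothetical, and the model ignores every other pipeline bottleneck (decoding, retirement, cache misses), so the results are only lower bounds.

```python
from fractions import Fraction


def chain_cycles(n, latency):
    """Lower bound for n instructions where each depends on the previous result."""
    return n * latency


def independent_cycles(n, reciprocal_throughput):
    """Lower bound for n independent instructions of the same kind."""
    return n * Fraction(reciprocal_throughput)


# MOV r,r on K10: latency 1, reciprocal throughput 1/3 (values from the table).
print(chain_cycles(300, 1))            # 300
print(independent_cycles(300, "1/3"))  # 100
```

In this simplified view, a dependency chain is bounded by the Latency column, while a stream of independent instructions is bounded by the Reciprocal throughput column.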
Integer instructions

Instruction  Operands     Ops  Latency  Reciprocal throughput  Execution unit  Notes
Move instructions
MOV          r,r          1    1        1/3                    ALU
MOV          r,i          1    1        1/3                    ALU
MOV          r8,m8        1    4        1/2                    ALU, AGU        Any addressing mode. Add 1 clock if code segment base ≠ 0
MOV          r16,m16      1    4        1/2                    ALU, AGU
MOV          r32,m32      1    3        1/2                    AGU
MOV          r64,m64      1    3        1/2                    AGU
MOV          m8,r8H       1    8        1/2                    AGU             AH, BH, CH, DH
MOV          m8,r8L       1    3        1/2                    AGU             Any other 8-bit register
MOV          m16/32/64,r  1    3        1/2                    AGU             Any addressing mode
MOV          m,i          1    3        1/2                    AGU
MOV          m64,i32      1    3        1/2                    AGU
MOV          r,sr         1    3-4      1/2
MOV          sr,r/m       6    8-26     8                                      from AMD manual
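
For use in scripts, rows of the table above can be transcribed into a small lookup structure. The following is a minimal sketch under stated assumptions: the names `Timing`, `parse_latency`, `K10_MOV` and `uses_microcode` are hypothetical and not from the original document, only a few rows are copied, and an empty string stands for an execution unit that the table leaves blank.

```python
from fractions import Fraction
from typing import NamedTuple, Tuple


class Timing(NamedTuple):
    ops: int                      # macro-operations
    latency: Tuple[int, int]      # (min, max) clock cycles
    recip_throughput: Fraction    # clock cycles per instruction
    unit: str                     # execution unit, "" if not listed


def parse_latency(text):
    """Turn '3' into (3, 3) and '8-26' into (8, 26)."""
    low, _, high = text.partition("-")
    return int(low), int(high or low)


# A few rows transcribed from the table; keys are (mnemonic, operands).
K10_MOV = {
    ("MOV", "r,r"):     Timing(1, parse_latency("1"), Fraction(1, 3), "ALU"),
    ("MOV", "r32,m32"): Timing(1, parse_latency("3"), Fraction(1, 2), "AGU"),
    ("MOV", "m8,r8H"):  Timing(1, parse_latency("8"), Fraction(1, 2), "AGU"),
    ("MOV", "sr,r/m"):  Timing(6, parse_latency("8-26"), Fraction(8), ""),
}


def uses_microcode(timing):
    """Per the column explanation, more than 2 macro-operations means microcode."""
    return timing.ops > 2


print(K10_MOV[("MOV", "sr,r/m")].latency)          # (8, 26)
print(uses_microcode(K10_MOV[("MOV", "sr,r/m")]))  # True
print(uses_microcode(K10_MOV[("MOV", "r,r")]))     # False
```

Storing the reciprocal throughput as a `Fraction` keeps values such as 1/3 exact, which avoids rounding drift when many instruction costs are summed.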