4 Instruction tables - Agner Fog

AMD K7

List of instruction timings and macro-operation breakdown

Explanation of column headings:

Instruction:
Instruction name. cc means any condition code. For example, Jcc can be JB, JNE, etc.

Operands:
i = immediate constant, r = any register, r32 = 32-bit register, etc., mm = 64-bit mmx register, xmm = 128-bit xmm register, sr = segment register, m = any memory operand including indirect operands, m64 means 64-bit memory operand, etc.

Ops:
Number of macro-operations issued from the instruction decoder to the schedulers. Instructions with more than 2 macro-operations use microcode.

Latency:
This is the delay that the instruction generates in a dependency chain. The numbers are minimum values. Cache misses, misalignment, and exceptions may increase the clock counts considerably. Floating point operands are presumed to be normal numbers. Denormal numbers, NaNs, infinity and exceptions increase the delays. The latency listed does not include the memory operand where the operand is listed as register or memory (r/m).

Reciprocal throughput:
This is also called issue latency. This value indicates the average number of clock cycles from when the execution of an instruction begins until a subsequent independent instruction of the same kind can begin to execute. A value of 1/3 indicates that the execution units can handle 3 instructions per clock cycle in one thread. However, the throughput may be limited by other bottlenecks in the pipeline.

Execution unit:
Indicates which execution unit is used for the macro-operations. ALU means any of the three integer ALUs. ALU0_1 means that ALU0 and ALU1 are both used. AGU means any of the three integer address generation units. FADD means floating point adder unit. FMUL means floating point multiplier unit. FMISC means floating point store and miscellaneous unit. FA/M means FADD or FMUL is used. FANY means any of the three floating point units can be used. Two macro-operations can execute simultaneously if they go to different execution units.
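
The difference between latency and reciprocal throughput can be made concrete with a small calculation. The sketch below is an illustration only: the function names `chain_cycles` and `independent_cycles` are hypothetical, and the model ignores every other pipeline bottleneck (decoding, retirement, cache misses), so the results are lower bounds at best.

```python
from fractions import Fraction


def chain_cycles(n, latency):
    """Lower bound in clock cycles for n instructions in a dependency chain,
    where each instruction needs the result of the previous one."""
    return n * latency


def independent_cycles(n, reciprocal_throughput):
    """Lower bound in clock cycles for n independent instructions of the
    same kind, limited only by the reciprocal throughput."""
    return n * Fraction(reciprocal_throughput)


# Example values: latency 1 and reciprocal throughput 1/3,
# as listed for MOV r,r in the table of move instructions.
print(chain_cycles(300, 1))            # dependent chain: 300
print(independent_cycles(300, "1/3"))  # independent instructions: 100
```

Under these assumptions, the same 300 instructions take at least three times as long when each one depends on the previous result, which is why breaking up long dependency chains can matter more than the raw instruction count.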

Integer instructions

Instruction  Operands  Ops  Latency  Reciprocal throughput  Execution unit  Notes

Move instructions
MOV          r,r       1    1        1/3                    ALU
MOV          r,i       1    1        1/3                    ALU
MOV          r8,m8     1    4        1/2                    ALU, AGU        Any addr. mode. Add 1 clk if code segment base ≠ 0
MOV          r16,m16   1    4        1/2                    ALU, AGU        do.
MOV          r32,m32   1    3        1/2                    AGU             do.
MOV          m8,r8H    1    8        1/2                    AGU             AH, BH, CH, DH
MOV          m8,r8L    1    2        1/2                    AGU             Any other 8-bit register
MOV          m16/32,r  1    2        1/2                    AGU             Any addressing mode
MOV          m,i       1    2        1/2                    AGU
MOV          r,sr      1    2        1
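
For scripted estimates, the move-instruction rows above can be transcribed into a lookup table. The following is a minimal sketch, not part of the original tables: the names `Timing`, `MOV_TIMINGS` and `chain_latency` are hypothetical, the values are copied from the rows above, and summing latencies is only a crude minimum for a dependency chain (the notes column and all other pipeline effects are ignored).

```python
from fractions import Fraction
from typing import NamedTuple, Optional


class Timing(NamedTuple):
    ops: int                   # macro-operations
    latency: int               # clock cycles in a dependency chain
    recip_throughput: Fraction
    unit: Optional[str]        # None where the table lists no unit


# Keys are the operand forms of MOV exactly as written in the table.
MOV_TIMINGS = {
    "r,r":      Timing(1, 1, Fraction(1, 3), "ALU"),
    "r,i":      Timing(1, 1, Fraction(1, 3), "ALU"),
    "r8,m8":    Timing(1, 4, Fraction(1, 2), "ALU, AGU"),
    "r16,m16":  Timing(1, 4, Fraction(1, 2), "ALU, AGU"),
    "r32,m32":  Timing(1, 3, Fraction(1, 2), "AGU"),
    "m8,r8H":   Timing(1, 8, Fraction(1, 2), "AGU"),
    "m8,r8L":   Timing(1, 2, Fraction(1, 2), "AGU"),
    "m16/32,r": Timing(1, 2, Fraction(1, 2), "AGU"),
    "m,i":      Timing(1, 2, Fraction(1, 2), "AGU"),
    "r,sr":     Timing(1, 2, Fraction(1), None),
}


def chain_latency(operand_forms):
    """Sum the listed latencies for a sequence of dependent MOV instructions."""
    return sum(MOV_TIMINGS[form].latency for form in operand_forms)


# Load a 32-bit value, copy it to another register, store it again.
print(chain_latency(["r32,m32", "r,r", "m16/32,r"]))
```

Storing the reciprocal throughput as a `Fraction` keeps values such as 1/3 exact, so sums over many instructions do not accumulate rounding error.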
