4 Instruction tables - Agner Fog
Pentium M<br />
Intel Pentium M, Core Solo and Core Duo<br />
List of instruction timings and μop breakdown<br />
Explanation of column headings:<br />
Operands:<br />
i = immediate data, r = register, mm = 64 bit mmx register, xmm =<br />
128 bit xmm register, sr = segment register, m = memory, m32 =<br />
32-bit memory operand, etc.<br />
μops fused domain: The number of μops at the decode, rename, allocate and retirement<br />
stages in the pipeline. Fused μops count as one.<br />
μops unfused domain: The number of μops for each execution port. Fused μops count as<br />
two.<br />
p0: Port 0: ALU, etc.<br />
p1: Port 1: ALU, jumps<br />
p01: Instructions that can go to either port 0 or 1, whichever is vacant<br />
first.<br />
p2: Port 2: load data, etc.<br />
p3: Port 3: address generation for store<br />
p4: Port 4: store data<br />
Latency:<br />
This is the delay that the instruction generates in a dependency<br />
chain. (This is not the same as the time spent in the execution<br />
unit. Values may be inaccurate in situations where they cannot be<br />
measured exactly, especially with memory operands.) The numbers<br />
are minimum values. Cache misses, misalignment, and exceptions<br />
may increase the clock counts considerably. Floating<br />
point operands are presumed to be normal numbers. Denormal<br />
numbers, NANs and infinity increase the delays by 50-150 clocks,<br />
except in XMM move, shuffle and Boolean instructions. Floating<br />
point overflow, underflow, denormal or NAN results give a similar<br />
delay.<br />
Reciprocal throughput:
The average number of clock cycles per instruction for a series of
independent instructions of the same kind.
Integer instructions
Instruction  Operands  μops fused domain  μops unfused domain (p0 p1 p01 p2 p3 p4)  Latency  Reciprocal throughput
Move instructions
MOV r,r/i 1 1 0.5<br />
MOV r,m 1 1 1<br />
MOV m,r 1 1 1 1<br />
MOV m,i 2 1 1 1<br />
MOV r,sr 1 1<br />
MOV m,sr 2 1 1 1<br />
MOV sr,r 8 8 5<br />
MOV sr,m 8 7 1 8<br />
MOVNTI m,r32 2 1 1 2<br />
MOVSX MOVZX r,r 1 1 1 0.5<br />
MOVSX MOVZX r,m 1 1 1<br />
CMOVcc r,r 2 1 1 2 1.5<br />
CMOVcc r,m 2 1 1 1<br />
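The column definitions above can be turned into rough cycle estimates. The following is a minimal sketch in Python, not part of the original tables. The function names are hypothetical. It also assumes a particular reading of the flattened rows: that `MOV m,r` uses one μop each on p3 and p4, and that the last two values of the `CMOVcc r,r` row are latency 2 and reciprocal throughput 1.5. Since the table lists minimum values, the results are lower bounds at best.

```python
def unfused_uops(port_uops):
    """Unfused-domain count: the sum of the uops listed per execution port."""
    return sum(port_uops.values())


def chain_cycles(latency, n):
    """Lower bound for n instructions where each depends on the previous one.

    In a dependency chain every instruction waits for its predecessor,
    so the latencies add up.
    """
    return n * latency


def independent_cycles(recip_throughput, n):
    """Lower bound for n independent instructions of the same kind.

    Reciprocal throughput is the average number of clock cycles per
    instruction when no instruction waits for another.
    """
    return n * recip_throughput


# Assumed reading of the "MOV m,r" row: one fused uop that splits into
# a store-address uop (p3) and a store-data uop (p4).
mov_store_ports = {"p3": 1, "p4": 1}
print(unfused_uops(mov_store_ports))

# Assumed reading of the "CMOVcc r,r" row: latency 2, reciprocal throughput 1.5.
print(chain_cycles(2, 100))
print(independent_cycles(1.5, 100))
```

Under these assumptions, a store counts as one μop in the fused domain and two in the unfused domain. A chain of 100 dependent `CMOVcc r,r` instructions is bounded by latency, while 100 independent ones are bounded by reciprocal throughput.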