03.03.2013 Views

4 Instruction tables - Agner Fog


Pentium M

Intel Pentium M, Core Solo and Core Duo

List of instruction timings and μop breakdown

Explanation of column headings:

Operands: i = immediate data, r = register, mm = 64-bit mmx register, xmm = 128-bit xmm register, sr = segment register, m = memory, m32 = 32-bit memory operand, etc.

μops fused domain: The number of μops at the decode, rename, allocate and retirement stages in the pipeline. Fused μops count as one.

μops unfused domain: The number of μops for each execution port. Fused μops count as two.
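The relation between the two counts can be sketched in Python. The per-port counts below are an assumed example of a store to memory that issues one μop on port 3 and one on port 4, fused into a single μop in the front end. The function and variable names are hypothetical and serve only as illustration.

```python
def unfused_uops(port_counts):
    """Total μops in the unfused domain: the sum over all execution ports."""
    return sum(port_counts.values())

# Assumed example: a store using port 3 (store address) and port 4 (store data).
# In the fused domain the pair counts as one μop; in the unfused domain as two.
store = {"fused": 1, "ports": {"p3": 1, "p4": 1}}

unfused = unfused_uops(store["ports"])
print("fused domain:", store["fused"], "unfused domain:", unfused)
```

The fused count is what limits the decode, rename and retirement stages, while the unfused count shows how much work reaches the execution ports.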

p0: Port 0: ALU, etc.

p1: Port 1: ALU, jumps

p01: Instructions that can go to either port 0 or 1, whichever is vacant first.

p2: Port 2: load data, etc.

p3: Port 3: address generation for store

p4: Port 4: store data
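As a rough illustration of how p01 μops share ports 0 and 1, the following Python sketch places the port-specific μops first and then sends each p01 μop to the port with less work queued. This is a simplified static model under stated assumptions, not a description of the actual scheduler, which decides cycle by cycle. The function name and the input list are hypothetical.

```python
def assign_ports(uops):
    """Distribute μops over ports 0 and 1.

    Each entry is "p0", "p1" or "p01". Port-specific μops are counted first;
    every "p01" μop then goes to whichever port has less work queued.
    """
    load = {"p0": 0, "p1": 0}
    for u in uops:
        if u in load:
            load[u] += 1
    for u in uops:
        if u == "p01":
            target = min(load, key=load.get)  # on a tie, port 0 is chosen
            load[target] += 1
    return load

load = assign_ports(["p0", "p0", "p01", "p01", "p1", "p01"])
# With one μop per port per clock, the busier port bounds the clock count.
min_clocks = max(load.values())
print(load, min_clocks)
```

Under this model a mix that leans heavily on one specific port gains little from p01 μops, because the flexible μops can only fill the other port.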

Latency: This is the delay that the instruction generates in a dependency chain. (This is not the same as the time spent in the execution unit. Values may be inaccurate in situations where they cannot be measured exactly, especially with memory operands.) The numbers are minimum values. Cache misses, misalignment, and exceptions may increase the clock counts considerably. Floating point operands are presumed to be normal numbers. Denormal numbers, NANs and infinity increase the delays by 50-150 clocks, except in XMM move, shuffle and Boolean instructions. Floating point overflow, underflow, denormal or NAN results give a similar delay.
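A minimal Python sketch of how latencies add up along a dependency chain follows. The latency values are hypothetical and not taken from the table; the penalty range is the 50-150 clocks stated above for a denormal, NAN or infinity operand.

```python
def chain_latency(latencies, penalty=0):
    """Minimum clocks for a chain in which each instruction waits for the previous one."""
    return sum(latencies) + penalty

# Hypothetical latencies (in clocks) for a four-instruction dependency chain.
chain = [1, 2, 1, 3]

best_case = chain_latency(chain)
# The same chain with a single penalized operand somewhere along it.
slow_low = chain_latency(chain, penalty=50)
slow_high = chain_latency(chain, penalty=150)
print(best_case, slow_low, slow_high)
```

Because the listed latencies are minimum values, the result is a lower bound. Cache misses and misalignment would add to it.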

Reciprocal throughput: The average number of clock cycles per instruction for a series of independent instructions of the same kind.

Integer instructions

Instruction  Operands  μops fused domain  μops unfused domain (p0 p1 p01 p2 p3 p4)  Latency  Reciprocal throughput

Move instructions

MOV r,r/i 1 1 0.5
MOV r,m 1 1 1
MOV m,r 1 1 1 1
MOV m,i 2 1 1 1
MOV r,sr 1 1
MOV m,sr 2 1 1 1
MOV sr,r 8 8 5
MOV sr,m 8 7 1 8
MOVNTI m,r32 2 1 1 2
MOVSX MOVZX r,r 1 1 1 0.5
MOVSX MOVZX r,m 1 1 1
CMOVcc r,r 2 1 1 2 1.5
CMOVcc r,m 2 1 1 1
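The difference between latency and reciprocal throughput can be sketched in Python. The two value pairs below assume that the last two numbers in the MOVSX/MOVZX r,r and CMOVcc r,r rows are latency and reciprocal throughput respectively. That reading is an assumption, and the function names are hypothetical.

```python
def dependent_clocks(n, latency):
    """Each instruction needs the previous result: latencies add up."""
    return n * latency

def independent_clocks(n, reciprocal_throughput):
    """No instruction depends on another: throughput sets the pace."""
    return n * reciprocal_throughput

n = 100
# Assumed reading of the table rows: (latency, reciprocal throughput).
movzx_rr = (1, 0.5)
cmov_rr = (2, 1.5)

movzx_chain = dependent_clocks(n, movzx_rr[0])
movzx_indep = independent_clocks(n, movzx_rr[1])
cmov_chain = dependent_clocks(n, cmov_rr[0])
cmov_indep = independent_clocks(n, cmov_rr[1])
print(movzx_chain, movzx_indep, cmov_chain, cmov_indep)
```

Both figures are lower bounds for a run of identical instructions. Real code mixes instruction kinds, so port contention and the front end can add further clocks.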
