03.03.2013 Views

4 Instruction tables - Agner Fog


Prescott

RDPMC (bit 31 = 1) 1 37 100 p5

RDPMC (bit 31 = 0) 4 154 240 p5

MONITOR (sse3)

MWAIT (sse3)

Notes:

a) Add 1 μop if source is a memory operand.

b) Uses an extra μop (port 3) if SIB byte used.

c) Add 1 μop if source or destination, but not both, is a high 8-bit register (AH, BH, CH, DH).

d) Has (false) dependence on the flags in most cases.

e) Not available on PMMX

l) Move accumulator to/from memory with 64 bit absolute address (opcode A0 - A3).

m) Not available in 64 bit mode.

n) Not available in 64 bit mode on some processors.

o) MOVSX uses an extra μop if the destination register is smaller than the biggest register size available. Use a 32 bit destination register in 16 bit and 32 bit mode, and a 64 bit destination register in 64 bit mode for optimal performance.

p) LEA with a direct memory operand has 1 μop and a reciprocal throughput of 0.25. This also applies if there is a RIP-relative address in 64-bit mode. A sign-extended 32-bit direct memory operand in 64-bit mode without RIP-relative address takes 2 μops because of the SIB byte. The throughput is 1 in this case. You may use a MOV instead.

q) These values are measured in 32-bit mode. In 16-bit real mode there is 1 microcode μop and a reciprocal throughput of 17.

Floating point x87 instructions

Columns: Instruction, Operands, μops, Microcode, Latency, Additional latency, Reciprocal throughput, Port, Execution unit, Subunit, Instruction set, Notes

Move instructions

FLD r 1 0 7 0 1 0 mov 87
FLD m32/64 1 0 0 1 2 load 87
FLD m80 3 3 8 2 load 87
FBLD m80 3 74 90 2 load 87
FST(P) r 1 0 7 0 1 0 mov 87
FST(P) m32/64 2 0 7 2 0 store 87
FSTP m80 3 6 10 0 store 87
FBSTP m80 3 311 400 0 store 87
FXCH r 1 0 0 0 1 0 mov 87
FILD m16 3 2 8 2 load 87
FILD m32/64 2 0 2 2 load 87
FIST(P) m 3 0 2.5 0 store 87
FISTTP m 3 0 2.5 0 store sse3
FLDZ 1 0 2 0 mov 87
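As a rough illustration of how the latency and reciprocal throughput columns differ, the following Python sketch estimates the cycle count for a run of identical instructions, using the FLD r row (latency 7, reciprocal throughput 1). The function name `estimate_cycles` and the model itself are assumptions made for illustration. The model ignores additional latency, port contention and every other bottleneck, so it is a back-of-envelope sketch and not a prediction.

```python
# Values copied from the "FLD r" row of the table.
FLD_R_LATENCY = 7
FLD_R_RECIPROCAL_THROUGHPUT = 1


def estimate_cycles(count, latency, reciprocal_throughput, dependent):
    """Rough cycle estimate for `count` back-to-back copies of one instruction.

    Hypothetical helper: it assumes nothing else limits execution.
    """
    if count <= 0:
        return 0
    if dependent:
        # Each instruction must wait for the result of the previous one,
        # so the chain is limited by latency.
        return count * latency
    # Independent instructions can start every `reciprocal_throughput`
    # cycles; the last one still needs `latency` cycles to finish.
    return (count - 1) * reciprocal_throughput + latency


if __name__ == "__main__":
    chain = estimate_cycles(100, FLD_R_LATENCY, FLD_R_RECIPROCAL_THROUGHPUT, True)
    parallel = estimate_cycles(100, FLD_R_LATENCY, FLD_R_RECIPROCAL_THROUGHPUT, False)
    print("dependent chain:", chain)
    print("independent:", parallel)
```

Under this simplified model, 100 dependent FLD r instructions take about 700 cycles, while 100 independent ones take about 106.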
