4 Instruction tables - Agner Fog
Prescott<br />
RDPMC (bit 31 = 1) 1 37 100 p5<br />
RDPMC (bit 31 = 0) 4 154 240 p5<br />
MONITOR (sse3)<br />
MWAIT (sse3)<br />
Notes:<br />
a) Add 1 μop if source is a memory operand.<br />
b) Uses an extra μop (port 3) if SIB byte used.<br />
c) Add 1 μop if source or destination, but not both, is a high 8-bit register (AH, BH, CH, DH).<br />
d) Has (false) dependence on the flags in most cases.<br />
e) Not available on PMMX<br />
l) Move accumulator to/from memory with 64 bit absolute address (opcode A0 - A3).<br />
m) Not available in 64 bit mode.<br />
n) Not available in 64 bit mode on some processors.<br />
o) MOVSX uses an extra μop if the destination register is smaller than the biggest register size available. Use a 32 bit destination register in 16 bit and 32 bit mode, and a 64 bit destination register in 64 bit mode for optimal performance.<br />
p) LEA with a direct memory operand has 1 μop and a reciprocal throughput of 0.25. This also applies if there is a RIP-relative address in 64-bit mode. A sign-extended 32-bit direct memory operand in 64-bit mode without RIP-relative address takes 2 μops because of the SIB byte. The throughput is 1 in this case. You may use a MOV instead.<br />
q) These values are measured in 32-bit mode. In 16-bit real mode there is 1 microcode μop and a reciprocal throughput of 17.<br />
Floating point x87 instructions
Instruction, Operands, μops, Microcode, Latency, Additional latency, Reciprocal throughput, Port, Execution unit, Subunit, Instruction set, Notes
Move instructions
FLD r 1 0 7 0 1 0 mov 87
FLD m32/64 1 0 0 1 2 load 87
FLD m80 3 3 8 2 load 87
FBLD m80 3 74 90 2 load 87
FST(P) r 1 0 7 0 1 0 mov 87
FST(P) m32/64 2 0 7 2 0 store 87
FSTP m80 3 6 10 0 store 87
FBSTP m80 3 311 400 0 store 87
FXCH r 1 0 0 0 1 0 mov 87
FILD m16 3 2 8 2 load 87
FILD m32/64 2 0 2 2 load 87
FIST(P) m 3 0 2.5 0 store 87
FISTTP m 3 0 2.5 0 store sse3
FLDZ 1 0 2 0 mov 87
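The Latency and Reciprocal throughput columns answer different questions: the first matters when each instruction waits for the result of the previous one, the second when the instructions are independent. The following is a minimal sketch of that distinction, assuming a deliberately simplified model that ignores every other bottleneck (decoding, port contention, memory). The names `Timing`, `dependent_chain_cycles` and `independent_stream_cycles` are hypothetical; the numbers are copied from the FLD r and FXCH r rows, which list a value for every numeric column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Two columns of one table row (hypothetical helper type)."""
    latency: float            # "Latency" column, in clock cycles
    recip_throughput: float   # "Reciprocal throughput" column, cycles per instruction


# Values taken from the table rows "FLD r" and "FXCH r".
FLD_R = Timing(latency=7, recip_throughput=1)
FXCH_R = Timing(latency=0, recip_throughput=1)


def dependent_chain_cycles(t: Timing, n: int) -> float:
    """Rough estimate for n instructions where each needs the previous result.

    Assumption: the chain is limited only by latency.
    """
    return n * t.latency


def independent_stream_cycles(t: Timing, n: int) -> float:
    """Rough estimate for n instructions with no dependences between them.

    Assumption: the stream is limited only by reciprocal throughput.
    """
    return n * t.recip_throughput


if __name__ == "__main__":
    n = 100
    print("FLD r, dependent chain:    ", dependent_chain_cycles(FLD_R, n))
    print("FLD r, independent stream: ", independent_stream_cycles(FLD_R, n))
    print("FXCH r, dependent chain:   ", dependent_chain_cycles(FXCH_R, n))
```

Under these assumptions a long dependent chain of FLD r is bounded by the latency of 7, while independent copies are bounded by the reciprocal throughput of 1. FXCH r, with a listed latency of 0, adds nothing to a dependency chain in this model.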