Paging

Paging Paging

from support.dce.felk.cvut.cz More from this publisher

12.07.2015 Views

Microcomputers:x86 Architecture BasicsDr. Konstantin Levit-Gurevich,Intel IsraelSpring Semester 2006Dr. K. Levit-Gurevich,May 31, 2006, p-1MicrocomputersKonstantin Levit-Gurevich 1

Special Project●x86 Simulator♦ Development of a simple CPU simulator♦ Main features, easy extensible♦ To be used by the students in Microcomputers course for exercises andhomework.♦ 1-2 people●Bonus 1: exemption from the final exam and SW exercises●Extension - HW part♦ Cache Policy Monitoring♦ Small “Device Manager”Examing PCI devices existing in the real system●Bonus 2: exemption from the final exam and all exercisesDr. K. Levit-Gurevich,May 31, 2006, p-5MicrocomputersKonstantin Levit-Gurevich 5

Objectives●Get acquainted with the X86 architecture and system components♦ Historical perspective♦ Major features♦ SW/HW aspects●What you will not get here♦ Expertise in X86 programming♦ Full details of everything●Expect♦ Varying level of details♦ Hopefully: taste of moreDr. K. Levit-Gurevich,May 31, 2006, p-6MicrocomputersKonstantin Levit-Gurevich 6

X86 EvolutionIntel40044 bit197532-bit registers 32-Intel 8086 bit bus32-bitaddressing Paging16-bit registers Mechanism (4GB16-bit bus 20- space)bit addressing(1MB space) Segmentation1969-1971 Intel80801978Dr. K. Levit-Gurevich,May 31, 2006, p-91982Intel286Intel38619851989Intel486Pentium®Superscalarpipelines)1993Parallel ExecutionFive pipeline stagesProtected mode Operation Integrated x87 FPU24-bitaddressing (16MBspace)8 bit(twoAPICPentium®with MMXTechnology19951996Pentium® ProThree-way superscalarOut-of-order executionSuperior branch predictionSpeculative executionExpanded 36-bitphysical address(64GB space)Pentium® III19971999Pentium® IIStreamingSIMDExtensions 2000 -(SSE)2002Pentium® MCentrinoWirelessTechnology2003Pentium® 4SSE2Hyper-ThreadingTechnology2004Extended MemoryTechnologyEM64T64-bit registers64-bit addressing2005Dual CoreVirtualization TechnologyMicrocomputersKonstantin Levit-Gurevich 9

Trends●Phase-I (-1970): “Enter the game”♦ Technology progress●Phase-II (1970-1990): “Imitate the big guys”Use technology and copy ideas from mainframes♦ Enlarge word size♦ FP hardware♦ Paging, Protection, Caching♦ High integration●Phase-III (1990-): “Lead the game”Innovation unique to microcomputers:♦ Super-scalar, out-of-order execution♦ Branch prediction♦ High frequency♦ Networking♦ GraphicsDr. K. Levit-Gurevich,May 31, 2006, p-10MicrocomputersKonstantin Levit-Gurevich 10

Performance Observations●“The number of transistors that would be incorporated on asilicon die would double every 18 months for the next severalyears”.Gordon Moore, Intel co-founder. Mid-1960●The “Spiral” principle (A. Grove):New software is written to the fastest available processor. Actually,it takes the next generation of processors to run this software(reasonably).●There will always be applications that will use all availablecomputing power."My nightmare is to wake up some day and not need any morecomputing power."Gordon Moore, Intel co-founder. Feb 1998Dr. K. Levit-Gurevich,May 31, 2006, p-12MicrocomputersKonstantin Levit-Gurevich 12

Technology Innovations:Moore’s Law Continues386 Processor8086Pentium® Processor286Itanium® 2 ProcessorItanium® ProcessorPentium® III Processor486 DX Processortransistors1,000,000,000100,000,000Pentium® 4Processor10,000,000Pentium® II Processor1,000,000100,0004004 8080800810,0001,0001970 1980 1990 2000 2010Dr. K. Levit-Gurevich,May 31, 2006, p-13MicrocomputersKonstantin Source: Levit-Gurevich Intel13

X86 Evolution - DetailedDate1978Processor8086Transistors29 KBits16Bus16MaxPhysicalMemory1 MBFrequency8 MHz197919821985198980888028680386Intel 48629 K134 K275 K1.2 M1616323281632321 MB16 MB4 GB4 GB8 MHz12.5 MHz20 MHz25 MHz199319951997PentiumPentium ProPentium II3.1 M5.5 M7 M3232326464644/64 GB4/64 GB4/64 GB60 MHz200 MHz266 MHz1999200020022003Pentium IIIPentium 4Pentium 4PentiumHTM8.2 M42 M55 M77 M32323232646464644/64 GB4/64 GB4/64 GB4/64 GB500 MHz1.5 GHz3.0 GHZ1.6 GHzDr. K. Levit-Gurevich,May 31, 2006, p-14MicrocomputersKonstantin Levit-Gurevich 14

Computer SystemCPUExternalGraphicsCardLANConnect/ASFCRTx16PCI Express* PortUSB 2.0 (8)ATA100 (1)SATA (4 )4 LanesGMCHIntel ®ICHFSBDirect Media Interface (DMI)PCI BusDDR 333/400DDR2 400/533DDR 333/400DDR2 400/5334 x1PCI ExpressportsA/M Codec 1A/M Codec 2ACLPC I/FSuper I/OA/M Codec 3Firmware Hub(s)(up to 4)Dr. K. Levit-Gurevich,May 31, 2006, p-15MicrocomputersKonstantin Levit-Gurevich 15

Architecture & μ Architecture●Architecture“The science, art, or profession of designing and constructingbuilding, bridges, etc...”♦ Focus on how it looks externally♦ Consider how it can be built and how good it performs!●Construction“The way in which something is built, formed or devised by fittingparts or elements together systematically”♦ Focus on how it is built/operated♦ Make sure it works and works correctly!●Definition applicable also for buildings, cars, computers, etc...In computers: construction == µArchitectureDr. K. Levit-Gurevich,May 31, 2006, p-16MicrocomputersKonstantin Levit-Gurevich 16

Architecture & μArchitecture●ArchitectureThe collection of features of a processor (or a system) as they areseen by the “user”♦ User: a binary executable running on that processor, or♦ assembly level programmer●μArchitectureThe collection of features or way of implementation of a processor(or a system) do not affect the user.●Features which change Timing/performance are consideredmicroarchitecture.Dr. K. Levit-Gurevich,May 31, 2006, p-17MicrocomputersKonstantin Levit-Gurevich 17

Dr. K. Levit-Gurevich,May 31, 2006, p-18Architecture & µArchitecture Elements●Architecture♦ Registers data width (8/16/32/64)♦ Instruction set♦ Addressing modes♦ Addressing methods (Segmentation, Paging, etc...)♦ Protection♦ etc…●µArchitecture♦ Bus width♦ Physical memory size♦ Caches & Cache size♦ Number of execution units♦ Execution Pipelines, Number & type execution units♦ Branch prediction♦ TLB♦ etc...●Timing is considered µArchitecture (though it is user visible!)MicrocomputersKonstantin Levit-Gurevich 18

Compatibility●Processor A (or system A) is compatible to processor B (or system B) if itpreserves the same architecture features as seen by a user of B●SW compatible - most important for us!♦ The ability to run existing SW on new HW!●Architecture Extension♦ addition of data-types, instructions, etc… to enable better support ofnew/existing applications.e.g. MMX●Trends:♦ 1985-1995: no/minimal architecture changes♦ 1995- : we see them coming (MMX, SSE, EM64T, VT, etc.)Dr. K. Levit-Gurevich,May 31, 2006, p-19MicrocomputersKonstantin Levit-Gurevich 19

Philosophical Basis of First Wave Architecture● HW / Instruction level support for high-level languages♦ Functions – main(), f(int x)♦ Loops – for, while♦ Conditional jumps– if-else♦ Unconditional jumps– goto, continue, break♦ Increments and decrements– i++, i--♦ Support operations with immediate value – a = 10;♦ Support Read-Modify-Write operations – a += b;● Instruction Set encoding for memory efficiency♦ More frequent instructions – shorter encoding♦ Two explicit operands instructionsREG-REG, REG-IMM, REG-Memory, Read-Modify-Write operations♦ Specialized and Implicit Register Usage♦ Complex Instructions to maximize functionality / instruction fetchDr. K. Levit-Gurevich,May 31, 2006, p-20MicrocomputersKonstantin Levit-Gurevich 20

Philosophical Basis of First Wave Architecture●Memory Segmentation♦ Relocatable programs / modules well supported♦ Stack and Data spaces in instruction set architectureLocal and global variables●Addressing modes suitable for high-level languages♦ Pointers – *p♦ Structure fields addressing– struct.field♦ Array elements addressing– array[i]♦ Special instructions addressing stack – PUSH, POP, PUSHF, POPFDr. K. Levit-Gurevich,May 31, 2006, p-21MicrocomputersKonstantin Levit-Gurevich 21

“C” Language Code Examplestruct Str{=> Structuresint x, y;} array[100]; => Arraysvoid func(struct Str* s)=> Functions,{ pointersint i;=> Local variablessaved on stack}for (i = 0; i < 100; i++) => Loops, index{ incrementarray[i].x += s->x + i; => Structure fieldarray[i].y += s->y + i; access}Dr. K. Levit-Gurevich,May 31, 2006, p-22MicrocomputersKonstantin Levit-Gurevich 22

X86 Architecture BasicsDr. K. Levit-Gurevich,May 31, 2006, p-23MicrocomputersKonstantin Levit-Gurevich 23

Fundamental Data Types7 0ByteN15 8HighByteN+17 0LowByteNWord31 16High WordN+215 0Low WordNDoubleword63 32High DoublewordN+431 0Low DoublewordNQuadword127 64Dr. K. Levit-Gurevich,May 31, 2006, p-24High QuadwordN+863 0Low QuadwordDoubleQuadwordMicrocomputersKonstantin Levit-Gurevich 24N

Memory Layout (1)●Little Endian -LSB in loweraddressesD7H12H7AHFHEHDHWord at Address BHContains FE06HFEH06HCHBHDoubleword at Address AHContains 7AFE0636H36HAHByte at Address 9HContains 1FH1FHA4H9H8HQuadword at Address 6HContains 7AFE06361FA4230BHWord at Address 6HContains 230BH23H0BH7H6H45H5H67H4HWord at Address 2HContains 74CBHWord at Address 1HContains CB31H74HCBH31H3H2H1HDouble Quadword at Address 0HContainsD77AFE06361FA4230B456774CB3112H12H0HDr. K. Levit-Gurevich,May 31, 2006, p-25MicrocomputersKonstantin Levit-Gurevich 25

Memory Layout (2)Dr. K. Levit-Gurevich,May 31, 2006, p-26MicrocomputersKonstantin Levit-Gurevich 26

Numeric Data Types - Integers0 – 25570Byte Unsigned Integer0 – 65,535150Word Unsigned Integer0 – 4,294,967,295310Doubleword Unsigned Integer0 – 2 64 - 1-128 – +127Dr. K. Levit-Gurevich,May 31, 2006, p-2763-32,768 – +32,767-2 31 – +2 31 - 1-2 63 – +2 63 - 1Sign63Sign31Sign15Sign7Quadword Unsigned IntegerByte Signed IntegerWord Signed IntegerDoubleword Signed IntegerQuadword Signed IntegerMicrocomputersKonstantin Levit-Gurevich 2700000

Numeric Data Types – Floating PointApproximate Normalized Range: 1.18 × 10 -38 to 3.40 × 10 38Sign31 3022 210Single PrecisionFloating PointApproximate Normalized Range: 2.23 × 10 -308 to 1.79 × 10 308Sign63 6252 510Double PrecisionFloating PointApproximate Normalized Range: 3.37 × 10 -4932 to 1.18 × 10 4932SignInteger bit79 7863 62Dr. K. Levit-Gurevich,May 31, 2006, p-28Double Extended PrecisionFloating PointMicrocomputersKonstantin Levit-Gurevich 280

Pointer Data TypesNear PointerOffset310Far Pointer or Logical AddressSegment SelectorOffset4732310Dr. K. Levit-Gurevich,May 31, 2006, p-29MicrocomputersKonstantin Levit-Gurevich 29

Basic Registers31General-Purpose Registers16 158 7016-bit32-bitAccumulatorAHALAXEAXBaseBHBLBXEBXCountCHCLCXECXDataDHDLDXEDXSource IndexSIESIDestination IndexDIEDIBase PointerBPEBPStack PointerSPESP3131Special RegistersProgram Status and Control RegisterInstruction Pointer00EFLAGSEIPSegment Registers15 0Code SegmentData SegmentStack SegmentExtra SegmentCSDSSSESFSGSDr. K. Levit-Gurevich,May 31, 2006, p-30MicrocomputersKonstantin Levit-Gurevich 30

Flags31Reserved (set to 0)2221ID20VIP19VIF18AC17VM161514RF 0 N T13IOPL1211OF10DF9IF8TF7SF65432ZF 0 A F 0 P F110CF● Affected by ALUoperations, e.g.,:●●ADD,CMP● Used mainly forcontrol flow:●JCC● But also by otherinstructions●ID- Identification FlagVIP - Virtual Interrupt PendingVIF - Virtual Interrupt FlagAC - Alignment CheckVM - Virtual-8086 ModeRF - Resume FlagNT - Nested Task FlagIOPL - I/O Privilege LevelIFTFADC, RLC, LOOP, etc...- Interrupt Enable Flag- Trap FlagStatus FlagsEFLAGS RegisterNAMEOFSFZFAFPFCFDr. K. Levit-Gurevich,May 31, 2006, p-31PURPOSEOverflow FlagSign FlagZero FlagAuxiliary CarryParity FlagCarry FlagResult exceeds positive or negative limit of numbers rangeResult is negative (less then zero)Result is zeroCONDITIONS REPORTEDCarry out of bit position 3 (used for BCD)Low byte of result had even parity (even number of set bits)Carry out of most significant bit of resultMicrocomputersKonstantin Levit-Gurevich 31

Floating PointSign79786463Data Registers0FP7ExponentSignificandFP6FP5FP4FP3FP2FP1FP015ControlRegister047Last Instruction Pointer0StatusRegisterLast Data (Operand) PointerTagRegister10Opcode0Dr. K. Levit-Gurevich,May 31, 2006, p-32MicrocomputersKonstantin Levit-Gurevich 32

63MMXMM7MM6MM5MM4MM3MM2MM1MM0063Packed Byte Integers063Packed Word Integers063Packed Doubleword Integers0Dr. K. Levit-Gurevich,May 31, 2006, p-33MicrocomputersKonstantin Levit-Gurevich 33

127127Streaming SIMD ExtensionsXMM7XMM6XMM5XMM4XMM3XMM2XMM1XMM0Packed Single Precision Floating-Point96 9564 6332 3100MXCSR Control/Status Register320Dr. K. Levit-Gurevich,May 31, 2006, p-34MicrocomputersKonstantin Levit-Gurevich 34

●Related instructions:♦ PUSH/POP,PUSHF/POPF,PUSHA/POPA,CALL/RET,IRETLocal Variablesfor CallingProcedure●Used in interrupthandlingParametersPassed toCalledProcedureFrame BoundaryStackStack SegmentReturn InstructionPointerTop of StackBottom of Stack(Initial ESP Value)The Stack Can Be16 or 32 Bits WideThe EBP registeris typically set topoint to the returninstruction pointerEBP RegisterESP RegisterDr. K. Levit-Gurevich,May 31, 2006, p-35Pushes Move theTop of Stack toLower AddressesPops Move theTop of Stack toHigher AddressesMicrocomputersKonstantin Levit-Gurevich 35

Control Registers31109876543210Reserved (set to 0)PCEPGEMCEPAEPSEDETSDPVIVMECR4OSXMMEXCPT31OSFXSR121154320Page-Directory BasePCDPWTCR3310Page-Fault Linear AddressCR2310CR13130292819181716156543210PGCDNWAMWPNEETTSEMMPPECR0Dr. K. Levit-Gurevich,May 31, 2006, p-36MicrocomputersKonstantin Levit-Gurevich 36

Other Registers●Segment Registers♦ CS, SS, DS, ES, FS, GS●System Registers♦ GDTR, IDTR, LDTR●Task Register♦ TR●Debug Registers♦ DR0-DR7 – not discussed here●Model Specific Registers♦ MSR, MTRR – not discussed hereDr. K. Levit-Gurevich,May 31, 2006, p-37MicrocomputersKonstantin Levit-Gurevich 37

System Registers Example - LinuxDr. K. Levit-Gurevich,May 31, 2006, p-38MicrocomputersKonstantin Levit-Gurevich 38

Dr. K. Levit-Gurevich,May 31, 2006, p-39Memory Organization●Physical Memory♦ The memory that the processor addresses on its bus.♦ Each byte is assigned with the unique address, called a Physical Address.♦ Maximal size is 64 GB (in IA32).●Programs do not directly access the physical memory. Instead, theyaccess memory using any of three following memory models:♦ Real-Address ModeUses the memory model for the Intel 8086 processor.Uses a specific implementation of segmented memory in which the linearaddress space for the program and the operating system consists of an arrayof segments of 64 KB in size each.♦ Flat ModeMemory appears to the program as a single, continuous address space,called Linear Address Space.Code, data and a procedure stack are all contained in this address space.Size is 4 GB.♦ Segmented ModeMicrocomputersKonstantin Levit-Gurevich 39

Segmentation●Problem: use “large” memory with short word width♦ 16 bits can cover only 64KB●Solution♦ address space divided into “segments”●Logical address is segment:offset●Segmentation: mapping from Logical Address to Linear Address♦ Segment base = f(segment register)♦ offset < segment size: ≤ 64K (8086/286), ≤ 4G (386+)♦linear address = segment_base + offset●At any time. there are 4 (8086) or 6 (386 and on) active segment registers♦ CS: code, DS: data, SS: stack♦ ES: extra, FS, GS: new extra●Rough analogy - phone directory: region and a number withinDr. K. Levit-Gurevich,May 31, 2006, p-40MicrocomputersKonstantin Levit-Gurevich 40

SegmentationDIFFERENT LOGICAL SEGMENTSONE PHYSICAL ADDRESS SPACEGSFSESDSCSSSAn Unsegmented MemoryDIFFERENT LOGICAL SEGMENTSCSSSDSESFSGSDIFFERENT ADDRESS SPACEIN PHYSICAL MEMORYCODESEGMENTSTACKSEGMENTDATASEGMENTDATASEGMENTDATASEGMENTDATASEGMENTDr. K. Levit-Gurevich,May 31, 2006, p-41A Segmented MemoryMicrocomputersKonstantin Levit-Gurevich 41

System ArchitectureOverviewDr. K. Levit-Gurevich,May 31, 2006, p-42MicrocomputersKonstantin Levit-Gurevich 42

EFLAGS RegisterControl RegistersTask RegisterPhysical AddressLinear AddressSegment SelectorRegisterGlobal DescriptorTable (GDT)Task-StateSegment (TSS)Code, Data orStack SegmentTaskCodeDataStackInterruptVectorInterrupt DescriptorTable (IDT)Seg. DescriptorTSS DescriptorCurrentTSSInterrupt HandlerCodeStackInterrupt GateSeg. DescriptorTask GateTrap GateTSS DescriptorLDT DescriptorTask-StateSegment (TSS)TaskCodeDataStackIDTRCall-Gate SegmentSelectorGDTRLocal DescriptorTable (LDT)Seg. DescriptorCall GateCurrentTSSCurrentTSSException HandlerCodeStackProtected ProcedureCodeStackDr. K. Levit-Gurevich,May 31, 2006, p-43LDTRMicrocomputersKonstantin Levit-Gurevich 43

Linear Address SpaceLinear AddressDir Table OffsetLinear AddressPage DirectoryPage TablePagePage Dir EntryPhysical AddressPage Table EntryCR3Dr. K. Levit-Gurevich,May 31, 2006, p-44MicrocomputersKonstantin Levit-Gurevich 44

●Real-address ModeOperation Modes♦ Implements the programming environment of the Intel 8086 processor with afew extensions. Processor is placed in real-address mode following power-upor a reset.●Protected Mode♦ Native state of the processor. All instructions and architectural features areavailable. This is the recommended mode for applications and operatingsystems●Virtual-8086 mode♦ Sub-mode of the Protected mode. Allows to execute “real-address mode”8086 software in a protected, multi-tasking environment. (Not covered here)●System Management Mode (SMM)♦ This mode provides an operating system or executive with a transparentmechanism for implementing platform-specific functions such as powermanagement and system security. (Not covered here)Dr. K. Levit-Gurevich,May 31, 2006, p-45MicrocomputersKonstantin Levit-Gurevich 45

Operation ModesCR0.PE=0Real-AddressModeorResetCR0.PE=1Reset orRSMSMI#ResetProtectedModeSMI#RSMSystemManagementModeEFLAGS.VM=0EFLAGS.VM=1Virtual-8086ModeSMI#RSM●16 bit♦ Real Mode (8086 and up)♦ Protected mode (286 and up)♦ Virtual 8086 mode (386 and up)Dr. K. Levit-Gurevich,May 31, 2006, p-46● 32 bit●●Protected mode (386 and up)Paging enabled/disabled● 64 bit modes are not shown hereMicrocomputersKonstantin Levit-Gurevich 46

32-Bit vs. 16-Bit Address and Operand Sizes●The processor can be configured for 32-bit and 16-bit address andoperand sizes.●With 16-Bit address and operand sizes♦ The maximum segment offset is 0xFFFF♦ Operand sizes are typically 8-bits, and 16-bits●With 32-Bit address and operand sizes♦ The maximum linear address or segment offset is 0xFFFFFFFF♦ Operand sizes are typically 8-bits, and 32-bits●Logical Address (far pointer):♦ 16-bit addressing: 16-bit selector : 16-bit offset♦ 32-bit addressing: 16-bit selector : 32-bit offset●In Real-Address Mode the default address and operand sizes are 16-bit●In Protected Mode the default address and operand sizes are determinedby the attributes of the segment registers●Instruction prefixes allow temporary overrides of the default addressand/or operand sizes from within a programDr. K. Levit-Gurevich,May 31, 2006, p-47MicrocomputersKonstantin Levit-Gurevich 47

Instruction SetAddressing &Instruction EncodingDr. K. Levit-Gurevich,May 31, 2006, p-48MicrocomputersKonstantin Levit-Gurevich 48

Instruction Set Architecture#include int main(){printf("Hello World!\n");}return 0;Linux Stylepushl %ebpmovl %esp,%ebpsubl $0x4,%espmovl $0x8043210,(%esp,1)call 0x8060325 movl $0x0,%eaxleaveretWindows Stylepush ebpmov ebp,esppush 0042201ccall 00401060xor eax,eaxmov esp,ebppop ebpretDr. K. Levit-Gurevich,May 31, 2006, p-49MicrocomputersKonstantin Levit-Gurevich 49

Instruction Set Architecture●Data transfer●Common ALU operations●Control flow●String●FP, MMX and SSE instructions●OS support●I/O instructions●“Exotic” instructions (DAA, XLAT, etc.)Dr. K. Levit-Gurevich,May 31, 2006, p-50MicrocomputersKonstantin Levit-Gurevich 50

Dr. K. Levit-Gurevich,May 31, 2006, p-51Data Transfer Instructions●MOV Byte / Word / Dword, register to / from register / memory♦ Also Zero/Sign extended moves●STACK OPERATIONS♦ PUSH [SOURCE]SP = SP - 2 ; ( SS : SP ) = SOURCE ; (16-bit)ESP = ESP - 4 ; ( SS : ESP ) = SOURCE ; (32-bit)♦ POP [DEST]DEST = ( SS : SP ) ; SP = SP + 2; (16-bit)DEST = ( SS : ESP ) ; ESP = ESP + 4; (32-bit)♦ PUSHA / POPA, PUSHF / POPF●I/O instructions♦ IN (I/O port specified by 8-bit immediate or DX)EAX / AX / AL = Dword / Word / Byte from I/O port♦ OUT (I/O port specified by 8-bit immediate or DX)I/O Port = Dword / Word / Byte from EAX / AX / ALMicrocomputersKonstantin Levit-Gurevich 51

ALU and Shift Instructions●ALU♦ Usual ALU ops on Byte / Word / Dword operands♦ Integer Multiply and Divide InstructionsMultiply and Divide use implicit register pairs♦ Chained add / subtract for multi-precision arithmetic♦ Conversion ops for packed and unpacked decimal math●SHIFTS and ROTATES♦ Logical and arithmetic shifts♦ Rotates through / not through C bit♦ By constant (1) or by count in CLLater extended to immediate operand in 80386Dr. K. Levit-Gurevich,May 31, 2006, p-52MicrocomputersKonstantin Levit-Gurevich 52

Dr. K. Levit-Gurevich,May 31, 2006, p-53Control Flow●Assortment of conditional jumps♦ Jcc Byte/Word/Dword displacement♦ CS not modified, EIP = EIP + signed displacement●JMP♦ SHORT - one byte displacement♦ NEAR - two/four byte displacement♦ FAR - Across Segments (Loads CS : EIP)●CALL♦ Like JMP♦ Pushes next EIP in SHORT and NEAR forms♦ Pushes CS : EIP in FAR forms●RET (Return)♦ Pops into EIP in NEAR form♦ Pops into CS : EIP in FAR form♦ RET immediate form – pops N bytes from the stackMicrocomputersKonstantin Levit-Gurevich 53

MOV, ALU, JMP Exampleint array[100];int sum = 0;void sum_array(){int i;for(i = 0; i < 100; i++){sum += array[i];}}COMM _array:DWORD:064H_sum DD 00H DUP (?)mov eax, OFFSET FLAT:_array$L28: mov ecx, DWORD PTR [eax]add eax, 4add DWORD PTR _sum, ecxcmp eax, OFFSET FLAT:_array+400jl SHORT $L28Dr. K. Levit-Gurevich,May 31, 2006, p-54MicrocomputersKonstantin Levit-Gurevich 54

String Instructions●String operation components♦ Opcode specifying ES:EDI op DS:ESI♦ Typically preceded by REP (Repeat) Prefixfor (;Repeat Condition; ECX--){ES : EDI op DS : ESI;EDI = EDI + d*operand_size;ESI = ESI + d*operand_size;}●d = 1 or -1♦ Value selected by Direction Flag EFLAGS.DF●INS, OUTS, MOVS, LODS, STOS, CMPS, SCAS.♦ Operand size is one of 8/16/32●Repeat Condition is test of Zero Flag♦ Can be test if set or test if clear♦ Zero Flag set by ECX == 0 or by op (CMPS)Dr. K. Levit-Gurevich,May 31, 2006, p-55MicrocomputersKonstantin Levit-Gurevich 55

Repeat MOVS Examplestruct str{char c[1000];} src, dst;void move(){dst = src;}Save registerson stackCopyRestore registersfrom stackReturn from functionCOMMCOMM_src:BYTE:03e8H_dst:BYTE:03e8H...push ebpmov ebp, esppush esipush edipush ecxmov ecx, 250mov esi, OFFSET FLAT:_srcmov edi, OFFSET FLAT:_dstrep movsdpop ecxpop edipop esipop ebpretDr. K. Levit-Gurevich,May 31, 2006, p-56MicrocomputersKonstantin Levit-Gurevich 56

Assembly Version●MS/Intel♦ Case insensitive♦ Operand size implicit♦ Destination on left“programming tool”●Unix♦ Case sensitive♦ Operand size explicit♦ Destination on right“compiler back-end”●Assembler should be smart enough to handle prefix when 16 bitoperand is used in 32 bit code...Dr. K. Levit-Gurevich,May 31, 2006, p-57MicrocomputersKonstantin Levit-Gurevich 57

Addressing ModesDr. K. Levit-Gurevich,May 31, 2006, p-58MicrocomputersKonstantin Levit-Gurevich 58

Effective Address ComputationBase Index Scale DisplacementEAXEBXECXEDXESPEBPESIEDIEAXEBXECXEDXEBPESIEDI*1248+ +None8-bit16-bit32-bitOffset = Base + (Index * Scale) + DisplacementDr. K. Levit-Gurevich,May 31, 2006, p-59MicrocomputersKonstantin Levit-Gurevich 59

Addressing Mode Examples (1)●Displacement♦ Direct (uncomputed) offset to the operandint a, b;b = a;●Basemov eax, DWORD PTR _amov DWORD PTR _b, eax♦ Indirect offset to the operandint a, *p;a = *p;mov eax, DWORD PTR _pmov ebx, DWORD PTR [eax]mov DWORD PTR _a, ebxDr. K. Levit-Gurevich,May 31, 2006, p-60MicrocomputersKonstantin Levit-Gurevich 60

Addressing Mode Examples (2)●Base + Displacement♦ Indexing into an array when the element size is not 2, 4, or 8 bytes.♦ Work with stackvoid func(int a){int b = a;}movmovecx, DWORD PTR [ebp+8]DWORD PTR [ebp-4], ecx♦ Accessing a field of a recordstruct {int x, y;} *str_p;int z = str_p->y;Dr. K. Levit-Gurevich,May 31, 2006, p-61movmovedx, DWORD PTR _str_peax, DWORD PTR [edx+4]MicrocomputersKonstantin Levit-Gurevich 61

Addressing Mode Examples (3)●(Index * Scale) + Displacement♦Efficient indexing into array when the element size is 2, 4, or 8bytes.int array[100];int c, i;c = array[i];movmovmovedx, DWORD PTR _ieax, DWORD PTR _array[edx*4]DWORD PTR _c, eaxDr. K. Levit-Gurevich,May 31, 2006, p-62MicrocomputersKonstantin Levit-Gurevich 62

Addressing Mode Examples (4)●Base + (Index * Scale) + Displacement♦ Access to several instances of an array of records♦ Efficient indexing of a two-dimensional array when the elements of thearray are 2, 4, or 8 bytes in size.struct {int x, y;} arr_str[100];int d, i;d = arr_str[i].y;movmovmovmovecx, DWORD PTR _iebx, DWORD PTR _arr_stredx, DWORD PTR [ebx+ecx*8+4]DWORD PTR _d, edxDr. K. Levit-Gurevich,May 31, 2006, p-63MicrocomputersKonstantin Levit-Gurevich 63

Instruction EncodingDr. K. Levit-Gurevich,May 31, 2006, p-64MicrocomputersKonstantin Levit-Gurevich 64

●Variable length instructions!Dr. K. Levit-Gurevich,May 31, 2006, p-65Instruction Format●Remember - only 2 explicit operands - source and destination may be thesame●Machine language♦ Zero or more prefixes♦ Opcode♦ Mod/Rm♦ Displacement♦ Immediate●8/16/32 Handling♦ Instruction encoding differs for 8-bit and 16/32 bit operands♦ Same instruction encoding is used for 16 and 32 bit operands♦ Current code segment defines if processor runs 16 or 32 bit code♦ In 16 bit code “word” operands indicate 16 bit operand,In 32 bit code “word” operands indicate 32 bit operand.♦ This can be toggled by the operand-size prefix (0x66)MicrocomputersKonstantin Levit-Gurevich 65

Instruction FormatInstructionPrefixesOpcode ModR/M SIB Displacement ImmediateUp to fourprefixes of1-byte each(optional)1-, 2-, or 3-byteopcode1 byte(if required)1 byte(if required)Addressdisplacementof 1, 2, or 4bytes or noneImmediatedata of1, 2, or 4bytes or none765320765320ModReg/OpcodeR/MScaleIndexBaseMinimal instruction size = 1 byteMaximal instruction size = 15 bytesDr. K. Levit-Gurevich,May 31, 2006, p-66MicrocomputersKonstantin Levit-Gurevich 66

Instruction Format ExampleOpcode ModRM SIB Displacement Immediate00000000 :0: 55 pushl %ebp1: 89 e5 movl %esp,%ebp3: 83 ec 08 subl $0x8,%esp6: 83 e4 f0 andl $0xfffffff0,%esp9: b8 00 00 00 00 movl $0x0,%eaxe: 29 c4 subl %eax,%esp10: c7 04 24 00 00 00 00 movl $0x0,(%esp,1)17: e8 fc ff ff ff call 18 1c: b8 00 00 00 00 movl $0x0,%eax21: c9 leave22: c3 ret00000000 :0: 48 65 6c 6c 6f 20 57 6f 72 6c 76 64 21 0aDr. K. Levit-Gurevich,May 31, 2006, p-67MicrocomputersKonstantin Levit-Gurevich 67

Instruction Encoding Example (1)f3 f0 64 66 67 ff 01REPZprefixLOCKprefixFSsegmentoverrideprefixOperandsizeoverrideprefix32->16Addresssizeoverrideprefix32->16OpcodeofINCModRMrepz lock incw %fs:(%bx,%di,1)Dr. K. Levit-Gurevich,May 31, 2006, p-68MicrocomputersKonstantin Levit-Gurevich 68

Instruction Encoding Example (2)8a 04 10MOV bytefrom r/m8 tor8ModR/MRegister: ALMemory: SIBSIBScale: 1Index: EDXBase: EAXmovb(%eax,%edx,1),%alDr. K. Levit-Gurevich,May 31, 2006, p-69MicrocomputersKonstantin Levit-Gurevich 69

Konstantin Levit-Gurevich 70MicrocomputersDr. K. Levit-Gurevich,May 31, 2006, p-700100Mod[EAX]+disp8[ECX]+disp8[EDX]+disp8[EBX]+disp8[--][--]+disp8[EBP]+disp8[ESI]+disp8[EDI]+disp8[EAX][ECX][EDX][EBX][--][--]disp32[ESI][EDI]Effective Addressr8(/r)r16(/r)r32(/r)mm(/r)xmm(/r)/digit (Opcode)REG =78797A7B7C7D7E7F707172737475767768696A6B6C6D6E6F606162636465666758595A5B5C5D5E5F505152535455565748494A4B4C4D4E4F404142434445464700000101001110010111011138393A3B3C3D3E3F303132333435363728292A2B2C2D2E2F202122232425262718191A1B1C1D1E1F101112131415161708090A0B0C0D0E0F0001020304050607000001010011100101110111Value of ModR/M Byte (in Hexadecimal)R/MBHDIEDIMM7XMM77111DHSIESIMM6XMM66110CHBPEBPMM5XMM55101AHSPESPMM4XMM44100BLBXEBXMM3XMM33011DLDXEDXMM2XMM22010CLCXECXMM1XMM11001ALAXEAXMM0XMM00000MOD/RM Table: 32-bit addressing table (1)

Konstantin Levit-Gurevich 71MicrocomputersDr. K. Levit-Gurevich,May 31, 2006, p-711110ModEAX/AX/AL/MM0/XMM0ECX/CX/CL/MM1/XMM1EDX/DX/DL/MM2/XMM2EBX/BX/BL/MM3/XMM3ESP/SP/AH/MM4/XMM4EBP/BP/CH/MM5/XMM5ESI/SI/DH/MM6/XMM6EDI/DI/BH/MM7/XMM7[EAX]+disp32[ECX]+disp32[EDX]+disp32[EBX]+disp32[--][--]+disp32[EBP]+disp32[ESI]+disp32[EDI]+disp32Effective Addressr8(/r)r16(/r)r32(/r)mm(/r)xmm(/r)/digit (Opcode)REG =F8F9FAFBFCFDFEFFF0F1F2F3F4F5F6F7E8E9EAEBECEDEEEFE0E1E2E3E4E5E6E7D8D9DADBDCDDDEDFD0D1D2D3D4D5D6D7C8C9CACBCCCDCECFC0C1C2C3C4C5C6C7000001010011100101110111B8B9BABBBCBDBEBFB0B1B2B3B4B5B6B7A8A9AAABACADAEAFA0A1A2A3A4A5A6A798999A9B9C9D9E9F909192939495969788898A8B8C8D8E8F8081828384858687000001010011100101110111Value of ModR/M Byte (in Hexadecimal)R/MBHDIEDIMM7XMM77111DHSIESIMM6XMM66110CHBPEBPMM5XMM55101AHSPESPMM4XMM44100BLBXEBXMM3XMM33011DLDXEDXMM2XMM22010CLCXECXMM1XMM11001ALAXEAXMM0XMM00000MOD/RM Table: 32-bit addressing table (2)

SIB Table: 32-bit addressing table (1)r32Base =Base =EAX0000ECX1001EDX2010EBX3011ESP4100[*]5101ESI6110EDI7111Scaled IndexSSIndexValue of SIB Byte (in Hexadecimal)[EAX][ECX][EDX][EBX]none[EBP][ESI][EDI]0000000101001110010111011100081018202830380109111921293139020A121A222A323A030B131B232B333B040C141C242C343C050D151D252D353D060E161E262E363E070F171F272F373F[EAX*2][ECX*2][EDX*2][EBX*2]none[EBP*2][ESI*2][EDI*2]0100000101001110010111011140485058606870784149515961697179424A525A626A727A434B535B636B737B444C545C646C747C454D555D656D757D464E565E666E767E474F575F676F777FDr. K. Levit-Gurevich,May 31, 2006, p-72MicrocomputersKonstantin Levit-Gurevich 72

SIB Table: 32-bit addressing table (2)r32Base =Base =EAX0000ECX1001EDX2010EBX3011ESP4100[*]5101ESI6110EDI7111Scaled IndexSSIndexValue of SIB Byte (in Hexadecimal)[EAX*4][ECX*4][EDX*4][EBX*4]none[EBP*4][ESI*4][EDI*4]1000000101001110010111011180889098A0A8B0B881899199A1A9B1B9828A929AA2AAB2BA838B939BA3ABB3BB848C949CA4ACB4BC858D959DA5ADB5BD868E969EA6AEB6BE878F979FA7AFB7BF[EAX*8][ECX*8][EDX*8][EBX*8]none[EBP*8][ESI*8][EDI*8]11000001010011100101110111C0C8D0D8E0E8F0F8C1C9D1D9E1E9F1F9C2CAD2DAE2EAF2FAC3CBD3DBE3EBF3FBC4CCD4DCE4ECF4FCC5CDD5DDE5EDF5FDC6CED6DEE6EEF6FEC7CFD7DFE7EFF7FFDr. K. Levit-Gurevich,May 31, 2006, p-73MicrocomputersKonstantin Levit-Gurevich 73

Instruction Encoding Example (3)MOVEAX, [EBP + 0x4] – move the content of memory located at[EBP] + 0x4 to register EAX8B /r MOV r32, r/m32 (Move r/m32 to r32)r32 = EAX REG = 0Effective Address = [EBP]+disp8Mod = 01 R/M = 101 ModRM = 0x450x8B 0x45 0x4MOV EAX, [ESP + 0x4] – move the content of memory located at[ESP] + 0x4 to register EAXr32 = EAX REG = 0Effective Address = [ESP]+disp8 Mod = 01 R/M = 100 ModRM = 0x44SIB required Base (ESP) = 4, Index = none SIB = 0x24, 0x64, 0xA4, 0xE40x8B 0x44 0x24 0x04 0x8B 0x44 0x64 0x040x8B 0x44 0xA4 0x040x8B 0x44 0xE4 0x04Dr. K. Levit-Gurevich,May 31, 2006, p-74MicrocomputersKonstantin Levit-Gurevich 74

Instruction Encoding Example (4)MOVEBX, EAX – move the content of register EAX to register EBXTwo possibilities:89 /r MOV r/m32, r32 8B /r MOV r32, r/m32(Move r32 to r/m32) (Move r/m32 to r32)EBX corresponds to r/m32operandEAX corresponds to r32operandEBX corresponds to r32 operandEAX corresponds to r/m32 operandModRM = 0xC3ModRM = 0xD80x89 0xC30x8B 0xD8Dr. K. Levit-Gurevich,May 31, 2006, p-75MicrocomputersKonstantin Levit-Gurevich 75

Protected-Mode MemoryManagementDr. K. Levit-Gurevich,May 31, 2006, p-76Konstantin Levit-GurevichMicrocomputers

Overview●Memory Management in IA-32♦Segmentation♦PagingProvides a mechanism of isolating individual code, data, andstack modules so that multiple programs (or tasks) can runon the same processor without interfering with one another.Always enabledProvides a mechanism for implementing a conventionaldemand-paged, virtual-memory system where sections of aprogram’s execution environment are mapped into physicalmemory as needed.Can be enabled/disabledDr. K. Levit-Gurevich,May 31, 2006, p-77Konstantin Levit-GurevichMicrocomputers

SegmentSelectorLogical address(or Far Pointer)OffsetLinear AddressSpaceBig PictureGlobal DescriptorTable (GDT)Linear AddressDir Table OffsetSegmentDescriptorSegmentPageDirectoryPage TablePhysical AddressSpacePageLin. Addr.EntryPhys. Addr.EntrySegmentBase AddressPageSegmentationPagingDr. K. Levit-Gurevich,May 31, 2006, p-78MicrocomputersKonstantin Levit-Gurevich 78

SegmentationDr. K. Levit-Gurevich,May 31, 2006, p-79Konstantin Levit-GurevichMicrocomputers

Segmentation●“Linear Address Space” is divided into “segments” holding♦ Code, data and stack for programs♦ System data structures (TSS, LDT)●Logical Address (or far pointer) = Segment : Offset●Segment Selector♦ Provides an offset into a descriptor table to a data structure called a segmentdescriptor●Segment Descriptor♦ Specifies the size, access rights, privilege level, type and base address.●Segmentation♦ Mapping from Logical Address to Linear Address♦ Linear Address = Segment_Base + Offset●6 active segment registers♦ CS: code, DS: data, SS: stack, ES: extra. FS, GS: new extraDr. K. Levit-Gurevich,May 31, 2006, p-80Konstantin Levit-GurevichMicrocomputers

Basic Flat ModelSegmentRegistersCSLinear Address Space(or Physical Memory)CodeSSDSESCode- and Data- SegmentDescriptorsAccessAccessBase AddressBase AddressLimitLimitNot PresentData andStackFSGS●Simplest memory model●Operating System and application programs have access to a continuous,unsegmented address spaceDr. K. Levit-Gurevich,May 31, 2006, p-81MicrocomputersKonstantin Levit-Gurevich 81

SegmentRegistersCSSSProtected Flat ModelSegment DescriptorsAccess LimitBase AddressLinear Address Space(or Physical Memory)CodeNot PresentMemory I/ODSAccessLimitData andStackESBase AddressFSGS●Segment limits are set to include only the range of addresses for whichphysical memory actually exists.●Minimal level of hardware protection against some kinds of program bugs.Dr. K. Levit-Gurevich,May 31, 2006, p-82MicrocomputersKonstantin Levit-Gurevich 82

A General Segmentation ModelSegmentRegistersCSSSSegment DescriptorsAccessLimitBase AddressAccessLimitBase AddressLinear Address Space(or Physical Memory)StackDSESFSGSAccess LimitBase AddressAccess LimitBase AddressAccess LimitBase AddressCodeDataDataData●Uses the full capabilities of thesegmentation mechanism toprovide hardware enforcedprotection of code, datastructures, and program andtasks.Dr. K. Levit-Gurevich,May 31, 2006, p-83AccessBase AddressAccessBase AddressAccessLimitLimitLimitBase AddressDataMicrocomputersKonstantin Levit-Gurevich 83

Linear Address Computation● Real ModeSegment Base = segment*16Linear address = segment*16 + offsetLast address:0xFFFF*16+0xFFFF = 0x10FFEF(1M+64K-16)Base+Offsett=Linear AddressLogicalAddress19 4 3 2 1 016-Bit Segment Selector 0 0 0 019 16 1500 0 0 0 16-Bit Effective Address20 19021-Bit Linear AddressA2015 0310Seg.SelectorOffset● Protected ModeLinear address =Descriptor_Table[f(segment)].base +offsetCoverage4GB address space. 1 segment is sufficientDescriptor TableSegmentDescriptorBase Address+31Linear Address0Dr. K. Levit-Gurevich,May 31, 2006, p-84Konstantin Levit-GurevichMicrocomputers

Segment RegistersVisible PartHidden PartSegment Selector Base Address, Limit, Access Information CSSSDSESFSGS● Code-segment (CS), data-segment (DS), and stack-segment (SS) registers mustbe loaded with valid segment selectors● ES, FS, and GS can be used to make additional data segments available to task.● When a segment selector is loaded into the visible part of a segment register, theprocessor also loads the hidden part of the segment register.● Instructions:♦ Direct load: MOV SR, POP SR, LDS, LES, LSS, LGS, and LFS instructions.♦ Implied load: far CALL, JMP, and RET; SYSENTER and SYSEXIT; IRET, INTn,INTO and INT3 instructions.Dr. K. Levit-Gurevich,May 31, 2006, p-85Konstantin Levit-GurevichMicrocomputers

Segment Selector Format153210IndexTIRPLTable Indicator0 = GDT1 = LDTRequested Privilege Level (RPL)●13 bit index (8K entries)●1 bit table index - GDT (0) or LDT (1)●2 bit “requested” privilege level – 0, 1, 2, 3Dr. K. Levit-Gurevich,May 31, 2006, p-86Konstantin Levit-GurevichMicrocomputers

Global and Local Descriptor TablesGlobalDescriptorTable (GDT)LocalDescriptorTable (LDT)TISegmentSelectorTI = 0TI = 156564848● Global Descriptor Table – GDT4040●Pointed by GDTR3232● Local Descriptor Table - LDT●●Per taskPointed indirectly by LDTR● Table pointers24168First Descriptor inGDT is Not Used 0241680●●GDTR: Base + LimitLDTR: GDT Index onlyGDTR RegisterLimitBase AddressLDTR RegisterLimitBase AddressSeg.Sel.Dr. K. Levit-Gurevich,May 31, 2006, p-87MicrocomputersKonstantin Levit-Gurevich 87

Segment Descriptor Format31242322212019161514131211870D A Seg. DBase 31:24 G / 0 V Limit P P S TypeBase 23:16B L 19:16 L43116150Base Address 15:00 Segment Limit 15:000AVLBASED/BDPLGLIMITPSTYPE– Available for use by system software– Segment base address– Default operation size (0 = 16-bit segment; 1 = 32-bit segment)– Descriptor privilege level– Granularity– Segment limit– Segment present– Descriptor type (0 = system; 1 = code or data)– Segment typeDr. K. Levit-Gurevich,May 31, 2006, p-88Konstantin Levit-GurevichMicrocomputers

Code- and Data-Segment TypesType FieldDecimal1110E9W8ADescriptorTypeDescription00000DataRead-Only10001DataRead-Only, accessed20010DataRead/Write30011DataRead/Write, accessed40100DataRead-Only, expand-down50101DataRead-Only, expand-down, accessed60110DataRead/Write, expand-down70111DataRead/Write, expand-down, accessedCRA81000CodeExecute-Only91001CodeExecute-Only, accessed101010CodeExecute/Read111011CodeExecute/Read, accessed121100CodeExecute-Only, conforming131101CodeExecute-Only, conforming, accessed141110CodeExecute/Read, conforming151111CodeExecute/Read, conforming, accessedDr. K. Levit-Gurevich,May 31, 2006, p-89MicrocomputersKonstantin Levit-Gurevich 89

System Descriptor TypesType FieldDecimal111098Description00000Reserved1000116-Bit TSS (Available)20010LDT3001116-Bit TSS (Busy)4010016-Bit Call Gate50101Task Gate6011016-Bit Interrupt Gate7011116-Bit Trap Gate81000Reserved9100132-Bit TSS (Available)101010Reserved11101132-Bit TSS (Busy)12110032-Bit Call Gate131101Reserved14111032-Bit Interrupt Gate15111132-Bit Trap GateDr. K. Levit-Gurevich,May 31, 2006, p-90MicrocomputersKonstantin Levit-Gurevich 90

GDT and LDT in LinuxDr. K. Levit-Gurevich,May 31, 2006, p-91MicrocomputersKonstantin Levit-Gurevich 91

GDT in Windows NTGDTbase = 80036000 Limit = 03FFSel. Type Base Limit DPL P Attr.0008 Code32 00000000 FFFFFFFF 0 1 RE0010 Data32 00000000 FFFFFFFF 0 1 RW001B Code32 00000000 FFFFFFFF 3 1 RE0023 Data32 00000000 FFFFFFFF 3 1 RW0028 TSS32 8000B000 000020AB 0 1 B0030 Data32 FFDFF000 00001FFF 0 1 RW003B Data32 7FFDE000 00000FFF 3 1 RW0043 Data16 00000400 0000FFFF 3 1 RW0048 LDT E156C000 0000FFEF 0 10050 TSS32 80143FE0 00000068 0 10058 TSS32 80144048 00000068 0 1User Thread Environment BlockKernel Processor Control RegionDr. K. Levit-Gurevich,May 31, 2006, p-92MicrocomputersKonstantin Levit-Gurevich 92

GDT/LDT Manipulation Example (1)0x0100: 0008002F 00000000 00000000 000000000x0110: 0000FFFF 00CF9A00 0000FFFF 00CF92000x0120: 0000FFFF 00CFFA00 0000FFFF 00CFF2000x0130: 0138000F 00408200 F0001FFF FF4092DF0x0140: E0000FFF 7F40F2FDAssume Protected 16-bit mode here0x1028: mov eax, 0x00x102D: lgdt fword ptr [eax] Load GDT0x1030: jmp 0x08:0x00001037 Load new CS0x1037: mov eax, 0x10 Protected 32 mode0x103C: mov ds, ax Load data and stack0x103F: mov ss, ax segments from GDT0x1042: mov eax, 0x280x1047: lldt ax Load LDT0x104A: mov eax, 0x7 Load data segments0x104F: mov fs, ax from LDT0x1052: mov eax, 0x0F0x1057: mov gs, axDr. K. Levit-Gurevich,May 31, 2006, p-93MicrocomputersKonstantin Levit-Gurevich 93

GDT/LDT Manipulation Example (2)Exercise 1Assuming working in flat data segment write the following system call in Cprogramming language●void print_gdt()♦ This function presents the unpacked context of all present segment descriptorsof the current GDTTo present a segment descriptor context you can use the following function:void show_descriptor(__int32 index,__int32 base,__int32 size,__int32 type,__int32 dpl,bool system,bool db)Dr. K. Levit-Gurevich,May 31, 2006, p-94MicrocomputersKonstantin Levit-Gurevich 94

typedef struct GDTR_s {__int64 limit : 16,base : 32,res : 16;} GDTR_t;typedef struct Segment_Descriptor_s {__int64 limit_15_00 : 16,base_15_00 : 16,base_23_16 : 8,type : 4,system : 1,dpl : 2,present : 1,limit_19_16 : 4,avl : 1,res : 1,db : 1,granularity : 1,base_31_24 : 4;} Segment_Descriptor_t;Dr. K. Levit-Gurevich,May 31, 2006, p-95MicrocomputersKonstantin Levit-Gurevich 95

void print_gdt(){unsignednumber_of_entries, i, base, size;Segment_Descriptor_t* gdt;GDTR_tgdtr;// Get GDTR__asm{sgdt gdtr}// Calculate the number of descriptors within GDTnumber_of_entries = (gdtr.limit + 1) >> 3;// Assign the base pointergdt = (Segment_Descriptor_t*)gdtr.base;Dr. K. Levit-Gurevich,May 31, 2006, p-96MicrocomputersKonstantin Levit-Gurevich 96

}// Go through all descriptorsfor (i = 0; i < number_of_entries; i++) {// Proceed if presentif (gdt[i].present) {// Calculate segment basebase = (gdt[i].base_31_24

GDT/LDT Manipulation ExampleExercise 2 - HomeworkAssuming working in flat data segment write the following system call in Cprogramming language●void print_ldt()♦ This function presents the unpacked context of all present segment descriptorsof the current LDT. Assume LDT exists if LDTR != 0.To present a segment descriptor context you can use the following function:void show_descriptor(__int32 index,__int32 base,__int32 size,__int32 type,__int32 dpl,bool system,bool db)Dr. K. Levit-Gurevich,May 31, 2006, p-98MicrocomputersKonstantin Levit-Gurevich 98

Dr. K. Levit-Gurevich,May 31, 2006, p-99Task State Segment (TSS)● HW mechanism to allow multi-tasking● Contains task state:♦ Dynamic: Integer registers, segment registers,flags...♦ Static: CR3, per level stack pointer, etc...● Task switch occurs on♦ JMP/Call to a TSS descriptor or task gate♦ Interrupt indexed to a task gate♦ Certain IRET● On switch:♦ Current TSS content is updatedFor example, general registers are savedby the CPU!♦ New task’s values are retrieved from TSS● Contains initial stack values for level 0-2● System should have at least one TSS.● Task Register (TR) is loaded by LTRinstruction, or during a task switch.31Konstantin Levit-GurevichI/O Map Base AddressEDI15ESIEBPESPEBXEDXECXEAXEFLAGSEIPCR3 (PDBR)ESP2ESP1ESP0DSDSSSCSESSS2SS1SS0Previous Task LinkMicrocomputers0TLDT Segment SelectorGS10096928884807672686460565248444036322824201612840

Task LinkingTop LevelTaskCurrently Nested ExecutingTaskCurrently More Deeply ExecutingNested Task TaskCurrently ExecutingTaskTSSEFLAGS TSSEFLAGS TSSEFLAGSNT = 0NT = 1NT = 1NT = 0NT = 1NT = 1PreviousTask LinkPreviousTask RegisterTask LinkPreviousTask RegisterTask LinkTask Register● Upon CALL instruction, an interrupt, or an exception causing a task switch♦ CPU copies the segment selector of the current TSS into the previous task link field of the TSS forthe new task and sets EFLAGS.NT = 1● Upon IRET instruction suspending the new task:♦ CPU uses the value in the previous task link field and the NT flag to return to the previous task.Dr. K. Levit-Gurevich,May 31, 2006, p-100MicrocomputersKonstantin Levit-Gurevich 100

PagingDr. K. Levit-Gurevich,May 31, 2006, p-101Konstantin Levit-GurevichMicrocomputers

Motivation●The problem:Dr. K. Levit-Gurevich,May 31, 2006, p-102♦ Register can cover 4GB, but actual memory is smaller.♦ Application should be independent of actual memory locationavoid memory size and location concernsallow using fixed, pre-defined addressesallow easier code/data sharing●Solution: Paging♦ memory is divided to equal sized chunks - pages (4KB)♦ A layer on top of segmentation. Performs linear to physical mapping.♦ Address “processing”Logical (Seg:Offset) ==> Linear (segment base+offset) ==> Physical♦ Page tables map linear address to physical. Keeping track of:Valid pages, not existing pages, page attributes●Analogy:♦ Big library, good catalog, reading room, storage roomMicrocomputersKonstantin Levit-Gurevich 102

●A page can bePaging - Virtual Memory♦ New, not yet loaded♦ Loaded♦ Paged out to disk●A loaded page can be♦ Clean♦ Dirty●When a page is not loaded (P bit clear) => a page fault occurs♦ It may require removing a loaded page to insert the new oneOS prioritizes removal by LRU and dirty/clean/avail bitsDirty page should be written to disk. Clean ones need not♦ New page is either loaded from disk or “initialized”♦ CPU sets page “access” flag when accessed, “dirty” when written●Copy-on-write strategy for forked Unix processesDr. K. Levit-Gurevich,May 31, 2006, p-103Konstantin Levit-GurevichMicrocomputers

Paging Related Bits in Control Registers31109876543210Reserved (set to 0)PCEPGEMCEPAEPSEDETSDPVIVMECR4OSXMMEXCPT31OSFXSR121154320Page-Directory BasePCDPWTCR3310Page-Fault Linear AddressCR2310CR13130292819181716156543210PGCDNWAMWPNEETTSEMMPPECR0Dr. K. Levit-Gurevich,May 31, 2006, p-104MicrocomputersKonstantin Levit-Gurevich 104

Paging options●Paging is controlled by three flags in CPU’s control registers:♦CR0.PG - pagingenables the page translation mechanism when set.can be set if CR0.PE flag is set♦CR4.PSE - page size extensionenables large page sizes when set4MB♦CR4.PAE - physical address extensionenables 36-bit physical addresses.can only be used when paging is enabled.Enables 2MB pagesDr. K. Levit-Gurevich,May 31, 2006, p-105MicrocomputersKonstantin Levit-Gurevich 105

Page Tables and Directories●Page Directory♦ an array of 1024 32-bit Page-Directory Entries (PDE) contained in 4KBpage.●Page Table♦ an array of 1024 32-bit Page-Table Entries (PTE) contained in 4KB page.●Page♦ a 4KB, 2MB or 4MB flat address space.●Page-Directory-Pointer Table♦ an array of four 64-bit entries, each of which points to a page directory.♦ only used when the physical address extension is enabled (CR4.PAE=1).Dr. K. Levit-Gurevich,May 31, 2006, p-106MicrocomputersKonstantin Levit-Gurevich 106

Page Translation Mechanism (4KB)●2-level hierarchical mapping♦ using page directory and page tables♦ all pages and page tables are 4KB●Linear address divided to:♦ Dir: 10 bits - index into page directory♦ Table: 10 bits - index into page table♦ Offset: 12 bits - offset in the page●CR3 points to current Page Directory♦ may be changed per process●Sharing♦ different linear to same physicalDr. K. Levit-Gurevich,May 31, 2006, p-107♦ pages from different processes to same physicalMicrocomputersKonstantin Levit-Gurevich 107

Page Translation Mechanism (4KB)Linear Address Space3122Linear AddressLinear Address2112 110Dir Table OffsetLinear AddressPage DirectoryPage TablePagePage Dir EntryPhysical AddressPage Table EntryCR3Dr. K. Levit-Gurevich,May 31, 2006, p-108MicrocomputersKonstantin Levit-Gurevich 108

Page-Directory Entry (4-KByte Page Table)3112119876543210Page-Table Base AddressAvail.GPS0APCDPWTU/SR/WPAvailable for system programmer’s useGlobal page (ignored)Page size (0 indicates 4 Kbytes)Reserved (set to 0)AccessedCache disabledWrite-throughUser/SupervisorRead/WritePresent31Page-Table Entry (4-KByte Page)12119876543210Page Base AddressAvailG0DAPCDPWTU/SR/WPDr. K. Levit-Gurevich,May 31, 2006, p-109Available for system programmer’s useGlobal pageReserved (set to 0)DirtyAccessedCache disabledWrite-throughUser/SupervisorRead/WritePresentMicrocomputersKonstantin Levit-Gurevich 109

Enabling Paging ExamplePhysical Memory at 0x1000:Physical Memory at 0x2000:0x00002003 0x000000000x00000003 0x00001003Virtual Address===============0x00000C34 mov eax, 1000h0x00000C3A mov cr3, eax Load CR30x00000C3E mov eax, cr00x00000C42 or eax, 80000000h0x00000C48 mov cr0, eax Enable paging0x00000C4C jmp 08h:00h Make sure thenext instructionis mapped 1-to-1Dr. K. Levit-Gurevich,May 31, 2006, p-110MicrocomputersKonstantin Levit-Gurevich 110

Page Walk ExampleExerciseAssume that the OperatingSystem operates in flatmemory model and mapsthe whole RAM 1-to-1starting from linear address0xC0000000.Assume also that OS uses both4KB and 4MB pages(CR4.PSE=1).RAMWrite the following kernelfunction in C:● unsigned translate_linear_to_physical(unsigned la)♦ Input: unsigned la – Linear Address0xC0000000♦ Output: unsigned – Corresponding Physical Address or0xFFFFFFFF if not mappedLinearAddressSpaceDr. K. Levit-Gurevich,May 31, 2006, p-111MicrocomputersKonstantin Levit-Gurevich 111

unsigned translate_linear_to_physical(unsigned la){unsigned cr3_val;unsigned pdir_index = la >> 22;unsigned ptable_index = (la & 0x003FFFFF) >> 12;unsigned *pdir, *ptable, pde, pte;// Get CR3__asm {mov eax, cr3mov cr3_val, eax}// Calculate linear address of the Page Directorypdir = (unsigned*)((cr3_val & 0xFFFFF000) + 0xC0000000);// Calculate the corresponding Page Directory Entrypde = pdir[pdir_index];// Check if Page Directory Entry is presentif (pde & 0x1 == 0x0) return 0xFFFFFFFF;Dr. K. Levit-Gurevich,May 31, 2006, p-112MicrocomputersKonstantin Levit-Gurevich 112

Check if the Page is 4MB sizeif (pde & 0x80 != 0x0){return (pde & 0xFFC00000) + (la & 0x003FFFFF);}// Calculate linear address of the Page Tableptable = (unsigned*)((pde & 0xFFFFF000) + 0xC0000000);// Calculate the corresponding Page Table Entrypte = ptable[ptable_index];// Check if Page Table Entry is presentif (pte & 0x1 == 0x0){return 0xFFFFFFFF;}}return (pte & 0xFFFFF000) + (la & 0x00000FFF);Dr. K. Levit-Gurevich,May 31, 2006, p-113MicrocomputersKonstantin Levit-Gurevich 113

Page Fault●#PF - Page Fault Exception occurs if♦ Present bit (P) of page-directory or page-table entries is clear♦ A privilege level violation (User / Supervisor) occurs♦ An access violation (Read-Only / Read-Write) occurs♦ Reserved bits are set●CPU loads CR2 with the faulting address●If the processor generates a page-fault exception due to not presentpage, the OS must carry out the following operations in thefollowing order:♦ Copy the page from disk storage into physical memory, if needed.♦ Load the page address into the page-table or page-directory entry and setits present flag. Other bits, such as the dirty and accessed flags, mayalso be set at this time.♦ Invalidate the current page-table entry in the TLB.♦ Return from the page-fault handler to restart the interrupted program ortask.Dr. K. Levit-Gurevich,May 31, 2006, p-114MicrocomputersKonstantin Levit-Gurevich 114

Page-Level Protection●Can be used alone (flat memory model)♦ allows supervisor (ring0,1,2) code and data to be protected from user(ring3) code and data♦ write protection●Can be applied to segments♦ CPU evaluates segment protection first♦ Page-level protection cannot be used to override segment-levelprotection.For example: code segment is by definition not writable. Setting itspages to Read-Write does not make the pages writable. Attempts towrite into the pages will be blocked by segment-level protectionchecks.Dr. K. Levit-Gurevich,May 31, 2006, p-115MicrocomputersKonstantin Levit-Gurevich 115

Mapping 4M pages,36-bit Physical Addresses●PSE (page size extension) flag bit 4 of CR4♦ The PSE enables 4MByte. Those pages are mapped directly from thepage directory, without using page tables.♦ If PDE.PS=1the page is considered to be 4MB. Linear to physicaladdress translation then is:PA = (PDE & 0xFFC00000) + (LA & 0x003FFFFF)●PAE (physical address extension) flag bit 5 of CR4♦ The PAE flag enables 36 bit physical addresses♦ If CR4.PAE=1 the architecture supports page sizes of 4KB and 2MBregardless of the value of CR4.PSE.Dr. K. Levit-Gurevich,May 31, 2006, p-116MicrocomputersKonstantin Levit-Gurevich 116

Directory PointerPage Translation Mechanism (4KB)with Physical Address ExtensionLinear Address3130 2921 2012 11Dir Table Offset09Page Directory9Page Table12Page2Page Dir EntryPhysical AddressPage Table EntryPage DirectoryPointer TableDir Pointer Entry4PDPTE * 512 PDE * 512 PTE = 2 20 pages32CR3 (PDBR)Dr. K. Levit-Gurevich,May 31, 2006, p-117MicrocomputersKonstantin Levit-Gurevich 117

63Page-Directory-Pointer-Table Entry363532Reserved (set to 0)BaseAddr.3112119876543210Page-Directory Base AddressAvailReservedPCDPWTRes1Page-Directory Entry (4-KByte Page Table)63363532Reserved (set to 0)BaseAddr.3112119876543210Page-Table Base AddressAvail000APCDPWTU/SR/WP63Page-Table Entry (4-KByte Page)363532Reserved (set to 0)BaseAddr.31Dr. K. Levit-Gurevich,May 31, 2006, p-118Page Base Address1211Avail98G70MicrocomputersKonstantin Levit-Gurevich 1186D5A4PCD3PWT2U/S1R/W0P

Enabling Physical Address Extension●Start from Paging Enabled and PhysicalAddress Extension disabled(CR0.PG=1, CR4.PAE=0)●Disable Paging(CR0.PG=0, CR4.PAE=0)PGPECR0●Enable physical address extension(CR0.PG=0, CR4.PAE=1)●Load CR3 with the physical address ofthe Page Directory Pointer TablePDPT Physical AddressPAECR3CR4●Enable Paging(CR0.PG=1, CR4.PAE=1)Dr. K. Levit-Gurevich,May 31, 2006, p-119MicrocomputersKonstantin Levit-Gurevich 119

Paging/Segmentation Example●GDTR: 0x00002000 (linear)●CR3: 0x00003000 (physical)●Logical address:♦ DS: 0x23♦ Offset: 0x3456●DS: 0010 0011 ==> index = 4, TI = GDT, PL = 3GDT(4) = GDTR + 4*8 = 0x2000 + 0x20 = 0x2020●Assume protection validation has passed… andGDT(4).base = 0x00c03000 (linear)●Linear address = 0x00c03000 + 0x3456 = 0x00c064560000 0000 1100 0000 0110 0100 0101 0110♦ PDE_index = 0x003♦ PTE_index = 0x006♦ Offset = 0x456Dr. K. Levit-Gurevich,May 31, 2006, p-120MicrocomputersKonstantin Levit-Gurevich 120

Paging/Segmentation Example (cont)●Physical address of PDE isPDE = CR3 + 4 * PDE_index = 0x3000 + 4 * 3 = 0x300cM(0x300c) = 0x4000●Physical address of PTE isPTE = *PDE + 4 * PTE_index = 0x4000 + 4 * 6 = 0x4018M(0x4018) = 0x5000●Physical address corresponding to linear address isPhysical Addr. = *PTE + offset = 0x5000 + 0x456 = 0x5456M(0x5456) = 0x12345678●Finished? Not exactly...Dr. K. Levit-Gurevich,May 31, 2006, p-121MicrocomputersKonstantin Levit-Gurevich 121

Paging/Segmentation Example (cont)●Where do we find the GDT?♦ GDTR points to 0x2000 linear, need to look at the physical address tosee GDT(4)!●Linear address = 0x00002020 =♦ PDE♦ PTE♦ Offset0000 0000 0000 0000 0010 0000 0010 0000= 0x000= 0x002= 0x020PDE = CR3 + 4 * PDE_index = 0x3000 + 4 * 0 = 0x3000M(0x3000) = 0x6000PTE = *PDE + 4 * PTE_index = 0x6000 + 4 * 2 = 0x6008M(0x6008) = 0x5000Physical Addr. = *PTE + offset = 0x5000 + 0x20 = 0x5020M(0x5020) = 0x0010f0c03000fedcDr. K. Levit-Gurevich,May 31, 2006, p-122MicrocomputersKonstantin Levit-Gurevich 122

Page Tables in LinuxDr. K. Levit-Gurevich,May 31, 2006, p-123MicrocomputersKonstantin Levit-Gurevich 123

Windows NT Paging Management●Reserves 4MB of the system addressspace for Page Tables:0xC0000000-0xC03FFFFF●Page Directory functions:♦ Page Directory for 4GB space♦ Page Table for pages representing thePage Tables●Page Directory virtual address:0x300Page Directory0xC03000001100000000 1100000000 0000000000000x300 0x300DIR TABLECR3Dr. K. Levit-Gurevich,May 31, 2006, p-124MicrocomputersKonstantin Levit-Gurevich 124

ExerciseDr. K. Levit-Gurevich,May 31, 2006, p-125Windows NT Paging ManagementAssuming working in Windows NT, write the following functions in “C”:●bool is_present(unsigned int la)♦ Input: unsigned int la – Linear Address♦ Output: TRUE if present, FALSE otherwise●void map_page(unsigned int la,unsigned int pa,bool is_4kb_page,unsigned attr)♦ Input: unsigned int la – Linear Address to be mappedunsigned int pa – Corresponding Physical Addressbool is_4kb_page – TRUE if 4KB, FALSE if 4MBunsigned attr – page attributes (R/W, U/S, etc)●unsigned int virtual_to_physical(unsigned int la)♦ Input: unsigned int la – Linear Address♦ Output: unsigned int – Corresponding Physical AddressMicrocomputersKonstantin Levit-Gurevich 125

ool is_present(unsigned la){unsigned pdir_index = la >> 22;unsigned* ptables = (unsigned*)0xC0000000;unsigned* pdir = (unsigned*)0xC0300000;// Check if Page Directory Entry is presentif (pdir[pdir_index] & 0x1 == 0x0) {return false;}// Check if 4MB pageif (pdir[pdir_index] & 0x80) {return true;}// Check if Page Table Entry is presentif (ptables[la >> 12] & 0x1) {return true;}}return false;Dr. K. Levit-Gurevich,May 31, 2006, p-126MicrocomputersKonstantin Levit-Gurevich 126

unsigned virtual_to_physical(unsigned la){unsigned pdir_index = la >> 22;unsigned* ptables = (unsigned*)0xC0000000;unsigned* pdir = (unsigned*)0xC0300000;// Check if Page Directory Entry is presentif (pdir[pdir_index] & 0x1 == 0x0)return 0xFFFFFFFF;// Check if 4MB pageif (pdir[pdir_index] & 0x80)return (pdir[pdir_index] & 0xFFC00000) +(la & 0x003FFFFF);// Check if Page Table Entry is presentif (ptables[la >> 12] & 0x1)return (ptables[la >> 12] & 0xFFFFF000) +(la & 0x00000FFF);}return 0xFFFFFFFF;Dr. K. Levit-Gurevich,May 31, 2006, p-127MicrocomputersKonstantin Levit-Gurevich 127

Windows 2000 Memory Layout0xFFFFFFFF1-GB System Space0xFFFFFFFF0xC0800000System CachePaged poolNonpaged pool0xC00000000xBFFFFFFF0xC0000000Process Page TablesKernel and executiveHALBoot Drivers3-GB addressspace optioncan be enabledby the /3GB flagin Boot.ini3-GB User Space0x800000000x7FFFFFFFApplication CodeGlobal Variablesper-thread StacksDLL codeDr. K. Levit-Gurevich,May 31, 2006, p-1280x000000000x00000000MicrocomputersKonstantin Levit-Gurevich 128

System Memory Map for Windows NT0xFFFFFFFF0xFFDFF0000xFFDFEFFF0xFB0000000xE57FFFFF0xE10000000xD8FFFFFF0xC10000000xC0FFFFFF0xC00000000xBFFFFFFF0xA00000000x9FFFFFFF0x80000000Processor ControlNon Paged SystemPaged PoolSystem CacheSystem TablesSystem View SpaceSystem CodeProcessor Control RegionSystem Drivers, Kernel Thread Stacks,NonPageable Pool, Pageframe DatabasePageable PoolWindows NT Cache Manager Mapping AreaPage Directory and Page TablesWin32 GDI Object Table, Win32 USER Object TableBoot Drivers, NT OS Kernel, HAL, TSS, GDT, IDTDr. K. Levit-Gurevich,May 31, 2006, p-129Konstantin Levit-GurevichMicrocomputers

Memory Model in Windows NT● Minimize usage of segments♦ Use paging for address and access check● Address error:Memory access beyond program limits is checked by the OSfollowing page fault exception● Access error:Memory Access is checked for read/write & privilege level in thepage table and the address segment attribute● There are different privilege levels for the OS and the user context(need to remember this during interrupt time)● The FS segment register is used as a Task pointer to the OSdatabase● WinNT Still supports 16-bit and real mode segmentationapplications with Virtual X86 (need to remember during Interruptservice)Dr. K. Levit-Gurevich,May 31, 2006, p-130Konstantin Levit-GurevichMicrocomputers

Windows NT Page TablesPageTable4KProcess 1’s CR3PageDirectoryPageTablePhysicalMemorypagesPageTable4KPageTable4KProcess 2’s CR3PageDirectoryPageTable4KPageTable4KSystem PageTableDr. K. Levit-Gurevich,May 31, 2006, p-131Konstantin Levit-GurevichMicrocomputers

Translation Lookaside Buffers (TLBs)●The processors stores the most recently used page-directory andpage-table entries in on-chip caches called TLBs.●The P6 family and Pentium processors have separate TLBs for thedata and instruction caches.●Also, P6 family processors maintain separate TLBs for 4-Kbyteand 4-Mbyte page sizes.●Translates the linear address to physical address for all memoryload and store.●Lookups in a cache array for the physical address of the page beingaccessed.●Caches page attributes with the physical address, and uses thisinformation to check for page protection faults and other pagingrelated exceptions.●Also caches the effective memory type with each page address.Dr. K. Levit-Gurevich,May 31, 2006, p-132MicrocomputersKonstantin Levit-Gurevich 132

Invalidating the TLBs●INVLPG instruction - invalidates the TLB for a specific page (is notaffected by the state of the G flag in PDE or PTE).●The following operations invalidate all TLB entries except global entries:♦Writing into control register CR3.♦A task switch that changes control register CR3.●The following operations invalidate all TLB entries, irrespective of thesetting of the G flag:♦Asserting or de-asserting the FLUSH# pin.♦Writing to an MTRR (with a WRMSR instruction).♦Writing to control register CR0 to modify the PG or PE flag.♦Writing to control register CR4 to modify the PSE, PGE, or PAE flag.Dr. K. Levit-Gurevich,May 31, 2006, p-133MicrocomputersKonstantin Levit-Gurevich 133

Paging Pros and Cons●Independence on physical memory size●No need to preserve continuous memory ranges●Physical memory is allocated only if accessed●Can map same linear address to different physical address●Can share data/code among processes●Pages have predefined size - easy memory management●Simple protection mechanism●Overhead of tables (insignificant)●Internal fragmentation (insignificant)●Translation time - reduced by TLBDr. K. Levit-Gurevich,May 31, 2006, p-134MicrocomputersKonstantin Levit-Gurevich 134

ProtectionDr. K. Levit-Gurevich,May 31, 2006, p-135Konstantin Levit-GurevichMicrocomputers

Protected Mode●Designed to provide amechanism protectingOperating System codeand data from beingcorrupted by userapplications.Reset orPE=0Real-AddressModePE = 1ResetorRSMSMI#●Elements:♦ Segmentation♦ Paging♦ Privilege levels♦ Privilege instructionsResetVM = 0Protected ModeVirtual-8086ModeVM = 1SMI#RSMSMI#RSMSystemManagementModeDr. K. Levit-Gurevich,May 31, 2006, p-136MicrocomputersKonstantin Levit-Gurevich 136

Protected Mode Bit in Control Registers31109876543210Reserved (set to 0)PCEPGEMCEPAEPSEDETSDPVIVMECR4OSXMMEXCPT31OSFXSR121154320Page-Directory BasePCDPWTCR3310Page-Fault Linear AddressCR2310CR13130292819181716156543210PGCDNWAMWPNEETTSEMMPPECR0Dr. K. Levit-Gurevich,May 31, 2006, p-137MicrocomputersKonstantin Levit-Gurevich 137

Switching to Protected Mode●Real Mode●Set CR0.PE=1♦ Protected 16-bit mode●Load CS with a 32-bit code segment♦ Protected 32-bit mode●Set CR0.PG=1♦ Protected Mode with Paging♦ Make sure that the current page is mapped 1-to-1 in page tablesDr. K. Levit-Gurevich,May 31, 2006, p-138MicrocomputersKonstantin Levit-Gurevich 138

Dr. K. Levit-Gurevich,May 31, 2006, p-139Switching to Protected Mode0x9008: 0x90100018 0x00000000 0x00000000 0x000000000x9018: 0x0000ffff 0x00cf9b00 0x0000ffff 0x00cf93000x9028: cli

Dr. K. Levit-Gurevich,May 31, 2006, p-140Protection Checks●Loading Segment Register♦ Privilege level checks♦ Segment type checksLLDTLTR●Memory Accesses♦ Segment level protection♦ Page level protection●Instruction Execution♦ Restriction of instruction set●Transfer Execution Control♦ When transforming control between code segments♦ Type checksFar CALL and JMP♦ Limit checks●All protection violation results in an exception being generated.MicrocomputersKonstantin Levit-Gurevich 140

Privilege Levels●Lower number => higher privilege●Code can access data of equal/lowerprivilege levels only●Code can call more privileged datavia “call gates”●Each level has its own stack!●Some instructions and I/O operationsare restricted to certain privilegelevels●Also referred to as:“Rings”OperatingSystemKernelOperatingSystemServicesApplicationsProtection RingsLevel 0Level 1Level 2Level 3Dr. K. Levit-Gurevich,May 31, 2006, p-141Konstantin Levit-GurevichMicrocomputers

Privilege Level Checks●CPL – Current Privilege Level● Privilege level of the currently executed program or task● Stored in bits 0 and 1 of the CS and SS segment registers●DPL – Descriptor Privilege Level● Privilege level of a segment or gate● Stored in the DPL field of the segment or gate descriptor●RPL – Requested Privilege Level● Override privilege level that is assigned to segment selectors● Stored in bits 0 and 1 of the segment selector● Used by OS to protect against user Trojan Horses – user code calls OScode and passes a pointer to a privileged data areaDr. K. Levit-Gurevich,May 31, 2006, p-142Konstantin Levit-GurevichMicrocomputers

Loading Data Segment Register●Stack Segment (SS)● Segment Selector is not Null ( ≥ 0x3 )● Segment index is within descriptor table limit● Segment Descriptor is marked as Present● Segment is writable data-segment● CPL = RPL = DPL●Data Segments (DS, ES, FS, GS)● Segment Descriptor is marked as Present● Segment index is within descriptor table limit● Segment is a data or readable code segment● Not both RPL and CPL are greater than DPL● Null selector can be loaded into DS, ES, FS, and GS without causingexception● Memory access with a Null segment causes exceptionDr. K. Levit-Gurevich,May 31, 2006, p-143Konstantin Levit-GurevichMicrocomputers

Privilege Check for Data AccessCS RegisterCPLSegment SelectorFor Data SegmentRPLPrivilegeCheckDPL >= max{RPL, CPL}otherwise: exception!Data-SegmentDescriptorDPLDr. K. Levit-Gurevich,May 31, 2006, p-144Konstantin Levit-GurevichMicrocomputers

Examples of Accessing Data Segments3CodeSegment CCPL = 3Segment Sel.E3RPL = 32CodeSegment ACPL = 2Segment Sel.E1RPL = 2DataSegment EDPL = 21CodeSegment BCPL = 1Segment Sel.E2RPL = 1CodeSegment D0Highest PrivilegeCPL = 0Dr. K. Levit-Gurevich,May 31, 2006, p-145MicrocomputersKonstantin Levit-Gurevich 145

Loading Code Segment Register●To transfer program control from one code segment to another, thesegment selector for the destination code segment must be loadedinto the code segment register (CS).●Program control transfers are carried out with● JMP, CALL, RET, INT n, IRET, SYSENTER and SYSEXIT instructions● Exception and interrupt mechanism●As part of CS loading process, the CPU examines the segmentdescriptor for the destination code segment and performs variouslimit, type, and privilege checks.Dr. K. Levit-Gurevich,May 31, 2006, p-146Konstantin Levit-GurevichMicrocomputers

Loading Code Segment Register●JMP and CALL instructions can reference another code segment inany of four ways:● Target operand contains the segment selector for the target code segment● Target operand points to a call-gate descriptor, which contains the segmentselector for the target code segment● Target operand points to a TSS, which contains the segment selector for thetarget code segment● Target operand points to a task gate, which points to a TSS, which in turncontains the segment selector for the target code segmentDr. K. Levit-Gurevich,May 31, 2006, p-147Konstantin Levit-GurevichMicrocomputers

Direct Calls and Jumps to Code SegmentCS RegisterCPLSegment SelectorFor Code SegmentRPLPrivilegeCheckNonconforming:RPL = DPLDPLCDr. K. Levit-Gurevich,May 31, 2006, p-148Konstantin Levit-GurevichMicrocomputers

Examples of Accessing Code SegmentsCodeSegment BSegment Sel.D2RPL = 33CPL = 3Segment Sel.C2RPL = 32CodeSegment ACPL = 2Segment Sel.C1Segment Sel.D1RPL = 2RPL = 2CodeSegment CDPL = 2NonconformingCode SegmentCodeSegment D1DPL = 1ConformingCode Segment0Highest PrivilegeDr. K. Levit-Gurevich,May 31, 2006, p-149MicrocomputersKonstantin Levit-Gurevich 149

Call Gate31161514131211876540Offset in Segment 31:16PDPL0Type1 1 0 00 0 0Param.Count43116150Segment Selector Offset in Segment15:00 0● Call Gate specifies♦ The code segment to be accessed♦ An entry point for a procedure in the specified code segment♦ The privilege level required for a caller trying to access the procedure♦ If a stack switch occurs, the number of optional parameters to be copiedbetween stacks♦ The size of values to be pushed onto the target stack: 16-bit or 32-bit♦ Whether the call-gate descriptor is validDr. K. Levit-Gurevich,May 31, 2006, p-150Konstantin Levit-GurevichMicrocomputers

Accessing a Code Segment ThroughFar Pointer to Call Gatea Call GateSegment SelectorOffsetRequired but not used by processorDescriptor TableOffsetSegment SelectorOffsetCall-GateDescriptor+Base Base Code-SegmentBaseDescriptorProcedureEntry PointDr. K. Levit-Gurevich,May 31, 2006, p-151Konstantin Levit-GurevichMicrocomputers

Privilege Check Rules for Call GateCS RegisterCPLCall-Gate selectorRPLCall Gate (Descriptor)DPLPrivilegeCheckmax{CPL,RPL}

Examples of Accessing Call Gates3CodeSegment ACPL = 3Gate SelectorAGate SelectorB3RPL = 3RPL = 3CallGate ADPL = 32CodeSegment BCPL = 2Gate SelectorB1RPL = 2CallGate BDPL = 21CodeSegment CCPL = 1Gate SelectorB2RPL = 1No StackSwitch OccursStack SwitchOccursCodeSegment DCodeSegment E0Highest PrivilegeDPL = 0ConformingCode SegmentDPL = 0NonconformingCode SegmentDr. K. Levit-Gurevich,May 31, 2006, p-153MicrocomputersKonstantin Levit-Gurevich 153

Inter Level Call - Stack Switching● On inter-level call: CPL changes, Control transferred, Stack switched● Stack Switch♦ New (privileged) SS:ESP is taken from the TSS♦ Old SS:ESP is stored on new stack♦ Parameters are copied to new stackCalling Procedure’s StackBefore CallCalled Procedure’s StackBefore ReturnSS 0:ESP 0from TSSCalling Procedure’s StackAfter Return(far RETn n = 3)XYZParameter 1Parameter 2Parameter 3ESP 3Calling SSCalling ESPParameter 1Parameter 2ESP 3Parameter 3Calling CSCalling EIPUnique for interlevel callsDr. K. Levit-Gurevich,May 31, 2006, p-154MicrocomputersKonstantin Levit-Gurevich 154

Privilege Instructions●An execution of privilege instructions in privilege level other than 0causes exception♦ LGDT♦ LLDT♦ LTR♦ LIDT♦ MOV CR♦ LMSW♦ CLTS♦ MOV DR♦ INVD♦ WBINVD♦ INVLPG♦ HLT♦ RDMSR/WRMSR♦ RDPMC♦ RDTSCDr. K. Levit-Gurevich,May 31, 2006, p-155– Load GDT register– Load LDT register– Load task register– Load IDT register– Load and store control registers– Load machine status word– Clear task-switch flag in register CR0– Load and store debug registers– Invalidate cache without write-back– Invalidate cache with write-back– Invalidate TLB entry– Halt processor– Read/Write Model-Specific Registers– Read Performance-Monitoring Counters– Read Time-Stamp CounterMicrocomputersKonstantin Levit-Gurevich 155

Fast System Call● SYSENTER instruction♦ Provides maximum performance for ring3 ring0 privilege level change♦ Introduced in Pentium-II processor● The following Model Specific Registers should be set prior execution:♦ SYSENTER_CS_MSR — Contains the segment selector for the privilege level 0code segment.♦ SYSENTER_EIP_MSR — Contains the offset into the privilege level 0 codesegment to the first instruction of the selected operating procedure or routine.♦ SYSENTER_ESP_MSR—Contains the stack pointer for the privilege level 0 stack.● SYSENTER operation:1. CS SYSENTER_CS_MSR2. EIP SYSENTER_EIP_MSR3. SS SYSENTER_CS_MSR + 84. ESP SYSENTER_ESP_MSR5. Switches to privilege level 0.6. EFLAGS.VM = 0Dr. K. Levit-Gurevich,May 31, 2006, p-156MicrocomputersKonstantin Levit-Gurevich 156

● SYSEXIT instructionDr. K. Levit-Gurevich,May 31, 2006, p-157Fast Return to User Level♦ Provides maximum performance for ring0 ring3 privilege level change♦ Introduced in Pentium-II processor● Players:♦ SYSENTER_CS_MSR — Contains the segment selector for the privilege level 0 codesegment.♦ EDX —Contains the offset into the privilege level 3 code segment to the firstinstruction to be executed in the user code.♦ ECX—Contains the stack pointer for the privilege level 3 stack.● SYSEXIT operation:1. CS SYSENTER_CS_MSR + 162. EIP EDX3. SS SYSENTER_CS_MSR + 244. ESP ECX5. Switches to privilege level 3.● Note: SYSENTER/SYSEXIT do not constitute a CALL/RET. No data is passedon the stack.MicrocomputersKonstantin Levit-Gurevich 157

Dr. K. Levit-Gurevich,May 31, 2006, p-158Protected ModeImplications and Observations (1)●Software conversion♦ Straightforward if software does not assume linear = seg*16+offset●Real mode “tricks” do not work. Usually for the better:♦ Cannot write on code♦ Cannot access segment 0♦ Cannot jump/write into the OS●New issues♦ How to write on code (self-modifying code, SMC) - use aliasing♦ How to use OS services - via call gates♦ How to access a known linear address -Map a segment to a known base. e.g., use 4G at base 0.♦ How to protect between processes (applications) -Use LDT per process. Reload LDTR at context switch●Additional related mechanisms exist - not discussedKonstantin Levit-GurevichMicrocomputers

Protected ModeImplications and Observations (2)●Segmentation performance impactIssues:♦ Indirect vs direct addressing♦ Protection checkSolution: information is cached for each active segment♦ Almost no overhead on memory reference... But♦ Change of a segment register is costly!●Segment based virtual memory♦ Possible - through use of “segment not present” bit♦ Unattractive - due to variable length of segmentsDr. K. Levit-Gurevich,May 31, 2006, p-159Konstantin Levit-GurevichMicrocomputers

Protected Mode in Practice●Value of segmentation/protected mode♦ Only option for a 16-bit OS♦ 32-bit OS prefers flat model, page based protection●Call gates used for system calls by Linux♦ Not by Windows NT●Task switch mechanism rarely if ever used●OS/2 uses 32-bit segmentation●Exotic uses for segment registers♦ FS used by NT for multi-threadingDr. K. Levit-Gurevich,May 31, 2006, p-160MicrocomputersKonstantin Levit-Gurevich 160

Interrupts and ExceptionsDr. K. Levit-Gurevich,May 31, 2006, p-161Konstantin Levit-GurevichMicrocomputers

What is it?“A signal from hardware or software, such as a keystroke,that demands immediate attention and takes priority overother operations”●Analogy:♦ Alarm clock♦ PainDr. K. Levit-Gurevich,May 31, 2006, p-162Konstantin Levit-GurevichMicrocomputers

What Causes It?●Interrupts♦ External to the CPU♦ Peripheral: I/O: mouse move/click, keyboard, timer●Exceptions♦ Internal to the CPU♦ Exceptions: Processor detected:Fault: restart from the causing instruction (e.g., page fault)Trap: restart from the next instruction (e.g., overflow)Abort: cannot restart (double fault)●Programmed Exceptions (“software interrupts”):♦ e.g.: INTO, INT3, INT n, BOUND●The terms “interrupts” & “exceptions” are frequently intermixedDr. K. Levit-Gurevich,May 31, 2006, p-163Konstantin Levit-GurevichMicrocomputers

Why Is It Important?●Inevitable: only way to implement a feature♦ Cannot implement timers, page faults, ^break etc. without it●Difficult without: alternatives imply huge program overhead♦ e.g., polling to identify keyboard stroke●Convenience: cheaper way to implement a feature♦ e.g., zero divide, INTO, BOUND, etc..●Conventions: Application/OS communication♦ System call. Allow using procedures without linkageDr. K. Levit-Gurevich,May 31, 2006, p-164Konstantin Levit-GurevichMicrocomputers

How Does It Work?●Internal interrupts: handled completely by CPU + SW●External interrupt involves CPU + SW + peripherals♦ External device delivers “INTR” signal (or NMI, SMI)♦ CPU acknowledges with “INTA” bus cycle♦ Device sends interrupt# on Data bus♦ Usually requires a PIC (Programmable Interrupt Controller) in system●External interrupts (except NMI/SMI) can be masked (IF=0)IntrDEVICE Signal PIC CPUIntANmi,..DataDr. K. Levit-Gurevich,May 31, 2006, p-165Konstantin Levit-GurevichMicrocomputers

Additional Players●Flags♦ Interrupt Flag EFLAGS.IF flag: masks external interruptsSet/Clear by STI/CLI instructions♦ Trap Flag EFLAGS.TF flag: is the CPU in TRAP mode (single-step)Set/Clear by EFLAGS manipulation♦ Resume Flag EFLAGS.RF flag: when set disables debug-exception●Instruction♦ IRET: “return from interrupt” = RET + few more things♦ Software interrupts: INT n, INT3, INTO (4), BOUND (interrupt 5)●Signals (for external interrupts)♦ INTR: interrupt request♦ INTA: interrupt acknowledge (bus cycle) = IO#*C*R#♦ Data Bus: pass interrupt #♦ NMI, SMI: additional interrupt linesNMI - non maskable (timer, power fail)SMI - system management (OS independent power management)Dr. K. Levit-Gurevich,May 31, 2006, p-166Konstantin Levit-GurevichMicrocomputers

Interrupt Handling●Hardware side♦ Ensure transparency to the affected taskWill return to the point of interruption w/ the right flagsDoes the minimum - only what software cannot do!● Interrupt handlers♦ Transparent: preserve all used registers!♦ Short as possible. If cannot - enable interrupts during handlingDo not disable interrupts for a long time!♦ Avoid nesting of same interrupts, or make sure the software can handlethat!♦ Some of the information can/should be extracted from the Stack(old CS:EIP, old SS:ESP, EFLAGS, error code)●Interrupt handler writers must fully understand the relevant HWinteraction (PIC + actual peripheral)Dr. K. Levit-Gurevich,May 31, 2006, p-167Konstantin Levit-GurevichMicrocomputers

Exception and Interrupt Vectors●Each exception and interrupt is associated with an identification number,called a vector.♦ Vectors 0-31 are assigned to the exceptions and NMI interrupt9,15,20-31 – Intel Reserved♦ Vectors 32-255 are designated and user defined interrupts.●Exception Classification♦ FaultAn exception that can generally be corrected and then, once corrected,allows the program to be restarted with no loss of continuity♦ TrapAn exception that is reported immediately following the execution of thetrapping instruction. Traps allow execution of a program or task to becontinued without loss of program continuity♦ AbortAn exception that does not always report the precise location of theinstruction causing the execution and does not allow restart of the programor task that caused the executionDr. K. Levit-Gurevich,May 31, 2006, p-168MicrocomputersKonstantin Levit-Gurevich 168

Protected-Mode ExceptionsNo.012345678101112131416171819#DE#DB-Mnemonic#BP#OF#BR#UD#NM#DF#TS#NP#SS#GP#PF#MF#AC#MC#XFDr. K. Levit-Gurevich,May 31, 2006, p-169Divide by ZeroDebugNMI InterruptBreakpointOverflowBOUND Range ExceededInvalid OpcodeDevice Not availableDouble FaultInvalid TSSSegment Not PresentStack-Segment FaultGeneral Protection FaultPage FaultDescriptionFloating-Point ErrorAlignment CheckMachine CheckStreaming SIMDExtensionsFaultFault/TrapInterruptTrapTrapTypeFaultFaultFaultAbortFaultFaultFaultFaultFaultFaultNoNoNoNoNoNoNoNoYes(Zero)YesYesYesYesYesNoErrorCodeDIV / IDIV instructionsDR conditions / INT1 instruction.Nonmaskable external interruptINT3 instructionINTO instructionSourceBOUND instructionUD2 instruction / reserved opcodeFP or WAIT/FWAIT instructionsAny exception, interrupt, NMITask switch or TSS accessLoading SR / accessing systemsegmentStack operations / SS register loadsMemory references / protection checksAny memory referenceFP or WAIT/FWAIT instructionsFault Yes Any data reference in memoryAbort (Zero) Model dependentFault No SIMD floating-point instructionsMicrocomputersKonstantin Levit-Gurevich 169

Exceptions and Interrupts PriorityPriority1 (Highest)2345678 (Lowest)DescriptionsHardware Reset and Machine ChecksTrap on Task SwitchExternal Hardware Interventions-FLUSH, STOPCLK, SMI, INITTraps on the Previous Instruction- Breakpoints, Debug TrapsExternal Interrupts- NMI, Maskable Hardware InterruptsFaults from Fetching Next Instruction- Code breakpoint, code-segment violation, Code page faultFaults from Decoding the Next Instruction- Instruction length > 15, illegal OpcodeFaults on Executing an InstructionDr. K. Levit-Gurevich,May 31, 2006, p-170Konstantin Levit-GurevichMicrocomputers

Dr. K. Levit-Gurevich,May 31, 2006, p-171Interrupt Descriptor Table●Interrupt Descriptor Table (IDT)♦ Located in virtual memory♦ Each entry contains 8-bytes InterruptDescriptor●Interrupt Descriptor Table Register(IDTR)♦ Contains:IDT Linear Base AddressIDT Limit in bytes, max 0xFFFF●Operating System loads IDTR byexecuting LIDT instruction♦ LIDT is a privilege instruction♦ Can be executed only in ring0●One can observe IDTR by performingSIDT instruction♦ SIDT can be executed in any privilegelevel47IDT Base Address+IDTR Register31IDT LimitInterruptDescriptor Table (IDT)Gate forInterrupt #nGate forInterrupt #3Gate forInterrupt #2Gate forInterrupt #1MicrocomputersKonstantin Levit-Gurevich 171161500(n-1)*81680

IDT DescriptorsTask gate31 16 15 1413 12 8 70DP P 0 0 1 0 1L3116150TSS Segment SelectorInterrupt Gate31 16 15 1413 12 8 7 5 40Offset 31..16PDPL0 D 1 1 0 0 0 031 16 150Segment Selector Offset 15..0Trap gate31 16 15 1413 12 8 7 5 40Offset 31..16PDPL0 D 1 1 1 0 0 031 16 150Segment Selector Offset 15..0Dr. K. Levit-Gurevich,May 31, 2006, p-172MicrocomputersKonstantin Levit-Gurevich 172

Linux IDT - ExampleDr. K. Levit-Gurevich,May 31, 2006, p-173MicrocomputersKonstantin Levit-Gurevich 173

Interrupt Procedure CallIDTDestinationCode SegmentInterruptVectorInterrupt orTrap GateOffset+InterruptProcedureSegment SelectorGDT or LDTBaseAddressSegmentDescriptorDr. K. Levit-Gurevich,May 31, 2006, p-174MicrocomputersKonstantin Levit-Gurevich 174

Stack Usage on Interrupt TransferStack Usage with No Privilege-Level ChangeInterrupted Procedure’sand Handler’s StackEFLAGSCSEIPError CodeESP BeforeTransfer to HandlerESP AfterTransfer to HandlerStack Usage with Privilege-Level ChangeInterrupted Procedure’sStackESP BeforeTransfer to HandlerESP AfterTransfer to HandlerHandler’s StackSSESPEFLAGSCSEIPError CodeDr. K. Levit-Gurevich,May 31, 2006, p-175MicrocomputersKonstantin Levit-Gurevich 175

Exception Error CodesGeneral Error Code31 16 153 2 1 0Segment Selector IndexTIIDTEXTIndex refers to descriptor in GDT (0) or LDT (1)Index refers to descriptor in IDT (1) or GDT/LDT (0)External (1) or internal (0) eventPage-Fault Error Code31 4 3 2 1 0RSVDU/SR/ PWDr. K. Levit-Gurevich,May 31, 2006, p-176Reserved Bits Violation (1)System (0) or user (1) codeRead (0) or Write (1) operationNonpresent page (0) or page-level protection violation (1)MicrocomputersKonstantin Levit-Gurevich 176

Dr. K. Levit-Gurevich,May 31, 2006, p-177Interrupt and Exception ClassesClassBenign Exception and InterruptsContributory ExceptionsPage FaultsVector Number1234567916171819AllAll01011121314Debug ExceptionNMI InterruptBreakpointOverflowBOUND Range ExceededInvalid OpcodeDevice Not AvailableCoprocessor Segment OverrunFloating-Point ErrorAlignment CheckMachine CheckSIMD floating-point extensionsSoftware Interrupts INT nExternal Interrupts INTRDivide ErrorInvalid TSSSegment Not PresentStack FaultGeneral ProtectionPage FaultDescriptionMicrocomputersKonstantin Levit-Gurevich 177

Conditions for Generating a Double FaultFirst ExceptionSecond ExceptionBenignContributoryPage FaultBenignHandle ExceptionsHandle ExceptionsHandle ExceptionsSeriallySeriallySeriallyContributoryHandle ExceptionsSeriallyGenerate a DoubleFaultHandle ExceptionsSeriallyPage FaultHandle ExceptionsSeriallyGenerate a DoubleFaultGenerate a DoubleFaultDr. K. Levit-Gurevich,May 31, 2006, p-178MicrocomputersKonstantin Levit-Gurevich 178

Triple Fault and Shutdown State●Triple fault is a case when another exception occurs whileattempting to call the double-fault handler, the processor entersshutdown mode.●Shutdown mode is similar to the state following execution of anHLT instruction.●In this mode the processor stops executing instructions until anNMI interrupt, SMI interrupt, hardware reset, or INIT# is received.Dr. K. Levit-Gurevich,May 31, 2006, p-179MicrocomputersKonstantin Levit-Gurevich 179

Example – Exercise● Load IDT with the limit equal to 0xFF(32 descriptors)♦ All External Interrupts descriptors (32-255) exceed IDT limit● When external interrupt is beingdelivered the processor does thefollowing steps:♦ Trying to access descriptor according tointerrupt vector♦ Detecting that the descriptor exceeds IDTlimit♦ Generating of a general-protection fault#GP(vector)♦ Error code contains the vector numberand IDT bit is set● #GP handler can check the IDT and EXTbits of the error code and retrieve theexternal interrupt vector in the selectorpart of the error code● Write an example of such #GP handler.InterruptDescriptor Table (IDT)Gate forInterrupt #255Gate forInterrupt #32Gate forInterrupt #31Gate forInterrupt #2Gate forInterrupt #00x1000Dr. K. Levit-Gurevich,May 31, 2006, p-180MicrocomputersKonstantin Levit-Gurevich 180

typedef struct IDTR_s {__int64 limit : 16,base : 32,res : 16;} IDTR_t;typedef struct Interrupt_Gate_s {__int64 offset_15_00 : 16,seg_sel : 16,res : 12,system : 1,dpl : 2,present : 1,offset_31_16 : 16;} Interrupt_Gate_t;unsigned handler;unsigned original_gp_handler;extern void exception_13(void);Dr. K. Levit-Gurevich,May 31, 2006, p-181MicrocomputersKonstantin Levit-Gurevich 181

void get_handler(unsigned error_code){Interrupt_Gate_t* idt;IDTR_tidtr;// Get IDTR__asm{sidt idtr}// Assign the base pointeridt = (Interrupt_Gate_t*)idtr.base;}// Get the required handler offsethandler = (idt[error_code >> 3].offset_31_16 > 3].offset_15_00;Dr. K. Levit-Gurevich,May 31, 2006, p-182MicrocomputersKonstantin Levit-Gurevich 182

void install_our_gp_handler(){Interrupt_Gate_t* idt;IDTR_tidtr;// Get IDTR__asm{sidt idtr}// Assign the base pointeridt = (Interrupt_Gate_t*)idtr.base;// Save the original GP handleroriginal_gp_handler = (idt[0xd].offset_31_16

Replace the GP handler with our handleridt[0xd].offset_15_00 = exception_13 & 0x0000FFFF;idt[0xd].offset_31_16 = exception_13 >> 16;// Cut the IDT limitidtr.limit = 0xFF}// Load IDTR with new limit__asm{lidt idtr}Dr. K. Levit-Gurevich,May 31, 2006, p-184MicrocomputersKonstantin Levit-Gurevich 184

Actual GP handler to be invoked upon any GP or External// Interruptexception_13: cli// Disable// interruptspush eax //mov eax, [esp+4]// Checkand eax, 7h // Errorcmp 3h, eax // Codepop eax //jne NATIVE_GP //NATIVE_GP:call get_handler //add esp, 4h// Remove Error// Code from stackjmp handler// Jump to original// int. handlerjmp original_gp_handler // Jump to// original GP// handlerDr. K. Levit-Gurevich,May 31, 2006, p-185MicrocomputersKonstantin Levit-Gurevich 185

Advanced Programmable InterruptController (I/O APIC)●Today part of the “chip-set”♦ Full software compatibility♦ Supports multiprocessor systems●Functions:♦ React to an external interrupt, according to priorities♦ Convert the interrupt request (IRQ) into a vector♦ Inform the peripheral that the interrupt was acknowledged●Vector addresses can be programmed♦ Using a set of “commands”Dr. K. Levit-Gurevich,May 31, 2006, p-186MicrocomputersKonstantin Levit-Gurevich 186

Local APICs and I/O APICCPU0CPU1CPU2CPU3Local APICLocal APICLocal APICLocal APICInterruptMessagesIPIsInterruptMessagesIPIsInterruptMessagesIPIsInterruptMessagesIPIsInterruptMessages3-wire APIC busExternalInterruptsI/O APICSystem Chip SetDr. K. Levit-Gurevich,May 31, 2006, p-187MicrocomputersKonstantin Levit-Gurevich 187

Dr. K. Levit-Gurevich,May 31, 2006, p-188Local APIC●Locally connected I/O devices.♦ Interrupts from I/O devices directly connected to CPU’s local interrupt pins.●Externally connected I/O devices.♦ Interrupts sent as I/O interrupt messages from the I/O APIC to one or more theIA-32 processors in the system.●Inter-processor interrupts (IPIs).♦ An IA-32 processor can use the IPI mechanism to interrupt another processoror group of processors on the system bus.●APIC timer generated interrupts.♦ The local APIC timer can be programmed to send a local interrupt to itsassociated processor when a programmed count is reached.●Performance monitoring counter interrupts.♦ Ability to send a interrupt to its associated processor when a performancemonitoringcounter overflows.●Thermal Sensor interrupts.♦ Ability to send an interrupt to themselves when the internal thermal sensor hasbeen tripped.●APIC internal error interrupts.♦ Ability to send an interrupt when an error is recognized within the local APICMicrocomputersKonstantin Levit-Gurevich 188

Local APIC Register Address MapAddress0xFEE000200xFEE000300xFEE000800xFEE000900xFEE000A00xFEE000B00xFEE000D00xFEE000E00xFEE000F00xFEE00100 through0xFEE00170Register NameLocal APIC ID RegisterLocal APIC Version Reg.Task Priority Register (TPR)Arbitration Priority Register (APR)Processor Priority Register (PPR)End Of Interrupt (EOI) Reg.Logical Destination RegisterDestination Format RegisterSpurious Interrupt Vector RegisterIn-Service Register (ISR)SoftwareRead/WriteRead/WriteRead OnlyRead/WriteRead OnlyRead OnlyWrite OnlyRead/WriteBits 0-27 RO;Bits 28-31 RWBits 0-8 RW;Bits 9-31 RORead OnlyDr. K. Levit-Gurevich,May 31, 2006, p-189MicrocomputersKonstantin Levit-Gurevich 189

Local APIC Register Address MapAddress0xFEE00180 through0xFEE001F00xFEE00200 through0xFEE002700xFEE002800xFEE003000xFEE003100xFEE003200xFEE003300xFEE003400xFEE003500xFEE00360Register NameTrigger Mode Register (TMR)Interrupt Request Register (IRR)Error Status RegisterInterrupt Command Register (ICR) [0-31]Interrupt Command Register (ICR) [32-64]LVT Timer RegisterLVT Thermal Sensor RegisterLVT Performance Monitor Counters Reg.LVT LINT0 RegisterLVT LINT1 RegisterSoftwareRead/WriteRead OnlyRead OnlyRead OnlyRead/WriteRead/WriteRead/WriteRead/WriteRead/WriteRead/WriteRead/WriteDr. K. Levit-Gurevich,May 31, 2006, p-190MicrocomputersKonstantin Levit-Gurevich 190

Local APIC Register Address MapAddress0xFEE003700xFEE003800xFEE003900xFEE003E0Register NameLVT Error RegisterInitial Timer Register (for Timer)Current Counter Register (for Timer)Divide Configuration Register (for Timer)SoftwareRead/WriteRead/WriteRead/WriteRead OnlyRead/WriteDr. K. Levit-Gurevich,May 31, 2006, p-191MicrocomputersKonstantin Levit-Gurevich 191

In-Service Register (ISR) andInterrupt Request Register (IRR)•Interrupt priority = vector / 16 (actually from 2 to 15)•Each priority level encompasses 16 vectors•The higher the vector number, the higher the priority within that priority level•The task priority is a software selected value between 0 and 15 written into theTask Priority Register (TPR)•The processor will service only those interrupts that have a priority higher thanspecified in TPR48 63 64 79 80 85IRRISR48 63 64 79 80 852TPR End Of Interrupt (EOI) EFLAGS.IF = 10Dr. K. Levit-Gurevich,May 31, 2006, p-192MicrocomputersKonstantin Levit-Gurevich 192

Interrupt Command Register (ICR)64 56 5532Destination FieldReserved31 20 19 18 1716 15 14 13 12 11 10 8 73 2 1 0ReservedVectorDestination Shorthand00: No Shorthand01: Self10: All Including Self11: All Excluding SelfTrigger Mode0: Edge1: LevelLevel0: De-assert1: AssertDelivery Mode000: Fixed001: Lowest Priority010: SMI100: NMI101: INIT110: Start UpDestination Mode0: Physical1: LogicalDelivery Status0: Idle1: Send PendingDr. K. Levit-Gurevich,May 31, 2006, p-193MicrocomputersKonstantin Levit-Gurevich 193

Local Vector Table (LVT)31 18 17 16 15 13 12 11 8 73 2 1 0TimerVectorTimer Mode0: One-shot1: PeriodicMask0: Not Masked1: MaskedInterrupt InputPin PolarityRemoteIRRTrigger Mode0: Edge1: LevelDelivery Status0: Idle1: Send PendingDelivery Mode000: Fixed010: SMI100: NMI111: ExtINT101: INIT31 17 11 10 8 73 2 1 0LINT0VectorLINT1VectorErrorVectorDr. K. Levit-Gurevich,May 31, 2006, p-194PerformanceMonitor CountersThermalSensor161514 13 12VectorVectorMicrocomputersKonstantin Levit-Gurevich 194

System ArchitectureSummaryDr. K. Levit-Gurevich,May 31, 2006, p-195MicrocomputersKonstantin Levit-Gurevich 195

Operation ModesCR0.PE=0Real-AddressModeorResetCR0.PE=1Reset orRSMSMI#ResetProtectedModeSMI#RSMSystemManagementModeEFLAGS.VM=0EFLAGS.VM=1Virtual-8086ModeSMI#RSM●16 bit♦ Real Mode (8086 and up)♦ Protected mode (286 and up)♦ Virtual 8086 mode (386 and up)Dr. K. Levit-Gurevich,May 31, 2006, p-196● 32 bit●●Protected mode (386 and up)Paging enabled/disabled● 64 bit modes are not shown hereMicrocomputersKonstantin Levit-Gurevich 196

Page Translation Mechanism (4KB)CR4.PAE=0Linear Address Space3122Linear AddressLinear Address2112 110Dir Table OffsetLinear AddressPage DirectoryPage TablePagePage Dir EntryPhysical AddressPage Table EntryCR3Dr. K. Levit-Gurevich,May 31, 2006, p-198MicrocomputersKonstantin Levit-Gurevich 198

Page Translation Mechanism (4KB)with Physical Address Extension CR4.PAE=1Directory PointerLinear Address3130 2921 2012 110Dir Table Offset9Page Directory9Page Table12Page2Page Dir EntryPhysical AddressPage Table EntryPage DirectoryPointer TableDir Pointer Entry4PDPTE * 512 PDE * 512 PTE = 2 20 pages32CR3 (PDBR)Dr. K. Levit-Gurevich,May 31, 2006, p-199MicrocomputersKonstantin Levit-Gurevich 199

Extended Memory64-Bit Technology(EM64T)Dr. K. Levit-Gurevich,May 31, 2006, p-202MicrocomputersKonstantin Levit-Gurevich 202

Introduction●What is EM64T (Extended Memory 64Bit Technology) ?♦ Native 64-bit support♦ Similar to the previous extension from 16-bit to 32-bit●Benefits of the architecture♦ Extended memory AddressingPrograms can easily cross the 4GB virtual memory limit.♦ Bringing 64-bit Computing into the MainstreamLow cost 64-bit machines♦ 32-bit Compatibility32-bit and 64-bit applications coexisting under a 64-bit operatingsystem.No emulation is required for 32-bit applicationsMigration can be performed graduallyDr. K. Levit-Gurevich,May 31, 2006, p-203MicrocomputersKonstantin Levit-Gurevich 203

Introduction●Benefits of the architecture (Cont’)♦ Ease of Porting Software from x86 to EM64TIt is not really new only an extension♦ 32-bit Applications Can Run More Efficiently32-bit applications can have access to the full 4GB address space.In a 64-bit operating system the kernel will be moved outside the 4GBrange allowing the application to use the entire 32-bit address space forcode and data.♦ Generate a more efficient codeAdditional RegistersIP relative addressing mode.Dr. K. Levit-Gurevich,May 31, 2006, p-204MicrocomputersKonstantin Levit-Gurevich 204

Architecture Overview●What are the main new features introduced in EM64T ?♦ Registers set changes♦ 64-bit flat linear addressing♦ New Operating Modes♦ Instruction Set Changes♦ Memory Organization.Dr. K. Levit-Gurevich,May 31, 2006, p-205MicrocomputersKonstantin Levit-Gurevich 205

General Purpose Register Set ExtensionIn x8663311570Added by EM64TRAXEAXAHALSSE127 0XMM0GPREAXx87790XMM7XMM8XMM8R8EIPXMM15R15Dr. K. Levit-Gurevich,May 31, 2006, p-206MicrocomputersKonstantin Levit-Gurevich 206

General Purpose Register Set Extension●EM64T provides a uniform set of low-byte, low-word, low-doubleword,and quadword registers♦ Well suited for register allocation by compilersIn x86Added by EM64T63RSIRDIRSPRBP31 7 0SILDILSPLBPLDr. K. Levit-Gurevich,May 31, 2006, p-207MicrocomputersKonstantin Levit-Gurevich 207

System Registers Set ExtensionIn x86Added by EM64TMSRsDescriptorsControlDebugGDTR & IDTREFER6331150Limit6331 0CR06331 0DR0Kernel GS BaseBaseCR2DR1STARCR3DR2LSTARCSTARFMASKGS BaseFS BaseLDTR &TR15 0Selector19 Attributes63 31 LimitBaseCR4CR8DR3DR4DR5DR6DR7Dr. K. Levit-Gurevich,May 31, 2006, p-208MicrocomputersKonstantin Levit-Gurevich 208

Operating ModesIn x86Added by EM64TReal ModeProtected ModeIA32e Mode64 bit Sub ModeCompatibility Sub ModeVirtual ModeSMM Mode● 64-bit and Compatibility sub modes are determined according to the CodeSegment RegisterDr. K. Levit-Gurevich,May 31, 2006, p-209MicrocomputersKonstantin Levit-Gurevich 209

64-bit Mode●Used by 64-bit applications under a 64-bit OS♦ New \ Extended Registers become available.♦ Extended \ New instructions become available.♦ Architectural support for up to 64 bits of linear address♦ Physical address support of up to 52 bits♦ Can use flat address space with a single code, data and stack space♦ A new RIP-relative data addressing mode.Dr. K. Levit-Gurevich,May 31, 2006, p-210MicrocomputersKonstantin Levit-Gurevich 210

Compatibility Mode●Permits legacy 16-bit and 32-bit applications to run, withoutrecompilation, under a 64-bit OS.●The following elements of legacy protected mode are not available whenin this mode:♦ Virtual 8086 mode.♦ Hardware task management.♦ From the OS point of view: system data structures, address translations,interrupt and exception handling all use the new 64-bit mechanisms.Dr. K. Levit-Gurevich,May 31, 2006, p-211MicrocomputersKonstantin Levit-Gurevich 211

Determining Sub Modes in IA32e●CPU mode is determined according to its CS♦ Default Operation Size bit in the Descriptor (Legacy bit)♦ Long mode bit in the Descriptor (New bit)ModeEFER.LMACS.LCS.DLegacy Legacy 16 0N/A 0Legacy 321IA32eCompatibility 16100Compatibility 320164-bit10Dr. K. Levit-Gurevich,May 31, 2006, p-212MicrocomputersKonstantin Levit-Gurevich 212

Activating IA32e mode●Start from Paging EnabledPGPECR0●Disable Paging●Enable physical address extensionPhysical Address of PML4CR3●Load CR3 with the physicaladdress of the Level 4 page maptable●Enable IA32e mode by settingEFER.LME bitLMAPAELMECR4EFER●Enable PagingDr. K. Levit-Gurevich,May 31, 2006, p-213MicrocomputersKonstantin Levit-Gurevich 213

Segmentation●In compatibility mode segmentation functions just as it does in legacy16-bit or 32-bit protected mode semantics.●In 64-bit mode segmentation is almost completely disabled, creating aflat 64-bit linear address space.♦ Segment register loads still perform all legacy checks on the values. It isrequired since it might be a segment that is going to be used by anapplication in compatibility mode♦ Code Segment still defines: Operating Mode, DPLDr. K. Levit-Gurevich,May 31, 2006, p-214MicrocomputersKonstantin Levit-Gurevich 214

SegmentationLegacy and Compatibility modes64-Bit modeCSDSESFSGSSSCSDSESFSGSSSAttributes onlyIgnoredBase OnlyIgnored●FS,GS segment bases continue to be used. This facilitates addressinglocal data and certain OS structures.●FS and GS base parts of the descriptor are mapped to MSRs (FS_BASE& GS_BASE).Dr. K. Levit-Gurevich,May 31, 2006, p-215MicrocomputersKonstantin Levit-Gurevich 215

Segmentation64 – Bit ModeCompatibility Mode64Virtual Address0150Selector31Effective Add.0PagingSegmentation51Physical Address064032 31 0Virtual Add.Paging51Physical Address0Dr. K. Levit-Gurevich,May 31, 2006, p-216MicrocomputersKonstantin Levit-Gurevich 216

Canonical Addressing●IA32e mode defines a 64-bits linear address.♦ Canonical Addressing mechanism allows implementations to support less.♦ the first implementation supports 48-bits of linear address.●Definition: “An address is in canonical form if the address bits fromthe most significant implemented bit up to bit 63 are set to either allones or all zeros”●Linear address size derives from CPUID values●Any linear-address reference which is not in canonical form generatesan exception.Dr. K. Levit-Gurevich,May 31, 2006, p-217MicrocomputersKonstantin Levit-Gurevich 217

Canonical AddressingValid RangeInvalid Range64-bitLinear SpaceValid RangeDr. K. Levit-Gurevich,May 31, 2006, p-218MicrocomputersKonstantin Levit-Gurevich 218

Canonical AddressingExample:Canonical address Size = 4863 47310? ? ? ? ? ? ? ? F F0000000000000000000000Question: What is the canonical form of this address?Answer: Bit 47 is set and therefore the address is:63 47310FFFFFFFFF F0000000000000000000000Dr. K. Levit-Gurevich,May 31, 2006, p-219MicrocomputersKonstantin Levit-Gurevich 219

Paging●EM64T expands physical address extension (PAE) paging structure topotentially support mapping a 64-bit linear address into a 52-bit physicaladdress.♦ The first implementation support translation of 48-bit linear address into 40-bit physical address.♦ Physical address size derives from CPUID values.●Prior to activating IA32e mode CR4.PAE must be set.♦ PAE extends paging structures to support 64 bits addresses.Dr. K. Levit-Gurevich,May 31, 2006, p-220MicrocomputersKonstantin Levit-Gurevich 220

Paging4Kb Page Translation63 48 47 39 38 30 29 21 20 12 110Linear AddressSign ExtendPML4E OffsetPDPE OffsetPDE OffsetPTE OffsetPage Offset9Page MapLevel 49Page DirectoryPointer9Page DirectoryEntry9Page TableEntry4-Kb PageIn PhysicalMemory1236512 PML4E * 512 PDPE * 512 PDE * 512 PTE = 2 4-Kb pagesDr. K. Levit-Gurevich,May 31, 2006, p-221MicrocomputersKonstantin Levit-Gurevich 221

Paging2Mb Page Translation63 48 47 39 38 30 29 21 200Linear AddressSign ExtendPML4E OffsetPDPE OffsetPDE OffsetPage Offset9Page MapLevel 49Page DirectoryPointer9Page DirectoryEntry2-Mb PageIn PhysicalMemory2127512 PML4E * 512 PDPE * 512 PDE = 2 2-Mb pagesDr. K. Levit-Gurevich,May 31, 2006, p-222MicrocomputersKonstantin Levit-Gurevich 222

Instruction Set Changes●REX – a New family of instruction prefixes used in 64-bit mode.●●●Specify the new GPRs and SSE registersSpecify a 64-bit operand sizeSpecify extended control registers●Not all instructions require a REX prefix●An instruction can have only one REX prefix●The legacy instruction size limit of 15 bytes still applies to instructionscontain a REX prefixDr. K. Levit-Gurevich,May 31, 2006, p-223MicrocomputersKonstantin Levit-Gurevich 223

Instruction Set Changes●This is how the REX prefix fits within the byte order of instructionsLocation is criticalLegacyPrefixesREXOpcode Prefix ModR/M SIB Displacement ImmediateGrp1-4(optional)(optional) 1-2 or 3byteopcode1 byte(if req.)1 byte(if req.)1,2 or 4Bytes(if req)1,2 or 4Bytes(if req)Dr. K. Levit-Gurevich,May 31, 2006, p-224MicrocomputersKonstantin Levit-Gurevich 224

Instruction Set ChangesField NameBit PositionDefinition-7:40100WR320 = Default Operand size1= 64 Bit Operand sizeExtension of the ModR/M reg fieldX1Extension of the SIB index fieldB0Extension of the ModR/M field,SIB base field or Opcode reg fieldDr. K. Levit-Gurevich,May 31, 2006, p-225MicrocomputersKonstantin Levit-Gurevich 225

Instruction Set Changes●Let’s get the feeling of how and why REX prefix is used.mov r8, r9// Register–Register addressing (no memory)REX PrefixOpcodemodregr/m0100 W R 0 B11R R RB B B64bitOperandsizeRR R RBB B BIn this Case the instruction bytes would be: 0x4d 0x8b 0xc1Dr. K. Levit-Gurevich,May 31, 2006, p-226MicrocomputersKonstantin Levit-Gurevich 226

Instruction Set Changesmov r8, [r9]// Register–Memory addressing (no SIB byte)REX PrefixOpcodemodregr/m0100 W R 0 B00R R RB B B64bitOperandsizeRR R RBB B BIn this Case the instruction bytes would be: 0x4d 0x8b 0x1Dr. K. Levit-Gurevich,May 31, 2006, p-227MicrocomputersKonstantin Levit-Gurevich 227

Instruction Set Changesmov r8, [r9 + r10*2]// Memory addressing with SIB byteREX PrefixOpcodemodregr/mscaleindexbase0100 W R X B00R R R100!11R R RB B B64bitOperandsizeRR R R X X X X B B B BIn this Case the instruction bytes would be: 0x4f 0x8b 0x04 0x51Dr. K. Levit-Gurevich,May 31, 2006, p-228MicrocomputersKonstantin Levit-Gurevich 228

Instruction Set Changespush r8// Register operand coded in opcode byteREX Prefix0100 W 0 0 BOpcoderegB B B64bitOperandsizeBB B BIn this Case the instruction bytes would be: 0x49 0x50Dr. K. Levit-Gurevich,May 31, 2006, p-229MicrocomputersKonstantin Levit-Gurevich 229

Instruction Set ChangesDefault Address & Operand SizeDefaultDefaultRegisterGPRModeAddressSizeOperandSizeExtensionWidth64-bit6432Yes64IA-32eCompatibility3232No32161616, 8Dr. K. Levit-Gurevich,May 31, 2006, p-230MicrocomputersKonstantin Levit-Gurevich 230

●RIP – Relative Addressing●●●●Instruction Set ChangesA new addressing form implemented in 64 bit modeUsing a signed 32-bit displacement an offset range of +\- 2 GB from the RIPis provided.In legacy IA32, addressing relative to the IP is available only with controltransfer instructionsAllows to generate position independent code.ApplicationCodemov eax, aVaraVarData• Without RIP Relative addressing the loader is required topatch the address during load of the applicationDr. K. Levit-Gurevich,May 31, 2006, p-231MicrocomputersKonstantin Levit-Gurevich 231

Instruction Set Changes●Let’s assume:♦ CPU in 64-bit mode♦ RAX = 0002_0001_8000_2201♦ RBX = 0002_0002_0123_3301●Example 1: 64-bit Add: ADD RBX, RAX♦ RBX = 0004_0003_8123_5502●Example 2: 32-bit Add: ADD EBX, EAX♦ RBX = 0000_0000_8123_5502 (32 bit result Is zero extended)●Example 3: 16-bit Add: ADD BX, AX♦ RBX = 0002_0002_0123_5502 (bits 63:16 are preserved)●Example 4: 8-bit Add: ADD BL, AL♦ RBX = 0002_0002_0123_3302 (bits 63:08 are preserved)Dr. K. Levit-Gurevich,May 31, 2006, p-232MicrocomputersKonstantin Levit-Gurevich 232

SYSCALL \ SYSRET instructions●New Instructions available only in 64-Bit mode●Provides a fast call to privilege level 0 system procedure.♦ Relevant to OS’s that use a flat memory model with no segmentation.(Windows and Linux)♦ The simplification comes from eliminating unneeded checks and memoryreferences●A similar mechanism to SYSENTER \ SYSEXIT instructionsDr. K. Levit-Gurevich,May 31, 2006, p-233MicrocomputersKonstantin Levit-Gurevich 233

SYSCALL instructionPrivilege Level3OLD RIPOLD EFLAGSRCXR110LSTAR MSRFMASK MSR & EFLAGSSTAR MSRRIPEFLAGSCSSSDr. K. Levit-Gurevich,May 31, 2006, p-234MicrocomputersKonstantin Levit-Gurevich 234

SWAPGS●New Instruction available only in 64-Bit mode●Provides a fast method for system software to load a pointer to systemdata structures.●The need for such an instruction arises when using SYSCALL toimplement system calls.♦ No kernel stack exists at the OS entry pointGS_BASEKernel GS BASEDr. K. Levit-Gurevich,May 31, 2006, p-235MicrocomputersKonstantin Levit-Gurevich 235

Interrupt Handler in Linux – Examplecommon_interrupt:cldsubq $9*8, %rsp //movq %rdi, 8*8(%rsp) //movq %rsi, 7*8(%rsp) // Savemovq %rdx, 6*8(%rsp) // Generalmovq %rcx, 5*8(%rsp) // Purposemovq %rax, 4*8(%rsp) // Registersmovq %r8, 3*8(%rsp) //movq %r9, 2*8(%rsp) //movq %r10, 1*8(%rsp) //movq %r11, (%rsp) //leaq 9*8(%rsp), %rditestl $3, 8(%rdi) // Check CPLje 1fswapgs// Swap user & kernel GS1: addl $1, %gs:pda_irqcountcall do_IRQ // Call specific handlerDr. K. Levit-Gurevich,May 31, 2006, p-236MicrocomputersKonstantin Levit-Gurevich 236

Interrupt Handler in Linux – Exampleclisubl $1, %gs:pda_irqcountleaq -9*8(%rdi),%rsptestl $3, 8(%rdi) // Check CPLje 2fswapgs// Swap kernel & user GS2: movq (%rsp), %r11 //movq 1*8(%rsp), %r10 //movq 2*8(%rsp), %r9 //movq 3*8(%rsp), %r8 // Restoremovq 4*8(%rsp), %rax // Generalmovq 5*8(%rsp), %rcx // Purposemovq 6*8(%rsp), %rdx // Registersmovq 7*8(%rsp), %rsi //movq 8*8(%rsp), %rdi //addq $9*8, %rsp //iretq// Return from interruptDr. K. Levit-Gurevich,May 31, 2006, p-237MicrocomputersKonstantin Levit-Gurevich 237

Other new Instructions●Many instructions were promoted to 64-Bits such as:♦ Arithmetic Calculations – ADD, SUB, DIV, MUL♦ Boolean Bitwise Operations – NOT, OR, XOR, AND♦ Rotation Instruction – SHL, SHR, SHLD, SHRD●SSE instructions can access XMM8-XMM15Dr. K. Levit-Gurevich,May 31, 2006, p-238MicrocomputersKonstantin Levit-Gurevich 238

Invalid Instructions●The following instructions are invalid in 64-bit mode:♦ AAA, AAD, AAM, AAS, DAA, DAS♦ BOUND – Check Array Bounds♦ JMP, CALL (far absolute)♦ INTO♦ LDS, LES♦ POP DS, POP ES, POP SS♦ POPA, POPAD, PUSHA, PUSHAD♦ PUSH CS,DS,ES,SS♦ DEC, INC (one byte opcode only) – becomes a REX prefix.Dr. K. Levit-Gurevich,May 31, 2006, p-239MicrocomputersKonstantin Levit-Gurevich 239

Deactivating IA32e mode●Start from Compatibility sub modePGPECR0●Disable Paging●Load CR3 with the physicaladdress of the legacy page tabledirectory base addressPhysical Address of Legacy PDPAECR3CR4●Disable IA32e mode by settingEFER.LME bitLMALMEEFER●Enable PagingDr. K. Levit-Gurevich,May 31, 2006, p-240MicrocomputersKonstantin Levit-Gurevich 240

System Descriptors●Loading the GDT and IDT :GDTR & IDTR1506331LimitBaseIn 64-bit mode●Loading the LDT and TR :LDTR &TR150Selector19 Attributes6331LimitBaseDr. K. Levit-Gurevich,May 31, 2006, p-241In 64-bit modeMicrocomputersKonstantin Levit-Gurevich 241

Instruction Set ChangesAddress Size PrefixesDefaultEffectiveAddressModeAddressSizeAddressSizePrefixRequired64-bit6432YesIA-32eCompatibility32 32 No1616Dr. K. Levit-Gurevich,May 31, 2006, p-242MicrocomputersKonstantin Levit-Gurevich 242

Instruction Set Changes●Displacement in 64 bit mode●Remain 8 bits or 32 bits and are sign extended to 64 bits●Direct memory-offset MOVs●extended to specify a 64-bit memory offset immediate absolute address.(called moffset)●Immediates in 64 bit mode●●●By default remain 32 bitsWhen using 64 bits operand size the processor sign-extends all immediates to64-bits prior to their use.Support for 64-bits immediate is accomplished by extending MOVreg,imm16/32●Branches●●Near branch operand size if forced to 64-bit (No need for REX)Call Gates are extended to support 64-bit offsetDr. K. Levit-Gurevich,May 31, 2006, p-243MicrocomputersKonstantin Levit-Gurevich 243

Instruction Set Changes●When performing 32-bit operations with a GPR destination in 64-Bitmode, the processor zero extends the 32-bit result into the full 64-bitdestination.●8-bit and 16-bit operations on GPR’s preserve all unwritten upper bits ofthe destination GPR.●Software should explicitly sign extend the result of 8-bit, 16-bit and 32bit operations to the full 64-bit width before using the result in 64-bitaddress calculations.●After a processor mode change the high 32-Bits of the 64-bit GPR’sbecomes undefined.Dr. K. Levit-Gurevich,May 31, 2006, p-244MicrocomputersKonstantin Levit-Gurevich 244

SYSRET instructionPrivilege Level3RCXR11STAR MSRRIPEFLAGSCSSS0Dr. K. Levit-Gurevich,May 31, 2006, p-245MicrocomputersKonstantin Levit-Gurevich 245

No Execute (NX) Page Protection●A new paging protection that prevents data pages from being used bymalicious software to execute code.●Available only in PAE mode (Legacy and IA32e mode)●Can be activated \ deactivatedPage Table Entry63NX0New BitDr. K. Levit-Gurevich,May 31, 2006, p-246MicrocomputersKonstantin Levit-Gurevich 246

Interrupt and Trap GatesReserved+12Target Offset 63-32+8Target Offset 31-16 P DPL S Type Reserved IST+4Target SelectorTarget Offset 15-00+031 150Dr. K. Levit-Gurevich,May 31, 2006, p-247MicrocomputersKonstantin Levit-Gurevich 247

Floating PointDr. K. Levit-Gurevich,May 31, 2006, p-248Konstantin Levit-GurevichMicrocomputers

Dr. K. Levit-Gurevich,May 31, 2006, p-249Floating Point● Supports single, double and extended memory format (32, 64 and 80bits), as well as other exotic formats (BCD, integers)● 8 registers arranged as a stack● A stack entry (register) is always 80 bits; conversion done duringload/store only● Each operation works on top of stack and optionally on another element(another stack entry of memory). Stack top moves!● Supports all IEEE-754 operations♦ fadd, fsub, fmul, fdiv, fsqrt, fcmp...● and many otherstrig, transcendentals, etc...● Separate unit (“co-processor”) for 8086, 286, 386. On-chip since i486♦ Exception model reflects this history● Big performance boost in i486, large in Pentium, huge in Pentium-IIKonstantin Levit-GurevichMicrocomputers

Floating Point Formats● General Format: (-1) S 2 E (b 0∆b 1b 2b 3..b p-1)♦ Where:● SpecificsS = 0 or 1E = any integer between Emin and Emax, inclusiveb i= 0 or 1p = number of bits of precisionParameter Single Double ExtendedFormat width in bits 32 64 80p (bits of precision) 24 53 64 2’s complementExponent width in bits 8 11 15 Biased formatEmax +127 +1023 +16383Emin -126 -1022 -16382Exponent bias +127 +1023 +16383● Biased Exponent means that a constant is added to the actual exponent so thatthe biased exponent is always a positive number.● b 0is implied in single/double, explicit in extendedDr. K. Levit-Gurevich,May 31, 2006, p-250Konstantin Levit-GurevichMicrocomputers

Floating Point Formats●Example of single precision number:3123220S Exponent SignificandFractionInteger or J-BitDr. K. Levit-Gurevich,May 31, 2006, p-251MicrocomputersKonstantin Levit-Gurevich 251

Normalized and De-Normalized Numbers●Normalized Numbers♦ Except for zero, the significand is always made up of an integer of 1and the following fraction1.fff…ffFor values less than 1, the leading zeros are eliminatedFor each leading zero eliminated, the exponent is decremented byoneExample0.125 10= 0.001 2= 1.0 E 2-3●De-Normalized (Tiny) Numbers♦ When the biased exponent is zero, smaller numbers can only berepresented by making the integer bit of the significand zero♦ A denormalized number is computed through a technique calledgradual underflowDr. K. Levit-Gurevich,May 31, 2006, p-252MicrocomputersKonstantin Levit-Gurevich 252

De-Normalization Process ExampleOperationSignExponentSignificand(unbiased)True Result0-1291.01011100000…00Denormalize0-1280.10101110000…00Denormalize0-1270.01010111000…00Denormalize0-1260.00101011100…00Denormal Result0-1260.00101011100..00Dr. K. Levit-Gurevich,May 31, 2006, p-253MicrocomputersKonstantin Levit-Gurevich 253

Real Number NotationNotationValueOrdinary Decimal178.125Scientific Decimal1.78125E 102Ordinary Binary10110010.001Scientific Binary1.0110010001E 2111Scientific Binary(Biased Exponent)Single-Real Format1.0110010001E 210000110SignBiased ExponentNormalized Significant010000110011001000100000000000001. (Implied)Dr. K. Levit-Gurevich,May 31, 2006, p-254Konstantin Levit-GurevichMicrocomputers

Special Numbers●Signed Zeros♦ Exponent and significant are 0♦ Sign: +0 or –0Both are equal in valueThe sign result depends on the operation being performed andthe rounding mode being used●Signed Infinities♦ + ∞ or – ∞♦ Represent the maximum positive and negative real numbers that canbe represented in floating-point format♦ Significant is 0♦ Exponent is all 1’s♦ Single precision example+ ∞ = 0 11111111 00000000000000000000000– ∞ = 1 11111111 00000000000000000000000♦ Special rules for 0*inf, etc...Dr. K. Levit-Gurevich,May 31, 2006, p-255Konstantin Levit-GurevichMicrocomputers

●NaNs – Not a Number♦ Maximum allowed biased exponent♦ Non-zero fraction♦ Sign bit is ignoredSpecial Numbers♦ Quite NaNs – QNaNsMost significant fraction bit setAllowed to propagate through most arithmetic operationswithout signaling an exception♦ Signaling NaNs – SNaNsMost significant fraction bit clearGenerally signal an invalid operation exception whenever theyappear as operands in arithmetic operations.♦ Used to propagate errors in computations when exceptions aremaskedDr. K. Levit-Gurevich,May 31, 2006, p-256Konstantin Levit-GurevichMicrocomputers

Normalized and De-Normalized Finite NumbersNaN-inf-NormalizedFinite-DenormalizedFinite-0 +0 +DenormalizedFinite+NormalizedFinite+infNaNS E F1 0 0S E F-0 +0 0 0 01 0 0.XXX-DenormalizedFinite+DenormalizedFinite0 0 0.XXX1 1...254 Any Value-NormalizedFinite+NormalizedFinite0 1...254 Any Value1 255 0 -inf +inf 0 255 0X 255 1.0XX -SNaN +SNaN X 255 1.0XXX 255 1.1XX -QNaN +QNaN X 255 1.1XXDr. K. Levit-Gurevich,May 31, 2006, p-257Konstantin Levit-GurevichMicrocomputers

Floating Point RegistersSign79786463Data Registers0FP7ExponentSignificandFP6FP5FP4FP3FP2FP1FP015ControlRegister047Last Instruction Pointer0StatusRegisterLast Data (Operand) PointerTagRegister10Opcode0Dr. K. Levit-Gurevich,May 31, 2006, p-258MicrocomputersKonstantin Levit-Gurevich 258

FPU Data Register Stack●FPU data registers are treated as a register stack●All addressing of the data registers is relative to the register on topof stack●The register number of the current top-of-stack register is stored inthe TOP field in the FPU status word●Load operations decrement TOP by one and load a value into thenew top-of-stack register (equivalent to PUSH)●Store operations store the value from the current TOP register inmemory and then increment TOP by one (equivalent to POP)●If a load operation is performed when TOP is at 0, registerwraparround occurs and the new value of TOP is set to 7Dr. K. Levit-Gurevich,May 31, 2006, p-259MicrocomputersKonstantin Levit-Gurevich 259

Example FPU Dot Product ComputationComputation: Dot Product(5.6, 3.8) X (2.4, 10.3) = (5.6 X 2.4) + (3.8 X 10.3)Code:FLD value1; (a) value1 = 5.6FMUL value2; (b) value2 = 2.4FLD value3; value3 = 3.8FMUL value4; (c) value4 = 10.3FADD ST(1); (d)R7(a) (b) (c) (d)R7R7R7R6R6R6R6R5R5R5R5R45.6ST(0)R413.44ST(0)R413.44ST(1)R413.44ST(1)R3R3R339.14ST(0)R352.58ST(0)R2R2R2R2R1R1R1R1R0R0R0R0Dr. K. Levit-Gurevich,May 31, 2006, p-260MicrocomputersKonstantin Levit-Gurevich 260

Floating Point - Control●Rounding modes●Precision control●Exception Masks15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0XRCPCInfinity ControlRounding ControlPrecision ControlPMUMOMZMDMIMException MasksPrecisionUnderflowOverflowZero DivideDenormalized OperandInvalid OperationRounding Control00 - Round to Nearest or Even01 - Round Down (Toward -∞)10 - Round Up (Toward +∞)11 - Chop (Truncate Toward 0)Precision control00 - 24 Bits (Single Precision)01 - (Reserved)10 - 53 Bits (Double Precision)11 - 64 Bits (Extended Precision)* Infinity Control has no effect on 386 and on.Used for compatibility with 8086 & 286Dr. K. Levit-Gurevich,May 31, 2006, p-261Konstantin Levit-GurevichMicrocomputers

Floating Point – Status Word●Exception Flags●Top Of Stack position● Complex exception model andhandling.Not discussed.●ES is set if any unmaskedexception bit is set;Cleared otherwise●Top values:000 = Register 0 is top of stack001 = Register 1 is top of stack....111 = Register 7 is top of stackConditionCode15 14 13 11 10 9 8 7 6 5 4 3 2 1 0BC3Error Summary StatusStack FaultException FlagsPrecisionUnderflowOverflowZero DivideTOPDenormalized OperandInvalid OperationC2FPU BusyTop of Stack PointerC1C0ESSFPEUEOEZEDEIEDr. K. Levit-Gurevich,May 31, 2006, p-262MicrocomputersKonstantin Levit-Gurevich 262

Floating Point – Tag Word●Flag each stack value as Valid, Empty, Zero, Special15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0Tag(7) Tag(6) Tag(5) Tag(4) Tag(3) Tag(2) Tag(1) Tag(0)Tag Values00 = Valid01 = Zero10 = Special: Invalid (NaN, Unsupported, Infinity, or Denormal)11 = EmptyDr. K. Levit-Gurevich,May 31, 2006, p-263Konstantin Levit-GurevichMicrocomputers

Another Example – Quadratic EquationCOMM _x1:QWORD ; S E P_a DD 040b00000r ; 5.5 0 81 (1.)6000_b DD 040800000r ; 4 0 81 (1.)0_c DD 03e800000r ; 0.25 0 7B (1.)0_i2 DD 02H_$T32 DD 040800000r ; 4 TOS; 0fld DWORD PTR _b ; b 7fchs ; -bfld DWORD PTR _b ; b 6fmul DWORD PTR _b ; b 2fld DWORD PTR $T32 ; 4 5fmul DWORD PTR _a ; 4afmul DWORD PTR _c ; 4acfsubp ST(1), ST(0) ; b 2 -4ac 6fsqrt; D =sqrt(b 2 -4ac)faddp ST(1), ST(0) ; -b+D 7fild DWORD PTR _i2 ; 2 6fmul DWORD PTR _a ; 2*afdivp ST(1), ST(0) ; (-b+D)/2a 7fstp QWORD PTR _x1 ; store 0Dr. K. Levit-Gurevich,May 31, 2006, p-264Konstantin Levit-GurevichMicrocomputers

MMXDr. K. Levit-Gurevich,May 31, 2006, p-265Konstantin Levit-GurevichMicrocomputers

MMX● An extension to the X86 Architecture● Useful mainly for multimedia applications● Technique: Single Instruction, Multiple Data (SIMD) withexisting registers● New data types - packed 64 bit consists of either♦ 8 bytes♦ 4 words♦ 2 doublewords● Instructions to operate on this data● Software model to allow easy migration with existing OSDr. K. Levit-Gurevich,May 31, 2006, p-266Konstantin Levit-GurevichMicrocomputers

MMX Registers and Data Types63MM7MM60MM5MM4MM3MM2MM1MM063Packed Byte Integers063Packed Word Integers063Packed Doubleword Integers0Dr. K. Levit-Gurevich,May 31, 2006, p-267MicrocomputersKonstantin Levit-Gurevich 267

MMX Instructions● Operations (57 instructions)♦ Arithmetic (saturating & wrap-around)Add, Subtract, Multiply, Multiply-add (for fast multiply accumulate)♦ LogicalAnd, Or, Shifts, Compares (including arithmetic)♦ Conversions♦ TransfersPack from large to smaller wordsUnpack from small to large wordsRegister to registerLoad from and store to memory● Integration into Intel Architecture♦ 8 new registers - MM0- MM7♦ Uses previously reserved opcodes♦ Uses same addressing modesDr. K. Levit-Gurevich,May 31, 2006, p-268MicrocomputersKonstantin Levit-Gurevich 268

Example of OperationsPacked AddPADDW: Add Packed worda3 a2 a1+ + + +b3 b2 b1FFFFh8000hPADDUSW: Add Unsigned Packed word with Saturatea3 a2 a1+ + + +b3 b2 b1FFFFh8000ha3+b3 a2+b2 a1+b17FFFha3+b3 a2+b2 a1+b1FFFFhMultiply-AddPMADDWD: multiply and adda3 a2 a1 a0* * * *b3 b2 b1 b0ShiftPSLLQ: Shift left logical all 64-bita3 a2 a1a00010ha3*b3+a2*b2 a1*b1+a0*b0a2a1a00000hDr. K. Levit-Gurevich,May 31, 2006, p-269Konstantin Levit-GurevichMicrocomputers

Example OperationsMultiply-AddPMADDWD: multiply and adda3 a2 a1 a0* * * *b3 b2 b1 b0Multiply-LowPMULLW: multiply lowa3 a2 a1 a0b3 b2 b1 b0a3*b3+a2*b2a1*b1+a0*b0a3*b3a2*b2a1*b1a0*b0Multiply-HighPMULLHW: multiply higha3 a2 a1 a0c3 c2 c1 c0low order 16 bitsb3 b2 b1 b0a3*b3a2*b2a1*b1a0*b0Dr. K. Levit-Gurevich,May 31, 2006, p-270c3 c2 c1 c0High order 16 bitsKonstantin Levit-GurevichMicrocomputers

FP/MMX view of registers●MMX registers MM0-MM7overlap FP registers♦ All tags are valid w/ MMX♦ TOS is 0●Pros♦ No change of context♦ Easy OS migration●Cons♦ Difficult to Intermix MMX& FP♦ Need to signal MMX to FPmove(EMMS: clear tags)MM Tag000000000000FPU Tag6313MM7TOS1163ST7ST0StatusWord00000MM0TOS=0Dr. K. Levit-Gurevich,May 31, 2006, p-271Konstantin Levit-GurevichMicrocomputers

Example: Conditional Select / Branch Removal● Packed Comparison(PCMPCC) and thelogical instructions enableconditional selectoperations in parallel andwithout data dependentbranches.PCMPEQWX1=greengreenX Y new_imageX2 != greengreen+ =X3=greengreenX4 !=greengreenif (X[i] != green) thennew_image[i] = X[i]elsenew_image[i] = Y[i]0xFFFF 0x0000 0xFFFF 0x00000xFFFF 0x0000 0xFFFF 0x0000PANDNX1 X2 X4X3PANDY1 Y2 Y4Y3MOVQPCMPEQWMOVQPANDNPANDPORMOVQMM1, XMM1, GREENMM2, MM1MM1, XMM2, YMM1, MM2New, MM1POR Y1 0x0000 Y3 0x0000finished pixels0x0000X20x0000X4Y1 X2 X4Y3Dr. K. Levit-Gurevich,May 31, 2006, p-272Konstantin Levit-GurevichMicrocomputers

Streaming SIMD Extensions●New technology to exploit parallelism in FP and INT applications●Key capabilities♦ Packed Operations♦ Branch Removal/Compression♦ Data Movement/Hints♦ FP/INT Type Conversion●Application Domains…♦ 3D Graphics: geometry, lighting♦ signal processing♦ high precision simulation & modeling♦ video encoding/decoding♦ other apps requiring streaming input and outputDr. K. Levit-Gurevich,May 31, 2006, p-273MicrocomputersKonstantin Levit-Gurevich 273

SIMD FP instructions types●arithmetic●square root●approximation instructions●min & max●loads & stores●move mask●compare & set mask●logical●compare & set eflags●conversionInstruction CategoriesSIMD FP - 4 wide, single-precision, 2 wide double precisionSIMD INT - extensions to MMX technology capabilities,Extending all the MMX instruction to 128 bit using the new XMMregistersCacheability controlState managementDr. K. Levit-Gurevich,May 31, 2006, p-274MicrocomputersKonstantin Levit-Gurevich 274

New SIMD FP Registers•New set of registers!•Direct access•Used for data only•Extended processorstatexmm0xmm1xmm2xmm3xmm4xmm5xmm6xmm7eight 128-bit, 4x32bit single-precision FP2x64bit single-precision FP, 128 bit MMXDr. K. Levit-Gurevich,May 31, 2006, p-275MicrocomputersKonstantin Levit-Gurevich 275

Packed SP Data Type●Each register holds 4 single-precision FP values or 2 double precision●IEEE-754 compatible●Scalar operates on least-significant number12732 31 0xmm01-bit sign8-bit exponent23-bit mantissa1-bit sign10-bit exponent53-bit mantissaDr. K. Levit-Gurevich,May 31, 2006, p-27631302322MicrocomputersKonstantin Levit-Gurevich 2760

Packed Operationsxmm0a3a2a1a0op is one of•addps addpd•subps subpd•mulps mulpd•divps divpdxmm1b3b2b1b0xmm0a3 op b3a2 op b2a1 op b1a0 op b0Dr. K. Levit-Gurevich,May 31, 2006, p-277MicrocomputersKonstantin Levit-Gurevich 277

Scalar Operationsxmm0a3a2a1a0op is one of•Addss addsd•subss subsd•nulss mulsd•divss divsdxmm1b3b2b1b0xmm0a3a2a1a0 op b0Dr. K. Levit-Gurevich,May 31, 2006, p-278MicrocomputersKonstantin Levit-Gurevich 278

SIMD Data Organization●Exploit vertical parallelism●SOA versus AOS♦ Array of Structures♦ Structure of ArraysX0, Y0, Z0, X1, Y1, Z1, X2, Y2, Z2, …Better cacheabilityBetter SIMD calculationsX0, X1, X2, X3, … Y0, Y1, Y2, Y3, … Z0, Z1, Z2, Z3, ...Dr. K. Levit-Gurevich,May 31, 2006, p-279MicrocomputersKonstantin Levit-Gurevich 279

Matrix Vector Multiply example● Typical 3D operation● Load values in SOA format♦ xxxx…, yyyy…, zzzz…● Follow with multiply and add operationsmovaps xmm0, [list+X+ecx] ;load X componentsmovaps xmm2, [list+Y+ecx] ;load Y componentsmovaps xmm3, [list+Z+ecx] ;load Z componentsmovaps xmm1, [esi+m00] ;m00 m00 m00 m00movaps xmm4, [esi+m01] ;m01 m01 m01 m01Loop back and pick up next 4 vertices…mulps xmm1, xmm0 ;x*m00 x*m00 x*m00 x*m00mulps xmm4, xmm2 ;y*m01 y*m01 y*m01 y*m01addps xmm4, xmm1 ;add the 2 resultsmovaps xmm1, [esi+m02] ;load matrix element m02 (x4)mulps xmm1, xmm3 ;z*m02 z*m02 z*m02 z*m02addps xmm4, xmm1 ;add resultsaddps xmm4, [esi+m03] ;add last element of matrixDr. K. Levit-Gurevich,May 31, 2006, p-280MicrocomputersKonstantin Levit-Gurevich 280

SIMD Integer Instructions●Extensions to MMX TM technology instructions●Operate on same 64-bit registers as previous MMX technologyinstructions, Or on the new 128 bit XMM registers●Instructions:♦ extract, insert, min/max, byte mask → integer, multiply high unsigned,shuffle multiply unsigned double, add and sub qword, averageDr. K. Levit-Gurevich,May 31, 2006, p-281MicrocomputersKonstantin Levit-Gurevich 281

Paging

Paging ... View more Paging

Delete template?

Save as template ?

Paging Paging