Model-Specific Registers: A Look at Intel's Scheme For Introducing New CPU Features

Test Registers
• The 80386 implemented two registers for
testing its Translation Look-aside Buffer
(i.e., the special cache used for speeding
up virtual-to-physical address-conversions)
• The registers were named TR6 and TR7
• Intel warned that these system registers
were unique to the 80386 CPU’s design
and might not be present in future chips
Then three more
• The TR6 and TR7 registers were kept in the
80486 design -- along with three extra Test
Registers (TR3, TR4, TR5) that allowed testing
of the processor’s caches for code and data
• Again Intel warned that these registers were
unique to the 80486 CPU’s design and that they
might not be implemented in subsequent chips
• Sure enough, in the 80586 (‘Pentium’) they were
gone – so software written to use them would no
longer execute on the newer Pentium CPUs
The ‘Model-Specific’ concept
• Beginning with the Pentium processor,
Intel has been including ‘experimental’
features in its processors, warning that
they may disappear from future designs,
but providing a standard and permanent
way for all such features to be accessed
• This access is via a pair of ‘privileged’
instructions (rdmsr and wrmsr) that can
only be executed by ‘ring0’ code
Quite a few MSRs now!
• At first there were only about a dozen of
these MSRs (Model-Specific Registers),
but lately their number is well over 200
• Some MSRs have evidently proven useful
enough that they are now deemed permanent
fixtures of the IA-32 architecture (Intel calls
these ‘architectural’ MSRs and names them
with an IA32_ prefix)
The Time-Stamp Counter
• This 64-bit Model-Specific Register was
introduced in the Pentium processor and
has been present in each CPU thereafter
• It increments once every CPU clock-cycle,
starting from 0 when power is turned on
• It won’t overflow for at least ten years
• Unprivileged programs (ring3) normally
can access it via the rdtsc instruction
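The overflow claim can be checked with a little arithmetic. The sketch below is only an illustration: the function name and the clock rates passed to it are assumptions, not values from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Whole years before a 64-bit counter that starts at 0 wraps around,
   for a given clock rate in Hz (a parameter, not a measured value). */
uint64_t tsc_years_until_wrap(uint64_t clock_hz)
{
    const uint64_t seconds_per_year = 365ull * 24 * 60 * 60;  /* 31,536,000 */

    /* Guard against division by zero and against overflow in the product. */
    assert(clock_hz > 0 && clock_hz <= UINT64_MAX / seconds_per_year);
    return UINT64_MAX / (clock_hz * seconds_per_year);
}
```

With a hypothetical 3 GHz clock this gives 194 years, and even at 10 GHz it gives 58, so "at least ten years" is a very safe bound.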
Using the TSC
The 64-bit TSC value is returned in the EDX:EAX register-pair:

  bits 63..32 in EDX        bits 31..0 in EAX

time0: .quad 0 # saves starting value from the TSC


time1: .quad 0 # saves concluding value from TSC

# how you can measure CPU clock-cycles in a code-fragment


rdtsc # read the Time-Stamp Counter
movl %eax, time0+0 # save least-significant longword
movl %edx, time0+4 # save most-significant longword
# <Your code-fragment to be measured goes here>
rdtsc # read the Time-Stamp Counter
movl %eax, time1+0 # save least-significant longword
movl %edx, time1+4 # save most-significant longword
# now subtract starting-value ‘time0’ from ending value ‘time1’
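The final subtraction has to treat each saved pair of longwords as one 64-bit number, so that a borrow from the low half carries into the high half. A minimal sketch in C, with hypothetical helper names, might look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Join the two 32-bit halves that rdtsc leaves in EDX (high) and EAX (low). */
uint64_t tsc_join(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Cycles elapsed between two readings; assumes time1 was read after time0. */
uint64_t tsc_elapsed(uint64_t time0, uint64_t time1)
{
    assert(time1 >= time0);
    return time1 - time0;
}
```

For example, readings of 0x00000000:0xFFFFFFF0 and 0x00000001:0x00000010 are 0x20 cycles apart, even though the low longword alone appears to have gone backwards.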
The TSC as an MSR
• Each Model-Specific Register has its own
identifying register-number, and can be
accessed (from ring0) using the special pair of
instructions: rdmsr and wrmsr
• The Time-Stamp Counter is MSR number 0x10
• To write a new 64-bit value into the TSC, you
load the desired 64-bit value into the EDX:EAX
register-pair, you put the MSR ID-number 0x10
into register ECX, then you execute wrmsr
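The register setup for wrmsr can be modeled without executing the privileged instruction. In this sketch the struct and function names are hypothetical; only the MSR number 0x10 and the ECX, EDX:EAX roles come from the text.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_TSC 0x10u   /* MSR number of the Time-Stamp Counter */

/* Register contents that wrmsr expects (this struct is illustrative only). */
struct wrmsr_operands {
    uint32_t ecx;   /* MSR ID-number */
    uint32_t edx;   /* most-significant 32 bits of the value */
    uint32_t eax;   /* least-significant 32 bits of the value */
};

struct wrmsr_operands wrmsr_prepare(uint32_t msr, uint64_t value)
{
    struct wrmsr_operands ops;

    ops.ecx = msr;
    ops.edx = (uint32_t)(value >> 32);
    ops.eax = (uint32_t)(value & 0xFFFFFFFFu);

    /* The two halves must rejoin to the original 64-bit value. */
    assert((((uint64_t)ops.edx << 32) | ops.eax) == value);
    return ops;
}
```

Calling wrmsr_prepare(MSR_TSC, 0) would describe the register contents for resetting the counter to zero.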
IA32_APIC_BASE
• This register has MSR number 0x1B and
it’s private to each CPU in an SMP system
• It establishes the base-address for the
Local-APIC’s memory-mapped registers
(the default base-address is 0xFEE00000,
but that can be changed using this MSR)
• The CPU’s Local-APIC functions can be
either enabled or disabled (via bit #11)
• The BSP can be recognized (via bit #8)
Relocating the APIC registers
IA32_APIC_BASE (64-bits)

  bits 63..32   reserved
  bits 31..12   APIC base-address (4K page-number)
                Default-value for APIC base-address page = 0xFEE00
  bit  11       EN: Local-APIC Enable bit (1=enabled, 0=disabled)
  bit  8        BSP: Boot-Strap Processor (read-only): 1=yes, 0=no


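# make the processor’s Local-APIC registers accessible in real-mode
mov $0x1B, %ecx # MSR register-number
rdmsr # read the current value into EDX:EAX
and $0x00000FFF, %eax # keep EN (bit 11) and BSP (bit 8) unchanged
or $0x000D8000, %eax # new base-address below 1MB (page 0xD8)
mov $0x00000000, %edx # most-significant 32-bits
wrmsr # write to specified MSR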
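The bit-fields of IA32_APIC_BASE can be exercised in ordinary C to see what a given MSR value means. This is a sketch under the layout given above; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_BSP   (1ull << 8)      /* Boot-Strap Processor flag */
#define APIC_BASE_EN    (1ull << 11)     /* Local-APIC Enable bit */
#define APIC_BASE_ADDR  0xFFFFF000ull    /* bits 31..12: base-address */

uint64_t apic_base_address(uint64_t msr) { return msr & APIC_BASE_ADDR; }
bool apic_is_enabled(uint64_t msr) { return (msr & APIC_BASE_EN) != 0; }
bool apic_is_bsp(uint64_t msr) { return (msr & APIC_BASE_BSP) != 0; }

/* New MSR value with a different base-address and all other bits unchanged. */
uint64_t apic_relocate(uint64_t msr, uint64_t new_base)
{
    assert((new_base & ~APIC_BASE_ADDR) == 0);   /* 4K-aligned, below 4GB */
    return (msr & ~APIC_BASE_ADDR) | new_base;
}
```

Starting from a typical boot-processor value of 0xFEE00900, relocating to 0xD8000 yields 0x000D8900. A value with bit 11 clear, such as 0x000D8000 written as-is, would leave the Local-APIC disabled.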
Extended Feature Enable Register
• The EFER was introduced in conformity
with Advanced Micro Devices’ (AMD’s)
way of implementing 64-bit architecture
• Its MSR register-number is 0xC0000080
IA32_EFER (64-bits)

  bits 63..12   reserved
  bit  11       eXecute-Disable bit in paging structures (1=enabled, 0=disabled)
  bit  10       IA32e-mode is active (1=yes, 0=no)
  bit  9        reserved
  bit  8        Enable IA32e-mode (1=yes, 0=no)
  bits 7..1     reserved
  bit  0        Enable SYSCALL/SYSRET instructions in 64-bit mode (1=yes, 0=no)
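As a rough illustration, the four defined EFER bits can be given names and tested in C. The mnemonics SCE, LME, LMA and NXE follow Intel's manuals; the helper functions are hypothetical, and treating every other bit as reserved is an assumption taken from the layout above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_EFER  0xC0000080u    /* register-number given in the text */
#define EFER_SCE  (1ull << 0)    /* enable SYSCALL/SYSRET */
#define EFER_LME  (1ull << 8)    /* enable IA32e-mode */
#define EFER_LMA  (1ull << 10)   /* IA32e-mode is active */
#define EFER_NXE  (1ull << 11)   /* execute-disable bit in paging structures */

/* Only bits 11, 10, 8 and 0 are defined in the layout shown above. */
#define EFER_DEFINED (EFER_SCE | EFER_LME | EFER_LMA | EFER_NXE)

bool efer_has_reserved_bits(uint64_t efer)
{
    return (efer & ~EFER_DEFINED) != 0;
}

/* Value to write back in order to request IA32e-mode. */
uint64_t efer_request_long_mode(uint64_t efer)
{
    assert(!efer_has_reserved_bits(efer));
    return efer | EFER_LME;
}
```

A real program would obtain the current value with rdmsr, pass it through a function like efer_request_long_mode, and write the result back with wrmsr.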
Demo: ‘try64bit.s’
• We created a demo-program that shows
what steps are needed to enable the new
64-bit capabilities of recent Pentium-D or
Core 2 Duo processors (using EFER)
• This demo cannot be executed on our
current CS Lab/Classroom workstations,
but it CAN execute on a remote-access
department server named ‘anchor00’
New 4-Level page-tables needed
• For executing in 64-bit mode, the PAE-bit
(Physical-Address Extension) must be
enabled (bit #5 in Control Register CR4)
and 4-levels of page-table structures must
be prepared which implement an “identity
mapping” for the transition-code itself
• Then 64-bit mode is entered by turning on
the PG-bit in Control Register 0 (assuming
bit #8 in the EFER register was set to 1)
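The ordering of these steps can be modeled in a few lines of C. This is a toy model, not CPU-accurate emulation: the struct and function names are hypothetical, and PAE is taken to be bit 5 of CR4 (counting from bit 0), as in Intel's manuals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG    (1u << 31)     /* paging enable */
#define CR4_PAE   (1u << 5)      /* Physical-Address Extension */
#define EFER_LME  (1ull << 8)    /* IA32e-mode enable */
#define EFER_LMA  (1ull << 10)   /* IA32e-mode active (set by the CPU) */

/* Hypothetical snapshot of the three registers involved. */
struct cpu_state {
    uint32_t cr0;
    uint32_t cr4;
    uint64_t efer;
};

/* Model of turning on CR0.PG: IA32e-mode becomes active only if LME
   was set beforehand, and PAE must already be enabled in that case. */
void enable_paging(struct cpu_state *cpu)
{
    if (cpu->efer & EFER_LME) {
        assert(cpu->cr4 & CR4_PAE);
        cpu->efer |= EFER_LMA;
    }
    cpu->cr0 |= CR0_PG;
}

bool in_ia32e_mode(const struct cpu_state *cpu)
{
    return (cpu->efer & EFER_LMA) != 0;
}
```

With LME set, enabling paging activates IA32e-mode; with LME clear, the same write to CR0 only enables ordinary paging.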
4-Levels of mapping
64-bit ‘canonical’ virtual address:

  bits 63..48   sign-extension
  bits 47..39   PML4 index
  bits 38..30   PDPT index
  bits 29..21   PDIR index
  bits 20..12   PTBL index
  bits 11..0    offset

Lookup path:

  CR3 -> Page Map Level-4 Table -> Page Directory Pointer Table
      -> Page Directory -> Page Table -> Page Frame (4KB)

Each mapping-table contains up to 512 quadword-size entries

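Splitting a virtual address into its four table indexes and page offset is plain shifting and masking. The following C sketch uses hypothetical names and assumes 4KB pages, as in the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct va_parts {
    unsigned pml4, pdpt, pdir, ptbl;  /* 9-bit table indexes, 0..511 */
    unsigned offset;                  /* 12-bit offset within the 4KB page */
};

/* 'Canonical' means bits 63..48 are copies of bit 47. */
bool va_is_canonical(uint64_t va)
{
    uint64_t top = va >> 47;          /* bits 63..47, seventeen bits */
    return top == 0 || top == 0x1FFFF;
}

struct va_parts va_split(uint64_t va)
{
    struct va_parts p;

    assert(va_is_canonical(va));
    p.pml4   = (unsigned)((va >> 39) & 0x1FF);
    p.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    p.pdir   = (unsigned)((va >> 21) & 0x1FF);
    p.ptbl   = (unsigned)((va >> 12) & 0x1FF);
    p.offset = (unsigned)(va & 0xFFF);
    return p;
}
```

Each 9-bit index selects one of the 512 entries in its table, which is why every table fits in exactly one 4KB page (512 entries of 8 bytes each).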


Page-Table entry format
  bit  63       EXB
  bits 62..52   available
  bits 51..40   reserved (must be 0)
  bits 39..32   Base Address [39..32]
  bits 31..12   Base Address [31..12]
  bits 11..9    avail
  bit  8        G
  bit  7        PAT
  bit  6        D
  bit  5        A
  bit  4        PCD
  bit  3        PWT
  bit  2        S/U
  bit  1        R/W
  bit  0        P

Legend:
  P = present (0=no, 1=yes)
  R/W (0=read-only, 1=writable)
  S/U (0=supervisor-only, 1=user)
  PWT = Page Write-Through (0=no, 1=yes)
  PCD = Page Caching Disable (0=no, 1=yes)
  A = accessed (0=no, 1=yes)
  D = dirty (0=no, 1=yes)
  PAT = Page-Attribute Table-Index
  G = Global page (1=yes, 0=no)
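Building and inspecting an entry in this format comes down to a few masks. The sketch below follows the field positions shown above (a 40-bit physical base-address); all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P    (1ull << 0)             /* present */
#define PTE_RW   (1ull << 1)             /* writable */
#define PTE_US   (1ull << 2)             /* user-accessible */
#define PTE_EXB  (1ull << 63)            /* execute-disable */
#define PTE_BASE 0x000000FFFFFFF000ull   /* Base Address [39..12] */
#define PTE_MBZ  0x000FFF0000000000ull   /* bits 51..40, must be 0 */

/* Build an entry for a 4KB page frame from a physical address and flag bits. */
uint64_t pte_make(uint64_t frame_addr, uint64_t flags)
{
    assert((frame_addr & ~PTE_BASE) == 0);        /* 4K-aligned, below 2^40 */
    assert((flags & (PTE_BASE | PTE_MBZ)) == 0);  /* flags stay out of those fields */
    return frame_addr | flags;
}

uint64_t pte_frame(uint64_t pte) { return pte & PTE_BASE; }
bool pte_present(uint64_t pte) { return (pte & PTE_P) != 0; }

bool pte_user_writable(uint64_t pte)
{
    const uint64_t need = PTE_P | PTE_RW | PTE_US;
    return (pte & need) == need;
}
```

An identity mapping for a page simply uses the page's own address as the frame address, for example pte_make(0xB8000, PTE_P | PTE_RW | PTE_US), which gives 0xB8007.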
Segment descriptors
• Gate-descriptors and other system descriptors
(LDT, TSS) have an enlarged 16-byte format in
64-bit mode to accommodate the larger-sized
addresses; code and data segment-descriptors
keep their 8-byte format
• Segment-Limit and Base are disregarded
for selectors in registers CS, DS, ES, SS
  bits 127..64 | bits 63..0

  Flag fields shown: G, D, L, AVL, P, DPL, S, TYPE

  Formerly ‘reserved’ bit is now the ‘L’ bit
  (it indicates a ‘long’ segment-descriptor)
A few GDT descriptors…

.align 16 # octaword-alignment (for optimal access)
theGDT: .octa 0x00000000000000000000000000000000 # null
.equ sel_cs64, (. - theGDT)+0 # code64 selector (ring0)
.octa 0x000000000000000000209A0000000000 # code
.equ sel_cs32, (. - theGDT)+0 # code32 selector (ring0)
.octa 0x000000000000000000409A010000FFFF # code
.equ sel_vram, (. - theGDT)+0 # data16 selector (ring3)
.octa 0x00000000000000000080F20B80000007 # data
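To see what the hexadecimal descriptor values above encode, the low quadword of each entry can be decoded field by field. This C sketch uses the standard descriptor bit positions; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a descriptor's low quadword (bits 63..0 of each .octa above). */
struct seg_fields {
    uint32_t base;     /* bits 16..39 and 56..63 */
    uint32_t limit;    /* bits 0..15 and 48..51 */
    unsigned type;     /* bits 40..43 */
    unsigned s;        /* bit 44: 1 = code or data segment */
    unsigned dpl;      /* bits 45..46: privilege level */
    unsigned present;  /* bit 47 */
    unsigned l;        /* bit 53: 'long' (64-bit code) */
    unsigned d;        /* bit 54: default operand size */
    unsigned g;        /* bit 55: limit counted in 4KB units */
};

struct seg_fields seg_decode(uint64_t q)
{
    struct seg_fields f;

    f.limit   = (uint32_t)((q & 0xFFFF) | ((q >> 32) & 0xF0000));
    f.base    = (uint32_t)(((q >> 16) & 0xFFFFFF) | ((q >> 32) & 0xFF000000));
    f.type    = (unsigned)((q >> 40) & 0xF);
    f.s       = (unsigned)((q >> 44) & 1);
    f.dpl     = (unsigned)((q >> 45) & 3);
    f.present = (unsigned)((q >> 47) & 1);
    f.l       = (unsigned)((q >> 53) & 1);
    f.d       = (unsigned)((q >> 54) & 1);
    f.g       = (unsigned)((q >> 55) & 1);

    /* For code segments, L=1 together with D=1 is a reserved combination. */
    assert(!(f.l && f.d));
    return f;
}
```

Decoding the three non-null entries shows a ring0 code segment with L=1, a ring0 32-bit code segment based at 0x10000, and a ring3 data segment based at 0xB8000 with a limit of 7 in 4KB units.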
You must update ‘binutils’
• You cannot assemble and link programs
that are written for the IA32e 64-bit mode
unless you install the newest versions of
the GNU assembler ‘as’ and the linker ‘ld’
• You can download these utilities from the
website for the Free Software Foundation
at: http://www.fsf.org/
• Directions for installing are easy-to-follow
