LKK
Goals
Hardware-portable
Used to support MIPS, PowerPC, and Alpha
Currently supports x86, ia64, and amd64
Multiple vendors build hardware
Software-portable
POSIX, OS/2, and Win32 subsystems
OS/2 is dead
POSIX is still supported, as a separate product
Lots of Win32 software out there in the world
Goals
High performance
Anticipated PC speeds approaching minicomputers and mainframes
Async IO model is standard
Support for large physical memories
SMP was an early design goal
Designed to support multi-threaded processes
Kernel has to be reentrant
Process Model
Threads and processes are distinct
Process:
Address space
Handle table (handles are roughly analogous to file descriptors)
Process default security token
Thread:
Execution context
Optional thread-specific security token
Tokens
Who you are: a list of identities
Each identity is a SID
Object Manager
Uniform interface to kernel mode objects
Handles are 32-bit opaque integers
Per-process handle table maps handles to objects and permissions on the objects
Implements refcount GC
Pointer count: total number of references
Handle count: number of open handles
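The two counts can be modeled in a few lines of portable C. This is a hypothetical sketch, not the kernel's implementation: `object_t` and the `obj_*` functions are invented names. It assumes that every open handle also holds one pointer reference, and that the object goes away when the pointer count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an object header; not the real kernel layout. */
typedef struct {
    long pointer_count;  /* total references, including those held by handles */
    long handle_count;   /* open handles only */
    bool destroyed;      /* set once the last reference goes away */
} object_t;

static void obj_init(object_t *o) {
    o->pointer_count = 1;  /* the creator's reference */
    o->handle_count = 0;
    o->destroyed = false;
}

/* Take a plain pointer reference (for example, from kernel code). */
static void obj_ref(object_t *o) {
    assert(!o->destroyed);
    o->pointer_count++;
}

/* Drop a pointer reference; the last one destroys the object. */
static void obj_deref(object_t *o) {
    assert(o->pointer_count > 0);
    if (--o->pointer_count == 0)
        o->destroyed = true;  /* a real implementation would free it here */
}

/* Opening a handle bumps both counts (assumption stated above). */
static void obj_open_handle(object_t *o) {
    obj_ref(o);
    o->handle_count++;
}

/* Closing a handle drops both counts. */
static void obj_close_handle(object_t *o) {
    assert(o->handle_count > 0);
    o->handle_count--;
    obj_deref(o);
}
```

In this model an object with no open handles can still be alive as long as some code holds a pointer reference to it.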
Object Manager
Implements an object namespace
Win32 objects are under \BaseNamedObjects
Devices are under \Device
This includes filesystems
WaitForSingleObject(), WaitForMultipleObjects()
Wait for something to happen
Can wait on up to 64 handles at once
Security Descriptors
Each object has a Security Descriptor
Owner: special SID, CREATOR_OWNER
Group: special SID, CREATOR_GROUP
DACL
Discretionary Access Control List
List of SIDs and granted or denied access rights
SACL
System Access Control List
List of SIDs and access rights to be audited
Access Rights
typedef struct _ACCESS_MASK {
    USHORT SpecificRights;
    UCHAR StandardRights;
    UCHAR AccessSystemAcl : 1;
    UCHAR Reserved : 3;
    UCHAR GenericAll : 1;
    UCHAR GenericExecute : 1;
    UCHAR GenericWrite : 1;
    UCHAR GenericRead : 1;
} ACCESS_MASK;
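The same layout can be read as a plain 32-bit value with shifts and masks. The sketch below is illustrative only: the `AM_*` names and helper functions are invented here, with bit positions following the structure above (specific rights in bits 0-15, standard rights in bits 16-23, generic rights in the top four bits).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the ACCESS_MASK layout above. */
#define AM_SPECIFIC_MASK   0x0000FFFFu  /* bits 0-15  */
#define AM_STANDARD_MASK   0x00FF0000u  /* bits 16-23 */
#define AM_ACCESS_SACL     0x01000000u  /* bit 24 */
#define AM_GENERIC_ALL     0x10000000u  /* bit 28 */
#define AM_GENERIC_EXECUTE 0x20000000u  /* bit 29 */
#define AM_GENERIC_WRITE   0x40000000u  /* bit 30 */
#define AM_GENERIC_READ    0x80000000u  /* bit 31 */

/* Object-type-specific rights (low 16 bits). */
static uint16_t am_specific(uint32_t mask) {
    return (uint16_t)(mask & AM_SPECIFIC_MASK);
}

/* Standard rights, shifted down into a single byte. */
static uint8_t am_standard(uint32_t mask) {
    return (uint8_t)((mask & AM_STANDARD_MASK) >> 16);
}

/* True when every bit in `bits` is present in `mask`. */
static int am_has(uint32_t mask, uint32_t bits) {
    return (mask & bits) == bits;
}
```

For example, a mask of `0x00100002` splits into standard rights `0x10` and specific rights `0x0002`.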
Security Use
Objects are referred to via handles
Security checks occur when an object is opened
Open requests contain a mask of requested access rights
If granted to the token by the DACL, the handle contains those access rights
Object Open
evt = OpenEvent(EVENT_MODIFY_STATE, FALSE, "SomeName");
Finds the event object by name
Walks the DACL, looking for token SIDs
Keeps looking until all requested permissions are granted
If access is granted, inserts a handle to the object into the process's handle table, with EVENT_MODIFY_STATE access
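The DACL walk can be sketched in portable C. This is a simplified model under stated assumptions, not the system's access check: SIDs are reduced to small integers, `ace_t` and `access_check` are hypothetical names, entries are evaluated in list order, and a matching deny entry that covers a still-needed right ends the walk with a failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACE_ALLOW, ACE_DENY } ace_type_t;

/* Hypothetical access control entry: who, allow or deny, which rights. */
typedef struct {
    ace_type_t type;
    int sid;        /* assumption: a SID reduced to a small integer */
    uint32_t mask;  /* rights granted or denied by this entry */
} ace_t;

static bool token_has_sid(const int *sids, size_t n, int sid) {
    for (size_t i = 0; i < n; i++)
        if (sids[i] == sid) return true;
    return false;
}

/* Walk the DACL in order until every requested right has been granted. */
static bool access_check(const ace_t *dacl, size_t n_aces,
                         const int *token_sids, size_t n_sids,
                         uint32_t desired) {
    uint32_t remaining = desired;  /* rights not yet granted */
    for (size_t i = 0; i < n_aces && remaining != 0; i++) {
        if (!token_has_sid(token_sids, n_sids, dacl[i].sid))
            continue;  /* entry is for someone else */
        if (dacl[i].type == ACE_DENY) {
            if (dacl[i].mask & remaining)
                return false;  /* a still-needed right is denied */
        } else {
            remaining &= ~dacl[i].mask;  /* these rights are now granted */
        }
    }
    return remaining == 0;
}
```

A token gets the handle only if every bit in the requested mask was cleared from `remaining` by some allow entry before a relevant deny entry was reached.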
Object Use
SetEvent(evt);
SetEvent() requires EVENT_MODIFY_STATE access, and an event object.
The kernel looks up the handle in the process's handle table.
It checks that the handle maps to an event object, and that the granted access bits contain the EVENT_MODIFY_STATE bit.
If all is good, the event is set.
Object Use
WaitForSingleObject(evt)
WaitForSingleObject() requires a synchronization object (like an event) and SYNCHRONIZE access.
evt maps to an event object.
SYNCHRONIZE access was not requested when the handle was inserted.
Even if the DACL permits it, the wait fails.
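The two Object Use cases can be reproduced with a toy handle table. Everything here is a hypothetical model: the type names, status codes, and functions are invented for illustration, and the wait function only performs the access check rather than blocking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RIGHT_EVENT_MODIFY_STATE 0x0002u      /* object-specific right */
#define RIGHT_SYNCHRONIZE        0x00100000u  /* standard right */
#define TABLE_SIZE 8

typedef enum { OBJ_EVENT, OBJ_MUTEX } obj_type_t;
typedef enum { ST_OK, ST_INVALID_HANDLE, ST_TYPE_MISMATCH, ST_ACCESS_DENIED } status_t;

typedef struct { obj_type_t type; int signaled; } kobject_t;
typedef struct { kobject_t *object; uint32_t granted; } handle_entry_t;
typedef struct { handle_entry_t slots[TABLE_SIZE]; } handle_table_t;

/* Record the rights granted at open time; returns the handle or -1. */
static int insert_handle(handle_table_t *t, kobject_t *obj, uint32_t granted) {
    for (int h = 0; h < TABLE_SIZE; h++) {
        if (t->slots[h].object == NULL) {
            t->slots[h].object = obj;
            t->slots[h].granted = granted;
            return h;
        }
    }
    return -1;
}

/* Check that the handle is valid and carries every required right. */
static status_t check_handle(const handle_table_t *t, int h, uint32_t required) {
    if (h < 0 || h >= TABLE_SIZE || t->slots[h].object == NULL)
        return ST_INVALID_HANDLE;
    if ((t->slots[h].granted & required) != required)
        return ST_ACCESS_DENIED;
    return ST_OK;
}

/* Model of SetEvent: needs an event object and EVENT_MODIFY_STATE. */
static status_t set_event(handle_table_t *t, int h) {
    status_t st = check_handle(t, h, RIGHT_EVENT_MODIFY_STATE);
    if (st != ST_OK) return st;
    if (t->slots[h].object->type != OBJ_EVENT) return ST_TYPE_MISMATCH;
    t->slots[h].object->signaled = 1;
    return ST_OK;
}

/* Model of the check a wait performs: only the handle's rights matter. */
static status_t check_wait_access(const handle_table_t *t, int h) {
    return check_handle(t, h, RIGHT_SYNCHRONIZE);
}
```

A handle inserted with only `RIGHT_EVENT_MODIFY_STATE` passes `set_event` but fails `check_wait_access`, because the check consults the rights stored in the handle and never revisits the DACL.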
Types of Objects
Events
State is set or clear
Can clear when a wait completes (auto-reset)
Mutexes
Can be acquired by a single thread at a time
Automatically released when the owner exits
Semaphores
Maintain a count
Waits decrement the count
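The state changes that a satisfied wait causes can be shown with a single-threaded model. This sketch is hypothetical throughout (`event_t`, `semaphore_t`, and the functions are invented names), and the `try_wait` functions return immediately instead of blocking, so they only show what happens to the object's state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event: signaled flag plus reset behavior. */
typedef struct { bool signaled; bool auto_reset; } event_t;

/* Hypothetical semaphore: current count and upper limit. */
typedef struct { long count; long max; } semaphore_t;

/* A satisfied wait on an auto-reset event clears it again. */
static bool event_try_wait(event_t *e) {
    if (!e->signaled) return false;  /* a real wait would block here */
    if (e->auto_reset) e->signaled = false;
    return true;
}

/* A satisfied wait on a semaphore decrements its count. */
static bool sem_try_wait(semaphore_t *s) {
    if (s->count == 0) return false;  /* a real wait would block here */
    s->count--;
    return true;
}

/* Releasing adds to the count, but never beyond the limit. */
static bool sem_release(semaphore_t *s, long n) {
    if (n <= 0 || s->max - s->count < n) return false;
    s->count += n;
    return true;
}
```

With an auto-reset event, one set satisfies exactly one wait; with a manual-reset event (`auto_reset` false), the event stays set until it is explicitly cleared.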
More objects
Threads, Processes, Timers: like events
Registry Keys
Manipulate data in the registry: a centralized store of system configuration info
LPC Ports
Fast local RPC
Security tokens can transfer over LPC calls
Files
Files & IO
File objects maintain a current offset, and a pointer to the underlying stream
Default internal model is asynchronous
Synchronous IO just waits for the IO to complete
Async IO can set an event, run a callback in the thread which queued the IO, or post a message to an IO completion port
IRPs
Maintain state of IO requests, independent of the thread working on the IO
IRPs are handed off through the device stack to their destinations
Threads process IRPs
Initiating thread processes the IRP until a device returns STATUS_PENDING
Subsequent processing can be done in kernel worker threads
Interrupts
IRQL: Interrupt Request Level
0 => PASSIVE_LEVEL
Processor is running threads
All user mode code is at IRQL 0
Interrupts
3-26 => Device Interrupt Service Routines
Device interrupts are mapped to an IRQL and an interrupt service routine; ISR is called at that IRQL
Interrupts
Hardware signals an interrupt
The interrupt's ISR runs at device IRQL
Has to be fast; get off the processor and allow other ISRs to run
Typically queues a DPC (Deferred Procedure Call), acknowledges the interrupt, and returns
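The split between a short ISR and deferred work can be modeled with a small queue. This is a single-threaded sketch with invented names (`dpc_queue_t`, `device_isr`, `device_dpc`), not driver code: the "ISR" only acknowledges the device and queues work, and the queued routine runs later when the queue is drained.

```c
#include <assert.h>
#include <stddef.h>

#define DPC_QUEUE_MAX 16

typedef void (*dpc_fn)(void *ctx);
typedef struct { dpc_fn fn; void *ctx; } dpc_t;

/* Hypothetical fixed-size ring of deferred calls. */
typedef struct {
    dpc_t items[DPC_QUEUE_MAX];
    size_t head, tail;
} dpc_queue_t;

/* Queue a deferred call; returns 0 when the ring is full. */
static int dpc_enqueue(dpc_queue_t *q, dpc_fn fn, void *ctx) {
    if (q->tail - q->head == DPC_QUEUE_MAX) return 0;
    q->items[q->tail % DPC_QUEUE_MAX].fn = fn;
    q->items[q->tail % DPC_QUEUE_MAX].ctx = ctx;
    q->tail++;
    return 1;
}

/* Run every queued call; stands in for work done after the ISR returns. */
static size_t dpc_drain(dpc_queue_t *q) {
    size_t ran = 0;
    while (q->head != q->tail) {
        dpc_t d = q->items[q->head % DPC_QUEUE_MAX];
        q->head++;
        d.fn(d.ctx);
        ran++;
    }
    return ran;
}

/* Hypothetical device state. */
typedef struct { int irq_acked; int bytes_processed; } device_t;

/* The slow part: runs later, outside the ISR. */
static void device_dpc(void *ctx) {
    ((device_t *)ctx)->bytes_processed += 512;
}

/* The fast part: acknowledge, queue the DPC, return. */
static void device_isr(dpc_queue_t *q, device_t *dev) {
    dev->irq_acked = 1;
    dpc_enqueue(q, device_dpc, dev);
}
```

After `device_isr` returns, the interrupt is acknowledged but no data has been processed yet; that happens only when `dpc_drain` runs.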
IO Completion
Driver calls IO Manager to complete the IRP
IO Manager queues a kernel mode APC to the initiating thread
APC: Asynchronous Procedure Call
Kernel mode APC preempts thread execution
Writes data back to user mode in the context of the thread which initiated the IO
Signals completion of the IO
IO Cache
Classic: block cache
Page mappings translate directly to blocks on the underlying partition.
Virtual Memory
Sections: another object type
Can be created to map a file
Can also be created off the pagefile
Optionally named, for shared memory
Reservation
Range of VA which will not be handed out for some other purpose
Committed
VA which actually maps to something
Aside: CreateProcess
Just a user mode Win32 API
{
    NtCreateFile(&file, szImage);
    NtCreateSection(&sec, file);
    NtCreateProcess(&proc, sec);
    NtCreateThread(&thrd, proc);
}
WaitForSingleObject(proc);
Virtual Memory
Memory Manager maintains processor-specific page table entry mappings.
Some parts of the address space are shared between processes: for instance, the kernel's address space and the per-session space.
On a page fault, mm reads in the data
Pages can be mapped without the appropriate access: what to do?
Signals
With threads, signals don't work very well.
Some software designs expect to touch inaccessible memory:
Large structured files
Concurrent garbage collection
SLists
Single global handler has to somehow know about all possible situations.
try/finally
res = AllocateSomeResource();
try {
    SomeOperation(res);
} finally {
    if (AbnormalTermination()) {
        FreeSomeResource(res);
    }
}
return res;
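For comparison, the same ownership rule (free the resource only when the operation fails, otherwise hand it to the caller) can be written in standard C with explicit error returns and a cleanup label. This is a different technique from try/finally, shown only to make the control flow explicit; `some_operation` and `allocate_and_init` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical operation: reports failure through its return value. */
static int some_operation(int *res, int should_fail) {
    if (should_fail) return -1;
    *res = 42;
    return 0;
}

/* Returns an initialized resource, or NULL with nothing leaked. */
static int *allocate_and_init(int should_fail) {
    int *res = malloc(sizeof *res);
    if (res == NULL) return NULL;

    if (some_operation(res, should_fail) != 0)
        goto fail;

    return res;  /* normal exit: the caller now owns res */

fail:
    free(res);   /* abnormal exit: release before reporting the failure */
    return NULL;
}
```

The try/finally version covers the same two exits, with the added property that the cleanup also runs when the operation raises an exception instead of returning an error code.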
try/except
try {
    SomeOperationWhichMayAV();
} except (Filter(GetExceptionCode(), GetExceptionInformation())) {
    DoSomethingElse();
}
try/except
GetExceptionCode()
A code indicating the cause of the exception
GetExceptionInformation()
Additional code-specific info
The full processor context
Faultrep.dll
faultrep.dll offers to report the failure back to Microsoft
We analyze the failures
A significant number are recognized instantly; we can tell the user what happened and how to fix it.
The others go through the standard triage process; developers analyze the dumps and figure out what happened.
OCA
67 million machines running XP
Tens of thousands of drivers
Over 100 drivers on any given machine
One bug in one driver => crash
A significant number of crashes come from third-party drivers (some of which ship on the CD)
Lots of different problems, though
Driver Verifier
Controlled by verifier.exe
Special pool allocations
Detects allocation overruns & use after free
Stress
Every night, a couple hundred machines run stress on the latest build
Stress exercises filesystems, memory, GUI, scheduler, etc., trying to uncover low-memory handling problems and race conditions
Every morning, the stress test team triages failed machines
Developers debug the failures
Questions?