ECE 391 Exam 1, Spring 2016 Solutions
Solution: The command on line 14 accesses the blink structure which was passed as an
argument to the system call. Therefore, it is dereferencing a user-level pointer, which is not
safe to do in kernel code. Instead, the user structure should first be copied to kernel memory
using copy_from_user.
Points: /2 1 of 17 NetID:
Question 1 continues. . . ECE 391, Exam 1
(b) (3 points) Given the following function prototype, write x86 assembly code to call this
function. Assume BL=arg1, DX=arg2, ESI=arg3. Don’t worry about saving any registers.
void some_func(char arg1, short arg2, int arg3);
(c) (2 points) Given that EAX=1000, EDI=10, what memory address is calculated for the memory
operand 100(%EAX, %EDI, 4)?
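The arithmetic behind part (c) can be sketched in C. An AT&T operand of the form disp(base, index, scale) is evaluated as disp + base + index * scale. The function name effective_address is hypothetical, and the register values are read as decimal, which is an assumption since the question gives no radix prefix.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of an AT&T operand disp(base, index, scale):
 * disp + base + index * scale. x86 only allows scale 1, 2, 4, or 8.
 * (Hypothetical helper, not part of the exam.) */
uint32_t effective_address(int32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return (uint32_t)disp + base + index * scale;
}
```

With the values from the question, effective_address(100, 1000, 10, 4) evaluates 100 + 1000 + 10 * 4, which is 1140.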
(d) (2 points) Given that the row number is 20 and the column number is 15, what is the memory
offset from the start of video memory in your mp1?
Solution: This is off the screen because rows are 0–24. (If rows and columns were
numbered starting with 1, this would still be invalid because column 0 would not exist.)
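The offset calculation and the bounds check that the solution relies on can be sketched as follows. This assumes the usual text-mode layout of 80 columns by 25 rows with 2 bytes per character cell, and zero-based rows and columns; the name video_offset and the -1 return value for off-screen positions are hypothetical.

```c
#include <assert.h>

#define SCREEN_COLS    80   /* assumed text-mode width */
#define SCREEN_ROWS    25   /* rows are 0-24, as the solution states */
#define BYTES_PER_CELL 2    /* assumed: character byte plus attribute byte */

/* Byte offset of (row, col) from the start of video memory,
 * or -1 if the position is off the screen. (Hypothetical helper.) */
int video_offset(int row, int col)
{
    if (row < 0 || row >= SCREEN_ROWS || col < 0 || col >= SCREEN_COLS)
        return -1;
    return (row * SCREEN_COLS + col) * BYTES_PER_CELL;
}
```

Any row of 25 or more, or any column of 80 or more, is rejected rather than mapped to an address past the end of the visible screen.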
2. Assembly (28 points)
In MP1, you were required to write a tasklet function that traverses the linked list and updates
each node accordingly. Now a different tasklet function needs to be implemented, and it must
accomplish two things: (1) switch the on_char and off_char of each mp1_blink_struct; (2) calculate
the sum of all countdown fields (don't ask why we want this). Fortunately, a recursive helper
function ‘traverse’ is already implemented for you in C; its return value should contain the sum of
all countdown fields in the given linked list starting at ‘node’. Your task is to implement the
‘do_operation’ function in x86 assembly, and to translate the ‘traverse’ function to x86 assembly.
/* a helper function that will be called in the new tasklet,
 * e.g. traverse(mp1_list_head); */
int traverse(struct mp1_blink_struct *node) {
    if (node != NULL) {
        return do_operation(node) + traverse(node->next);
    }
    return 0;
}
/* function signature for do_operation */
int do_operation(struct mp1_blink_struct *node);
# Useful offset constants for accessing members of a
# struct mp1_blink_struct structure
LOCATION = 0
ON_CHAR = 2
OFF_CHAR = 3
ON_LENGTH = 4
OFF_LENGTH = 6
COUNTDOWN = 8
STATUS = 10
NEXT = 12
STRUCT_SIZE = 16
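To see where these offsets come from, here is a sketch of a C structure whose layout would produce them. The field types are an assumption inferred from the offsets and from the 16-bit zero-extending load of countdown in the solution; they are not quoted from the exam. The compile-time checks cover only the non-pointer fields, because NEXT = 12 holds only where pointers are 4 bytes, as on 32-bit x86.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed field types; only the names and offsets come from the exam. */
struct mp1_blink_struct {
    unsigned short location;        /* LOCATION   = 0  */
    char on_char;                   /* ON_CHAR    = 2  */
    char off_char;                  /* OFF_CHAR   = 3  */
    unsigned short on_length;       /* ON_LENGTH  = 4  */
    unsigned short off_length;      /* OFF_LENGTH = 6  */
    unsigned short countdown;       /* COUNTDOWN  = 8  */
    unsigned short status;          /* STATUS     = 10 */
    struct mp1_blink_struct *next;  /* NEXT = 12 with 4-byte pointers only */
};

/* Compile-time checks that the layout matches the offset constants. */
static_assert(offsetof(struct mp1_blink_struct, location) == 0, "LOCATION");
static_assert(offsetof(struct mp1_blink_struct, on_char) == 2, "ON_CHAR");
static_assert(offsetof(struct mp1_blink_struct, off_char) == 3, "OFF_CHAR");
static_assert(offsetof(struct mp1_blink_struct, on_length) == 4, "ON_LENGTH");
static_assert(offsetof(struct mp1_blink_struct, off_length) == 6, "OFF_LENGTH");
static_assert(offsetof(struct mp1_blink_struct, countdown) == 8, "COUNTDOWN");
static_assert(offsetof(struct mp1_blink_struct, status) == 10, "STATUS");
```

On a 64-bit build the next pointer is typically aligned to 8 bytes and lands at a different offset, which is why it is left unchecked here.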
(a) (8 points) Implement the ‘do_operation’ function in x86 assembly. It should (1) switch the
on_char and off_char of the provided node and (2) return the countdown of the provided node as an
int. For simplicity, you DO NOT need to check whether node is NULL. The equivalent C code is
provided below. Your code should not exceed 15 lines, excluding comments and labels.
int do_operation(struct mp1_blink_struct *node) {
    char tmp;
    tmp = node->on_char;
    node->on_char = node->off_char;
    node->off_char = tmp;
    return node->countdown;
}
do_operation:
Solution:
    # set up stack frame
    pushl %ebp
    movl %esp, %ebp
    # get node argument
    movl 8(%ebp), %ecx
    # save ON_CHAR, OFF_CHAR
    movb ON_CHAR(%ecx), %dl
    movb OFF_CHAR(%ecx), %dh
    # replace them
    movb %dh, ON_CHAR(%ecx)
    movb %dl, OFF_CHAR(%ecx)
    # return countdown, zero-extend to 32 bits
    movzwl COUNTDOWN(%ecx), %eax
    # exit function
    leave
    ret
(b) (12 points) Translate the ‘traverse’ function to x86 assembly. Part of the assembly code is given so
you just need to fill in the blanks. And remember, the return value should contain the sum of all
countdown fields in the given linked list starting at ‘node’.
 1 traverse:
 2     pushl %ebp
 3     movl %esp, %ebp
 4
 5     # save callee-saved register
 6     pushl %ebx
 7
 8     # local variable
 9     subl $4, %esp
10
11     # check node for NULL
12     movl 8(%ebp), %ebx
13     cmpl $0, %ebx
14     je NULL_PTR
15
16     # perform the operation
17     pushl %ebx
18     call do_operation
19     addl $4, %esp
20     movl %eax, -8(%ebp)
21
22     # recursively call traverse
23     movl NEXT(%ebx), %eax
24     pushl %eax
25     call traverse
26     addl $4, %esp
27
28     # calculate sum of countdown
29     addl -8(%ebp), %eax
30     jmp RETURN
31
32 NULL_PTR:
33     xorl %eax, %eax
34
35 RETURN:
36     addl $4, %esp
37     popl %ebx
38     leave
39     ret
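A small C model can help check what the two assembly routines are supposed to compute. The sketch below reuses the C bodies given in the exam; the structure's field types are an assumption inferred from the offset constants, and the explicit local variable mirrors the slot at -8(%ebp) in the assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed field types; names follow the exam's offset constants. */
struct mp1_blink_struct {
    unsigned short location;
    char on_char;
    char off_char;
    unsigned short on_length;
    unsigned short off_length;
    unsigned short countdown;
    unsigned short status;
    struct mp1_blink_struct *next;
};

/* Swap on_char and off_char, then return countdown widened to int. */
int do_operation(struct mp1_blink_struct *node)
{
    assert(node != NULL);   /* the exam lets callers guarantee this */
    char tmp = node->on_char;
    node->on_char = node->off_char;
    node->off_char = tmp;
    return node->countdown;
}

/* Sum of countdown over the list starting at node; 0 for an empty list. */
int traverse(struct mp1_blink_struct *node)
{
    if (node == NULL)
        return 0;
    int local = do_operation(node);       /* kept across the recursive call */
    return local + traverse(node->next);
}
```

For a two-node list with countdown values 5 and 7, traverse returns 12 and leaves both nodes with their on_char and off_char exchanged.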
(c) (8 points) Ben Bitdiddle has found an old page on the Internet, written by someone with the
username psimon41, called 50 Ways to Leave Your Function. He’s getting a bit smarter and realizes
that you can’t trust everything you read on the Internet; can you tell him which versions are going
to be a correct replacement to lines 36–39 in the previous part? For each part, either write “correct”
or explain why the replacement is wrong.
i. # Just switch out the frame, Wayne
    leal -4(%ebp), %esp
    popl %ebx
    leave
    ret
Solution: Correct
ii. # Clean up the stack, Jack
    popl %ebx
    popl %ebx
    leave
    ret
Solution: Correct
iii. # If there's some space, Chase,
    addl $4, %esp
    popl %ebx
    popl %ebp
    ret
Solution: Correct
iv.
Solution: Incorrect. The local variable is saved into %ebx, saved %ebx is saved into %ecx, and
saved %ebp is used as a return value.
v. # Return from the call, Saul
    movl 4(%esp), %ebx
    movl 8(%esp), %ebp
    addl $8, %esp
    ret
Solution: Incorrect. %esp needs to be adjusted by 12, not 8, to clear off saved %ebp, %ebx, and
local variable.
vi. # Don't forget to restore regs!
    movl 4(%ebp), %ebx
    movl %ebp, %esp
    popl %ebp
    ret
Solution: Incorrect. 4(%ebp) is the return address; to restore %ebx you need to say -4(%ebp).
vii. # Put the return address, Jess
    movl 4(%esp), %ebx
    leave
    ret
Solution: Correct
viii. # into EIP!
    addl $4, %esp
    popl %ebx
    leave
    popl %ecx
    jmp *%ecx
Solution: Correct
Solution: This will work fine most of the time; however, after the first instruction, the value of
%ebx is saved on the part of the stack (numerically) below %esp. If this code runs in kernel mode
and an interrupt occurs, that part of the stack will be overwritten by the saved processor state.
In user mode, this is not an issue because the interrupt will use the kernel stack, but the
delivery of a signal (a user-level analog of an interrupt, discussed later in the course) will have
a similar bad result. In general, you cannot assume that any part of the stack (numerically) below
%esp is preserved for you.
3. PIC (21 points)
For this question, you should assume that the master and slave PIC are configured as shown on the
reference sheet at the back of the exam.
(a) (4 points) Consider the following state of the registers on the master and slave PICs. List which
interrupt request lines would, if activated, cause an interrupt to be signaled to the CPU. (E.g., slave
IR2, master IR7)
• Master Interrupt Mask Register (IMR): bits 1 and 5 set
• Master In-service Register (ISR): bit 4 set
• Slave IMR: bits 2 and 3 set
• Slave ISR: bit 5 set
• Master and slave Interrupt Request Register (IRR): all bits clear
Solution: IR0 and IR3 on the master (IR1 is masked, IR4 is in service, and IR5–IR7 are lower
priority).
IR0, IR1, and IR4 on the slave (IR2 and IR3 are masked, IR5 is in service, and IR6,IR7 are
lower priority).
Note: The question did not intend to ask about IR2 on the master since it’s hooked up to the
slave, rather than a device, but technically if IR2 on the master is raised, it will generate an
interrupt, so we will accept master IR2.
A lot of people thought that any interrupt being in service on the master would preempt IR3–
IR7 on the master. The mechanism that makes this work is the in-service bit on IR2 on the
master, which is set after a slave interrupt (see next part), but is not the case here.
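The reasoning in part (a) can be sketched as a small function. It assumes fully nested priority mode, where IR0 is the highest priority and a line can signal only if it is unmasked and strictly higher in priority than every line currently in service. The name deliverable_lines is hypothetical, and each PIC is evaluated on its own; the cascade through master IR2 is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask of IR lines that would be signaled if raised, given the
 * interrupt mask register (imr) and in-service register (isr).
 * Assumes fully nested mode with IR0 as the highest priority.
 * (Hypothetical helper, not part of the exam.) */
uint8_t deliverable_lines(uint8_t imr, uint8_t isr)
{
    uint8_t result = 0;
    for (int line = 0; line < 8; line++) {
        uint8_t bit = (uint8_t)(1u << line);
        if (isr & bit)
            break;          /* this line and all lower priorities are blocked */
        if (!(imr & bit))
            result |= bit;  /* unmasked and above everything in service */
    }
    return result;
}
```

For the master state in the question (IMR bits 1 and 5, ISR bit 4) this gives bits 0, 2, and 3, which matches IR0 and IR3 plus the accepted IR2. For the slave state (IMR bits 2 and 3, ISR bit 5) it gives bits 0, 1, and 4.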
(b) (12 points) Assume that all registers on the master and slave are set to 0. Walk through the steps
that happen after an interrupt is signaled on IR4 on the slave, using the potential steps on the
next page, up to (and including) the execution of the mask_and_ack function. (We’ve reproduced
a simplified version of the function below for your reference.)
#define MASTER_CMD 0x20
#define MASTER_IMR 0x21
#define SLAVE_CMD 0xA0
#define SLAVE_IMR 0xA1
Solution:
Note: steps 7–11 are essentially simultaneous and so their order was not graded. The order of
5 was also fairly flexible.
(c) (3 points) The version of mask_and_ack above is a little too simplified. Explain what is missing
and describe the impact it will have on interrupt processing.
Solution: The function does not send an EOI to the master, which means that the in-service
bit for IR2 on the master remains set. This will prevent any interrupts from the slave, as well
as IR3–IR7 on the master, from ever being delivered to the CPU.
Note: some people mentioned the lack of spinlocks. We will give credit for this answer only if you
provide a correct explanation of what the impact of the missing spinlock is: two simultaneous
mask_and_ack invocations could both read the IMR and then write the updated mask back to
IMR, resulting in one of the bits not being masked.
(d) (2 points) Suppose that the device on IR4 slave line signals another interrupt. When would the
interrupt have to arrive in your sequence in the part above for it to be missed? (I.e., your answer
should be in the form of “before step 6.”)
Solution: Before step 11, when the slave resets bit 4 of IRR to 0. If the second interrupt
occurs after this step, the IRR bit will get set back to 1 and the next interrupt will get signaled
after the corresponding ISR and IMR bits get cleared.
y = baz(x)
z = bar(y, x)
w = foo(z, y, x)
To test his creation, Ben wrote a series of small subroutines that are invoked by the magic subroutine.
The code is given on this page and the next. The following C declarations correspond to the subroutines
that are given in assembly.
Implementations of all of the functions are given below and on the next page.
    pushl %eax
    call *12(%ebp, %ebx, 4)
return_to_apply:
    decl %ebx
    jmp more
done:
    leave
    ret
(a) (8 points) Suppose that a breakpoint is set on the first instruction in print_stack. Now, a test
program calls magic(16). Complete the tables below by writing the values on the stack and the
value of EBP at the time the breakpoint is reached. Use labels instead of numbers where possible.
The label return_to_magic points to the instruction that immediately follows the call instruction
inside magic.
Hint: You do not need to understand every function to complete the table.
0xffffd968 0xffffd9c8
(b) (2 points) Ben’s program crashes when magic is called from some places in the program but not
others. What mistake did Ben make in apply?
Solution:
EBX is used by apply but not saved and restored.
Partial credit was only given if the correct answer was followed by an incorrect statement.
A lot of people thought that the stack could overflow but that cannot occur when magic is
called because it has a fixed list of arguments.
(c) (2 points (bonus)) What is the return value of magic(16)? Write the value of EAX each
time the processor reaches return_to_apply for full credit. Express your answer in hexadecimal.
Hint: Continue the work you did in part (a).
Solution:
0x8
0x7
0xc
0xc
0xc0c011
0x380
0xece391
5. Synchronization (23 points)
(a) (10 points) Below are some example system calls that run in kernel space, K1–K5. Some of these
system calls also use a helper function F1. Finally, I1 is an interrupt handler. These different parts
all access some shared variables, G1–G5. You must add spinlocks to the code below
to guarantee safety. A large number of locks have been declared and initialized for you. They
are accessible by global pointers L1, L2, L3, etc. You should assume that the code below is
complete. For example, F1 will not be called in other places, so you should optimize with respect
to the code shown below. You can also assume that the system calls are running with interrupts
enabled.
In addition to ensuring the safety of the code, you are also required to ensure that the maximum
amount of parallelism can be achieved with your locking system. While doing this, you must use the
minimum number of locks possible while still achieving safety and maximum parallelism. Finally,
you should select the appropriate primitive (spin_lock or spin_lock_irqsave) for each use. (Note
that in class we said that it’s always safe to use spin_lock_irqsave, but here we want you to pick
the primitive that does the minimum work necessary.)
/* shared variables */
int G1, G2, G3, G4, G5;
/* spinlocks; assume they're initialized properly */
spinlock_t L1, L2, L3, L4, L5, L6, L7, L8;
/* some macros to save space */
#define SL(lock) spin_lock(&lock)
#define SU(lock) spin_unlock(&lock)
#define SLI(lock, flags) spin_lock_irqsave(&lock, flags)
#define SUI(lock, flags) spin_unlock_irqrestore(&lock, flags)
void K1() {
    SL(L1);
    G1++;
    G4++;
    SU(L1);
}
void K2() {
    unsigned long flags;
    SLI(L2, flags);
    G2++;
    F1();
    SUI(L2, flags);
}
void K3() {
    SL(L1);
    SL(L3);
    G1++;
    G3++;
    G4++;
    SU(L3);
    SU(L1);
}
void K5() {
    SL(L3);
    G3++;
    SU(L3);
}
void F1() {
    G5++;
}
void I1() {
    SL(L2);
    G2++;
    G5++;
    SU(L2);
}
(b) (5 points) Now consider that there are 5 threads T1–T5 (unrelated to part (a)) that use the fol-
lowing locking order.
T1        T2        T3        T4        T5
on (L4)   on (L1)   on (L4)   on (L6)   on (L5)
on (L3)   on (L2)   on (L3)   on (L5)   on (L6)
on (L2)   on (L3)   on (L2)   on (L4)   off(L6)
on (L1)   on (L4)   on (L1)   on (L3)   off(L5)
off(L1)   off(L4)   off(L4)   off(L3)   -------
off(L2)   off(L3)   off(L3)   off(L4)   -------
off(L3)   off(L2)   off(L2)   off(L5)   -------
off(L4)   off(L1)   off(L1)   off(L6)   -------
Below mark what thread PAIRS can exhibit deadlock by filling in the chart with a “D”. Leave
safe combinations blank.
      T1   T2   T3   T4   T5
T1    XX   D
T2    XX   XX   D    D
T3    XX   XX   XX
T4    XX   XX   XX   XX   D
T5    XX   XX   XX   XX   XX
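The chart can be cross-checked mechanically. Two threads can deadlock when they take some pair of shared locks in opposite orders. The sketch below applies that test to the acquisition orders from the table. It assumes each thread holds every lock it has taken until all of its locks are acquired, which is true of all five threads here. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOCKS 4

/* Lock numbers in the order a thread acquires them. */
struct lock_order {
    int locks[MAX_LOCKS];
    size_t count;
};

/* Acquisition orders of T1..T5, copied from the table above. */
static const struct lock_order exam_threads[5] = {
    { {4, 3, 2, 1}, 4 },   /* T1 */
    { {1, 2, 3, 4}, 4 },   /* T2 */
    { {4, 3, 2, 1}, 4 },   /* T3 */
    { {6, 5, 4, 3}, 4 },   /* T4 */
    { {5, 6},       2 },   /* T5 */
};

/* Index of lock in t's acquisition order, or -1 if t never takes it. */
static int position_of(const struct lock_order *t, int lock)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->locks[i] == lock)
            return (int)i;
    return -1;
}

/* True if a and b take some pair of shared locks in opposite orders. */
bool can_deadlock(const struct lock_order *a, const struct lock_order *b)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < a->count; i++) {
        for (size_t j = i + 1; j < a->count; j++) {
            int first_in_b = position_of(b, a->locks[i]);
            int second_in_b = position_of(b, a->locks[j]);
            if (first_in_b >= 0 && second_in_b >= 0 &&
                second_in_b < first_in_b)
                return true;
        }
    }
    return false;
}
```

Running the check over all ten pairs flags T1/T2, T2/T3, T2/T4, and T4/T5, the same four cells marked "D" in the chart.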
(c) (3 points) Recall that spin_lock_irqsave calls CLI first and then locks the lock. Now consider
the function spin_unlock_irqrestore: does it matter in which order it restores flags and unlocks
the lock? Briefly justify your answer.
Solution: It must release the lock first before restoring flags. Otherwise it will be holding the
lock with the interrupts enabled; if an interrupt occurs that tries to acquire the same lock, a
deadlock will occur.
(d) (3 points) Recall that the lock prefix makes any x86 instruction atomic. Is the code below a working
implementation of a spinlock? Explain why or why not.
1 spin_lock:
2     lock movl 4(%esp), %eax
3 loop:
4     lock movl (%eax), %edx
5     lock test %edx, %edx
6     lock jnz loop
7     lock movl $1, (%eax)
8     lock ret
Solution: The lock prefix ensures that each individual instruction is atomic. However, the
read in instruction 4 and the write in instruction 7 are separate instructions. One thread could
execute instruction 4, reading the lock value as 0, after another thread has executed the same
instruction but before that other thread writes 1 to the lock in instruction 7, which
would lead both threads to assume they have the lock.
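For contrast, a working spinlock needs the test and the set to happen as one indivisible step. The sketch below uses the C11 atomic_flag test-and-set operation for that. It is a user-space illustration with hypothetical names (toy_spinlock_t and friends), not the kernel's spin_lock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy spinlock: the flag is set while the lock is held. */
typedef struct {
    atomic_flag held;
} toy_spinlock_t;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One atomic read-modify-write: set the flag and learn its old value.
 * Returns true only for the caller that saw it clear. */
bool toy_spin_trylock(toy_spinlock_t *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until this caller is the one that flips the flag from clear to set. */
void toy_spin_lock(toy_spinlock_t *l)
{
    while (!toy_spin_trylock(l)) {
        /* busy-wait */
    }
}

/* Release the lock so another caller's test-and-set can succeed. */
void toy_spin_unlock(toy_spinlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Because the old value is returned by the same operation that sets the flag, two callers cannot both observe the lock as free, which is exactly the window left open between instructions 4 and 7 in the listing.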
(e) (2 points) In general, why don’t we use the lock prefix in front of every single instruction?
Solution: The prefix locks the memory bus, which prevents other memory instructions from
executing at the same time; this significantly slows down execution.