
MPHS EXAM 13/06/16 – Prof. Luca Benini

Name: Nmat:

EXERCISE I

1. Describe the functionality of the digital circuit below


2. Write the verilog code that implements the circuit below

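[Figure: block diagram with a 4-bit input A, an input register (FFs) enabled by LOAD, a comparator (>) against the constant 4'b1000 producing GREATER, an adder (+), a multiplexer, and an output register (FFs) enabled by UPDATE that drives the 4-bit output OUT. Both registers are clocked by CLK and reset by RST_N.]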

Solution

The circuit is composed of an input stage made of 4 flip-flops, a combinational processing block, and an output stage
(4 FFs). On the rising edge of the clock, the input data A is sampled if LOAD==1. The registered data A_reg at the
output of the first stage is then processed by the combinational logic. The comparator checks whether A_reg > 8; if
this check is true, it sets the multiplexer to perform the accumulation (A_reg + OUT) in the output stage. If the
check is false, A_reg is copied into the output FFs. Note that if UPDATE is 1'b0, the output-stage FFs are not
updated. Both stages are cleared by the asynchronous active-low reset RST_N.

The Verilog code (written in SystemVerilog syntax) is listed below


module Mixed_logic
(
    input  logic       clk,
    input  logic       rst_n,
    input  logic [3:0] A,
    input  logic       LOAD,
    input  logic       UPDATE,
    output logic [3:0] OUT
);

    logic [3:0] A_reg;
    logic       GREATER;

    // Input stage: sample A when LOAD is asserted
    always_ff @(posedge clk, negedge rst_n)
    begin
        if(rst_n == 1'b0)
            A_reg <= 4'b0000;
        else
            if(LOAD)
                A_reg <= A;
    end

assign GREATER = (A_reg > 4'b1000); // Comparator

    // Output stage: updated only when UPDATE is asserted
    always_ff @(posedge clk, negedge rst_n)
    begin
        if(rst_n == 1'b0)
            OUT <= 4'b0000;
        else
            if(UPDATE)
            begin
                case(GREATER) // this is the MUX logic
                    1'b1:
                    begin // accumulation (A_reg > 8)
                        OUT <= A_reg + OUT;
                    end

                    1'b0:
                    begin // simple sampling
                        OUT <= A_reg;
                    end
                endcase
            end
    end

endmodule
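
The behaviour described in the solution text can be cross-checked with a small cycle-level model in C. This is only a sketch under stated assumptions: it follows the textual description (comparator true selects the accumulation), the 4-bit addition is assumed to wrap around, and the names mixed_logic_t, mixed_logic_reset and mixed_logic_clock are hypothetical helpers, not part of the exam.

```c
#include <assert.h>
#include <stdint.h>

/* Register state of the circuit: both stages are 4 bits wide. */
typedef struct {
    uint8_t a_reg; /* input stage  (A_reg) */
    uint8_t out;   /* output stage (OUT)   */
} mixed_logic_t;

/* Active-low reset: both register stages are cleared. */
void mixed_logic_reset(mixed_logic_t *s)
{
    s->a_reg = 0;
    s->out = 0;
}

/* One rising clock edge. The next-state values are computed from the
 * current state before anything is written, which mimics the
 * simultaneous update of non-blocking assignments. */
void mixed_logic_clock(mixed_logic_t *s, uint8_t a, int load, int update)
{
    assert(a <= 0xF);                      /* A is a 4-bit input */

    int greater = s->a_reg > 8;            /* comparator against 4'b1000 */
    uint8_t next_a = load ? a : s->a_reg;  /* input stage enable */
    uint8_t next_out = s->out;             /* hold when UPDATE is 0 */

    if (update) {
        if (greater)
            next_out = (uint8_t)((s->a_reg + s->out) & 0xF); /* accumulate, 4-bit wrap */
        else
            next_out = s->a_reg;                             /* plain sampling */
    }

    s->a_reg = next_a;
    s->out = next_out;
}
```

Calling mixed_logic_clock once per clock cycle with the current values of A, LOAD and UPDATE reproduces the one-cycle delay between the input stage and the output stage.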
EXERCISE II

A microcontroller has to acquire the electrical activity of the human heart over a period of time using an analog
sensor placed on the skin. The selected microcontroller has the following possible operative modes:

Modes              Selected MCU (2V - 3V)

Normal mode        200 µA @3V (Master Clock 1MHz, Secondary Clock up to 500KHz, Ultra low power Clock 32KHz)

Low Power mode 1   150 µA @2V (Master Clock not active, Secondary Clock 500KHz, Ultra low power Clock 32KHz)

Low Power Mode 2   75 µA @2V (only the Ultra low power Clock 32KHz)

Sleep Mode         1.2 µA

The analog signal of the heart has a frequency of around 300Hz, and the sensor output lies in the voltage range
0V - 0.5V.

Assume that the system described above:

- is supplied by a battery;

- uses a sensor that consumes 1mA @3V while the ADC is acquiring data, and 0mA while the microcontroller is in
sleep mode;

- samples the sensor continuously for 10 seconds and then sleeps for 990 seconds.

Compute the following


1. the duty cycle of the application.

2. The power consumption of the system (Microcontroller+Sensor) in the active period (using the normal mode) and
in the sleeping period according to the DC. Afterward, evaluate the energy consumed for a single period T =
Tactive+Tsleep

3. Estimate the lifetime of the system with a 100mAh battery @3.7V.

4. Minimize the energy consumption during the acquisition using one of the two available low power modes and
comment on the decision. Evaluate the lifetime extension and the minimal sampling frequency.
Solution

1) The duty cycle follows directly from the timing:

• T = Tact + Tsleep -> Duty cycle D = Tact/T * 100%

where T = 10s + 990s = 1000s, so D = 10s/1000s * 100% = 1%

2)

• Power = Voltage * Current [W]

Power in acquisition = 600uW (Microcontroller) + 3mW (Sensor) = 3.6mW; Power in Sleep = 3.6uW (Microcontroller) + 0mW (Sensor)

• Energy = Power * time in seconds [J]

• Energy over one duty-cycle period T: E = Pactive*Tact + Psleep*Tsleep

E = 3.6mW*10s + 3.6uW*990s = 36mJ + 3.5mJ = 39.5mJ
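
The arithmetic of points 1 and 2 can be reproduced with a few helper functions. This is a minimal sketch, assuming the sleep current of 1.2 µA is drawn at 3 V (which is what the 3.6 µW figure implies); the function names are hypothetical.

```c
#include <assert.h>

/* Power in watts from a supply voltage (V) and a current draw (A). */
double power_w(double volts, double amps)
{
    return volts * amps;
}

/* Duty cycle in percent: D = Tact / (Tact + Tsleep) * 100. */
double duty_cycle_percent(double t_act_s, double t_sleep_s)
{
    assert(t_act_s + t_sleep_s > 0.0);
    return t_act_s / (t_act_s + t_sleep_s) * 100.0;
}

/* Energy in joules for one period: E = Pactive*Tact + Psleep*Tsleep. */
double energy_per_period_j(double p_act_w, double t_act_s,
                           double p_sleep_w, double t_sleep_s)
{
    return p_act_w * t_act_s + p_sleep_w * t_sleep_s;
}
```

With Pactive = power_w(3.0, 200e-6) + power_w(3.0, 1e-3) and Psleep = power_w(3.0, 1.2e-6), energy_per_period_j(Pactive, 10.0, Psleep, 990.0) evaluates to roughly 39.56 mJ, which the solution reports as 39.5mJ.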

3) Lifetime estimation.

• LifeTime (s) = (Ebattery / Etot) * T

EBattery = 100mAh * 3600s/h * 3.7V = 0.1A * 3600s * 3.7V = 1332J

Lifetime (s) = (1332J / 39.5mJ) * 1000s = 33721519s => about 390 days.
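
The lifetime estimate can be checked numerically. The sketch below is illustrative only: it assumes an ideal battery (full capacity usable, no self-discharge or converter losses), and the function names are hypothetical. Evaluated exactly, 100 mAh * 3600 s/h * 3.7 V gives 1332 J.

```c
#include <assert.h>

/* Battery energy in joules from capacity (mAh) and nominal voltage (V):
 * mAh -> Ah (divide by 1000), Ah -> coulombs (times 3600), then times volts. */
double battery_energy_j(double capacity_mah, double volts)
{
    return capacity_mah / 1000.0 * 3600.0 * volts;
}

/* Lifetime in seconds: number of periods the battery can supply, times T. */
double lifetime_s(double e_battery_j, double e_period_j, double period_s)
{
    assert(e_period_j > 0.0);
    return e_battery_j / e_period_j * period_s;
}
```

Dividing lifetime_s(battery_energy_j(100.0, 3.7), 0.0395, 1000.0) by 86400 converts the result to days and gives a little over 390 days.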

4) The best and only option to reduce the energy consumption is to use Low Power Mode 1, as the ADC
can also be clocked by the secondary clock during the acquisition. Low Power Mode 2 cannot be used,
because none of the clock sources for the ADC would be active. Moreover, the supply voltage can be lowered
to 2V to reduce the power consumption even further.

In this condition

Power in acquisition = 300uW (Microcontroller) + 3mW (Sensor) = 3.3mW; Power in Sleep = 3.6uW

E = 33mJ + 3.5mJ = 36.5mJ

Lifetime (s) = (1332J / 36.5mJ) * 1000s = 36493151s => about 422 days.
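
Point 4 also asks for the lifetime extension and the minimal sampling frequency. A hedged sketch of both follows: the relative extension depends only on the ratio of the two per-period energies (the battery energy cancels out), and the minimal sampling frequency is taken from the Nyquist criterion, assuming that "around 300Hz" is the highest frequency of interest in the signal. The function names are hypothetical.

```c
#include <assert.h>

/* Relative lifetime extension in percent when the energy per period
 * drops from e_old_j to e_new_j. */
double lifetime_extension_percent(double e_old_j, double e_new_j)
{
    assert(e_new_j > 0.0);
    return (e_old_j / e_new_j - 1.0) * 100.0;
}

/* Nyquist criterion: sample at least twice the highest signal frequency. */
double min_sampling_rate_hz(double f_signal_hz)
{
    return 2.0 * f_signal_hz;
}
```

With 39.5mJ in normal mode and 36.5mJ in Low Power Mode 1, the extension is about 8.2%. For a 300Hz signal the minimal sampling frequency is 600Hz, far below the 500KHz secondary clock that remains available in Low Power Mode 1.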


EXERCISE III
1) Consider the following program, which uses the Monte Carlo method to compute the value of π (pi)

#include <stdlib.h>

int main(void)
{
    int i, niter = 1000000;
    double x, y, z, pi;
    double count = 0; // number of points in the 1st quadrant of unit circle

    for (i=0; i<niter; i++) // repeat for a very large number of iterations
    {
        x = (double)rand() / RAND_MAX; // Select random value for x in [0,1]
        y = (double)rand() / RAND_MAX; // Select random value for y in [0,1]
        z = (x*x)+(y*y);
        if (z<=1) count++; // if x^2+y^2 <= 1 this point belongs to the circle
    }

    pi = count / niter * 4;
}

1a) Describe how the main loop can be parallelized. Explain which data needs to be declared as private and
which as shared and why.

1b) Would you use dynamic scheduling or static scheduling for this loop? Explain why.

1c) Executing this loop in parallel is subject to a race condition. Explain how this can be protected. Could a
reduction clause be used? If so, how?

Solution

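1a) The iterations are independent of each other, so the loop can be split among threads with a parallel for
directive. count must be declared as shared, since every thread may increase its value. x, y and z must be
private to each thread, because each iteration computes its own values and these must not be overwritten by
other threads. The loop index i is made private automatically.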

#include <stdlib.h>

int main(void)
{
    int i, niter = 1000000;
    double x, y, z, pi;
    double count = 0; // number of points in the 1st quadrant of unit circle

    #pragma omp parallel for shared (count) private(x,y,z)
    for (i=0; i<niter; i++) // repeat for a very large number of iterations
    {
        x = (double)rand() / RAND_MAX; // Select random value for x in [0,1]
        y = (double)rand() / RAND_MAX; // Select random value for y in [0,1]
        z = (x*x)+(y*y);
        if (z<=1) count++; // if x^2+y^2 <= 1 this point belongs to the circle
    }

    pi = count / niter * 4;
}

1b) Dynamic scheduling is useful when loop iterations contain different amounts of work. In that case, assigning the
same number of iterations to each thread (which is what static scheduling does) might lead to load imbalance. On
the other hand, dynamic scheduling implies a higher runtime overhead than static scheduling.
The loop considered in this example has very little variance (the iterations for which the condition (z<=1) evaluates
to true execute one additional increment operation), so static scheduling is the natural choice and dynamic
scheduling would not bring significant improvements. The loop contains enough work to amortize the runtime
overhead of dynamic scheduling, so using it would not be harmful either.

1c) When the loop is executed in parallel, several threads might try to update count at the same time. To avoid the
race condition, the sequence i) reading count, ii) computing its new value and iii) writing it back to memory must be
performed atomically (i.e., it must not be interleaved with updates from other threads). The simplest way to do so in
OpenMP is to protect the update with a critical section:

#pragma omp parallel for shared (count) private(x,y,z)
for (i=0; i<niter; i++) // repeat for a very large number of iterations
{
    x = (double)rand() / RAND_MAX; // Select random value for x in [0,1]
    y = (double)rand() / RAND_MAX; // Select random value for y in [0,1]
    z = (x*x)+(y*y);
    if (z<=1) // if x^2+y^2 <= 1 this point belongs to the circle
    {
        #pragma omp critical
        count++;
    }
}

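The race condition can also be avoided with a reduction clause: each thread accumulates into its own private copy of
count, and the private copies are summed into the original variable at the end of the loop: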

// count appears only in the reduction clause: a variable cannot be both shared and a reduction variable
#pragma omp parallel for private(x,y,z) reduction(+:count)
for (i=0; i<niter; i++) // repeat for a very large number of iterations
{
    x = (double)rand() / RAND_MAX; // Select random value for x in [0,1]
    y = (double)rand() / RAND_MAX; // Select random value for y in [0,1]
    z = (x*x)+(y*y);
    if (z<=1) count++; // if x^2+y^2 <= 1 this point belongs to the circle
}
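
A self-contained variant of the reduction approach is sketched below. It is not the exam's code: rand() keeps hidden global state and the C standard does not guarantee that concurrent calls are free of data races, so this sketch swaps in a stateless generator that hashes the iteration index (a SplitMix64-style mixing function). The names uniform01 and estimate_pi are hypothetical, and schedule(static) reflects the choice discussed in 1b. If the compiler is invoked without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stdint.h>

/* Stateless pseudo-random value in [0, 1) derived from a 64-bit counter.
 * There is no shared state, so threads can call it concurrently. */
double uniform01(uint64_t counter)
{
    uint64_t z = counter + 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) / 9007199254740992.0; /* top 53 bits / 2^53 */
}

/* Monte Carlo estimate of pi using an OpenMP reduction on count. */
double estimate_pi(long niter)
{
    assert(niter > 0);

    long count = 0; /* points that fall inside the quarter circle */
    long i;

    #pragma omp parallel for schedule(static) reduction(+:count)
    for (i = 0; i < niter; i++) {
        /* x and y are declared inside the loop body, so they are private */
        double x = uniform01(2 * (uint64_t)i);
        double y = uniform01(2 * (uint64_t)i + 1);
        if (x * x + y * y <= 1.0)
            count++; /* updates this thread's private copy */
    }

    return 4.0 * (double)count / (double)niter;
}
```

Because every random value depends only on the iteration index, estimate_pi(1000000) returns the same estimate regardless of the number of threads, which makes the parallel version easy to compare against a serial run.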
