Software Design For Low Power
Gokul B S
Sources of Software Power Dissipation (SPD)
Memory system: takes 1/10th to 1/4th of the power budget. More significant in power-sensitive DSP applications such as video processing.
System buses with large switching activity.
Data paths in ALUs and FPUs.
Control logic and clock distribution.
A program's energy dissipation is proportional to the number of execution cycles of the program.
power estimation tools.
Higher level approach
> Estimate power based on frequency of execution of instruction sequence.
Gate level Power Estimation
> Most accurate method available, assuming a detailed gate level description is available.
> Too slow for low power optimization, but more important in evaluating the power dissipation behavior of a processor design.
contd...
> Less precise but much faster.
> Is implemented in a power estimation simulator called ESP (Early design Stage Power and performance simulator).
Bus Switching Activity
> Bus activity is assumed to be representative of the overall switching activity in the processor.
> Requires knowledge about the architecture of the processor, opcodes for the instruction set, input data to a program, etc.
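The bus switching measure above can be illustrated with a small sketch. It assumes a 32-bit bus and a recorded trace of the words driven onto it; the function names `popcount32` and `bus_transitions` are hypothetical, and the count is only a relative activity figure, not a power value.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the 1 bits in a 32-bit word (portable, no compiler builtins). */
static unsigned popcount32(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Total number of bus-line transitions when the words in trace[] are
 * driven onto a 32-bit bus one after another. A line toggles whenever
 * the corresponding bit differs between two consecutive words, so the
 * per-step count is the Hamming distance of the two words. */
unsigned long bus_transitions(const uint32_t *trace, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 1; i < n; i++)
        total += popcount32(trace[i - 1] ^ trace[i]);
    return total;
}
```

Calling `bus_transitions` on the opcode or data trace of two candidate instruction sequences gives a rough way to compare their bus activity, under the assumption stated above that bus activity is representative.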
contd...
With the help of a for loop:
1. for (i = 0; i < n; i++)
       if (i % 2 == 0)
           sum_even += i;
2. sum_even += i;
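A minimal sketch of the two loop variants as complete C functions. The loop header of the second variant is not preserved on the slide; stepping `i` by two is an assumption made here for illustration, and the function names are hypothetical.

```c
/* Variant 1: visit every i and test its parity.
 * Runs n iterations and performs n modulo tests. */
long sum_even_with_test(long n)
{
    long sum_even = 0;
    for (long i = 0; i < n; i++)
        if (i % 2 == 0)
            sum_even += i;
    return sum_even;
}

/* Variant 2 (assumed form): step by two so that only even values are
 * visited. Runs about n/2 iterations and needs no parity test. */
long sum_even_stride2(long n)
{
    long sum_even = 0;
    for (long i = 0; i < n; i += 2)
        sum_even += i;
    return sum_even;
}
```

Both functions return the same sum, but the second executes fewer instructions, which matters if energy dissipation is taken to be proportional to the number of execution cycles.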
must always be to design an algorithm that maps well to available hardware and is efficient for the problem at hand in terms of both time and storage complexity.
Algorithm Computations to Match Computational Resources
In parallel processor applications, a typical problem is to structure software in a way that maximizes the available parallelism.
program execution. In low-power DSP synthesis, a typical problem is to design an algorithm to allow a circuit implementation that minimizes power dissipation given throughput and area constraints. Often a low-power DSP design will also exploit parallelism in an algorithm, but the objective is to shorten critical paths so that supply voltages can be lowered while maintaining overall performance.
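The trade of parallelism for a lower supply voltage can be made concrete with the standard CMOS switching-power relation P = C * V^2 * f. The sketch below is an idealized model, not a figure from the slides: it assumes two identical units each clocked at half the rate (so throughput is unchanged), ignores the extra routing and multiplexing capacitance, and does not model how low the voltage can actually go. The function names and any voltages passed in are hypothetical.

```c
/* Dynamic (switching) power of a CMOS circuit: P = C * V^2 * f. */
double dynamic_power(double c_eff, double vdd, double freq)
{
    return c_eff * vdd * vdd * freq;
}

/* Power of a two-way parallel design relative to the single-unit
 * reference. The parallel design has twice the switched capacitance,
 * runs each unit at half the clock frequency, and uses the lower
 * supply voltage vdd_low instead of vdd_ref. */
double parallel_power_ratio(double vdd_ref, double vdd_low)
{
    const double c = 1.0;  /* normalised capacitance of one unit */
    const double f = 1.0;  /* normalised reference clock frequency */
    double p_ref = dynamic_power(c, vdd_ref, f);
    double p_par = dynamic_power(2.0 * c, vdd_low, f / 2.0);
    return p_par / p_ref;
}
```

Under these assumptions the ratio reduces to (vdd_low / vdd_ref)^2, so the saving comes entirely from the reduced supply voltage that the relaxed timing permits.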
If only one adder is available, then Figure 8.1 is a sensible approach. Parallelizing the summation would only force us to use additional registers to store intermediate sums.
One Adder
in Figure 8.2 makes sense because it permits two additions to be performed simultaneously.
Two Adders
of an algorithm quite so conveniently. However, the principle is still applicable. The basic principle is to try to match the degree of parallelism in an algorithm to the number of parallel resources available.
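The one-adder and two-adder cases can be sketched in software as follows. This is an illustration only, not a reproduction of Figures 8.1 and 8.2; the function names are hypothetical, and whether the two partial sums actually execute in parallel depends on the target hardware and compiler.

```c
#include <stddef.h>

/* One adder: a single running sum. Each addition depends on the result
 * of the previous one, so nothing can overlap, and no extra registers
 * are needed for intermediate sums. */
long sum_serial(const int *x, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Two adders: two independent partial sums, each held in its own
 * register. The two additions in the loop body do not depend on each
 * other, so hardware with two adders can perform them simultaneously. */
long sum_two_way(const int *x, size_t n)
{
    long even = 0, odd = 0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        even += x[i];
        odd += x[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        even += x[i];
    return even + odd;
}
```

On a machine with a single adder the second version gains nothing and only costs the extra register for the second partial sum, which is the mismatch the principle above warns against.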
References
1. Kaushik Roy and Sharat C. Prasad, Low-Power CMOS VLSI Circuit Design, Wiley Student Edition, 2009.
2. http://uploading.com/files/get/39f8aa41/
THANK YOU