Comparative Analysis of Different Clock Gating Techniques
Abstract:
In IC design, power dissipation is an important parameter, which makes low-power circuits a necessity in modern VLSI design. Various techniques have been devised for low-power IC design. Among them, clock gating is one of the most widely used and provides a very effective way to reduce dynamic power dissipation. Researchers have modified clock gating in many different ways. This paper presents a comparative power analysis of a clock divider circuit implemented with different clock gating techniques.
CHAPTER 1
INTRODUCTION
Before about 1990-92, designers focused mainly on implementing integrated circuits (ICs) in the smallest possible area; since then they have also become aware of power consumption. Today every circuit faces the problem of power consumption, and it is especially pressing in IC design for portable devices. Many ideas have been proposed for reducing power dissipation in electronic circuits. A large part of chip power is spent in the clocking system: the timing elements such as flip-flops and latches, and the clock networks.
In IC design three main factors play an important role: area, power, and delay. A better optimization technique typically minimizes the power consumption. Design optimizations for performance maximization and power minimization involve tradeoffs between these factors.
In IC design, dynamic power arises from the switching activity of the circuit. In a clocked circuit, clock power is the main component of dynamic power consumption, so clock gating is used to reduce the clock power. Clock gating removes unnecessary clock switching in the circuit.
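The link between switching activity and dynamic power is usually summarized by the standard CMOS switching-power relation. It is a textbook formula, not a result measured in this report. The symbols are assumptions introduced here: \alpha is the activity factor, C_L the switched load capacitance, V_{DD} the supply voltage, and f the clock frequency.

```latex
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

Clock gating acts on \alpha: a register whose clock is blocked for part of the time sees fewer clock transitions, so its contribution to the dynamic power falls roughly in proportion.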
CLOCK GATING:
Clock gating has proved to be an efficient approach to low-power design in IC technology. It was introduced in the 1990s to reduce clock switching in the circuit. The technique requires extra logic to generate the gated clock signal that controls the logic cell. The gated clock is enabled only when the enable signal attains its active level (logic 0 or logic 1, depending on the scheme). This technique reduces the power as well as the area of the circuit.
The main objective of clock gating is to remove the unwanted switching activity caused by clock pulses reaching logic that is not in use. In a flip-flop, switching occurs when a node changes from 0 to 1 or from 1 to 0, and the more switching activity there is, the more power the circuit consumes. In registers, this toggling causes high power dissipation. To implement clock gating, first find the place where gating will remove the most power consumption. Then create logic at that place to generate the clock enable signal. The enable signal is deasserted when there is no activity in a particular unit, so that unit's clock is blocked whenever it does not need clock pulses. This scheme reduces dynamic power and also reduces the area of the circuit.
Figure 1 shows a simple clock gating scheme with separate clock and enable signals. In this gate-based clock gating logic, the clock and enable signals are applied to external clock gating circuitry, and the resulting gated clock is passed to the logic circuit for further processing. This scheme increases the area as well as power consumption
A simple AND-gate-based clock gating circuit has been proposed that uses only one clock and one enable signal with an AND gate; the gated clock is then passed to the logic. Some of most
The basic principle of clock gating is that when a circuit requires a clock pulse, the clock enable signal becomes 1 and the clock pulse is allowed through to that logic block. When the enable signal is 0, the block does not need the clock, so the clock to that block is blocked. In this paper we implement the clock divider circuit using different clock gating schemes and compare the results to show which scheme is better for circuit design.
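The enable-controls-clock principle described above can be sketched in Verilog as a single AND gate. This is a minimal illustration, not the circuit used in this report; the module and port names (and_clock_gate, clk, enable, gclk) are assumptions.

```verilog
// Minimal sketch of AND-gate based clock gating.
// All names here are illustrative assumptions, not taken from the report.
module and_clock_gate (
    input  wire clk,     // free-running clock
    input  wire enable,  // 1 = pass the clock, 0 = block it
    output wire gclk     // gated clock delivered to the logic block
);
    // The clock reaches the block only while enable is 1.
    // Caution: if enable changes while clk is high, gclk can glitch.
    assign gclk = clk & enable;
endmodule
```

The glitch hazard noted in the comment is the usual motivation for the latch-based and flip-flop-based variants, which hold the enable stable while the clock is high.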
CHAPTER 2
LITERATURE SURVEY
We present a new approach for the cell selection problem based on a resource sharing
updates. For the convex continuous gate sizing problem, we can prove fast polynomial
running times. This theoretical result also gives some justification to previous heuristic
multiplicative weight update methods. For the discrete cell selection problem, where voltage
thresholds can also be chosen, we employ the new algorithm heuristically and achieve
superior results on industrial benchmarks compared with one of the previously best known
algorithms, and competitive results on the ISPD 2013 benchmarks. Finally, we demonstrate
Clock gating is one of the most popular techniques used in synchronous circuits to reduce dynamic power dissipation, and it helps decrease the amount of power wasted by digital circuits. The major power consumer in an electronic product is the system clock signal, which drives the state transitions of the components and thus typically leads to switching power consumption. Several techniques have been developed to reduce power, among them power gating, clock gating, and adiabatic logic. In clock gating, the unwanted clock signal is deactivated or blocked, and low dynamic power consumption is achieved as a result. A summary of issues associated with the above clock gating techniques is presented for the research community to enable them to take
Technique:
Nowadays the DC power supply plays a very important role in the electronics industry, because every electronic gadget requires DC power to operate. Even though durable DC batteries are available to run electronic gadgets for longer, designers keep working to reduce power through new techniques such as increased parallelism and pipelining [1]. To make such batteries last longer than they otherwise would, this work concentrates on the clock gating technique for reducing power in a general-purpose microprocessor. Every microprocessor requires a clock, and all operations of a processor are timed by the clock cycle. A processor contains various blocks, but not all of them operate at the same time: some blocks are idle while others are working. Clock gating is used in this work to switch off such blocks for a short while. Whenever a particular block is not operating, its clock is disabled by clock gating. The main principle of clock gating is simply ANDing the processor clock with a gate-control signal.
4. A 90 nm leakage control transistor based clock gating for low power flip flop
applications:
The continuous growing demand of portable battery-powered electronics devices hunts for
Nano-electronic circuit design for ultra-low power applications by reducing dynamic power,
static power and short circuit power. In sequential circuit elements of an IC, a notable amount
of power dissipation occurs due to the rapid switching of high frequency clock signals, which
do not fetch any data bit or information. The needless switching of clock, during the HOLD
phase of either `logic 1' or `logic 0', may be abolished using gated clock. In this paper, we
have presented a new clock gating technique incorporating Leakage Control Transistor. The
improvised technique is employed to trigger a D-Flip Flop using 90nm PTM technology at
1.1V power supply. We have observed an impressive reduction in power, delay and latency
using the proposed gating logic, which has outsmarted the existing works. The simulation is
also performed in smaller technology nodes such as 65nm, 45nm and 32 nm to notice the
Battery-powered and hand-held devices such as laptop computers and cell phones have
improved our daily life greatly. It is required that the hardware system have to be fast and
multifunctional. But for the purpose of portability, the minimum of battery weight and the
consumption) are required. Therefore, it is urgent for designers to develop circuits and
systems that use less energy without greatly sacrificing the performance.
Power optimization of a processor can be carried out at many different levels of the design hierarchy with different methods: for example, Dynamic Voltage Scaling (DVS) at the system level, bus coding at the algorithm level, clock gating and operand isolation at the register transfer level, and transistor sizing and threshold voltage scaling at the circuit and transistor level. Power optimization can also be applied to each component of a processor. The Arithmetic and Logic Unit (ALU), which takes operands from the register file, data memory, or the ALU write-back bus, is one of these components. The basic structure of the ALU is shown in Figure 1. Because it is clocked at the highest speed and is kept busy almost 100% of the time, the ALU is one of the most power-hungry components in a processor and is often a likely location of hot spots [1]. Therefore, low-power design of the ALU can considerably reduce the total power consumption of a processor.
An ALU combines a variety of arithmetic and logic operations into a single unit. For example, a typical ALU might perform addition, subtraction, AND, OR, and XOR operations. Since the architecture of the ALU affects power consumption, delay, and area, how to organize these operations is a design problem. In this paper, we are mainly concerned with the power consumption of the ALU. Hence a proper choice of ALU architecture is needed when the design targets low power dissipation.
ALU DESIGN:
By studying processor instruction sets, we find that all the instructions the ALU performs can be accomplished through basic operations such as Addition, Subtraction, And, Or, Not, Xor and Clear. In this paper, we concentrate on the effect that ALU architecture has on power consumption. Therefore, we design an ALU that is 8 bits wide and capable of the basic
path [4]. Therefore, adders have received a lot of attention from researchers. The Ripple Carry Adder (RCA) is the earliest and most fundamental adder. It is an O(n)-time, O(n)-area adder. The Carry Look-ahead Adder (CLA) became popular due to its speed and modularity. It is an O(log n)-time, O(n log n)-area adder. Besides, the power consumption of the CLA is lower than that of the CSA, CSL, etc. [7].
As we all know, the relationship of the inputs and outputs of an adder can be expressed as
follows:
Here, G stands for carry generate, P for carry propagate, Cout for the carry output, Cin for the carry in from the lower bit, and S for the sum output. In an N-bit adder, the relationship Cout,k = f(Ak, Bk, Cout,k-1) holds. The CLA speeds up the addition by eliminating the ripple delay. The Cout expression of the CLA can then be developed as:
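For reference, the standard per-bit adder relations that match these symbol definitions, followed by the carry expansion that a look-ahead block evaluates in parallel, can be written as below. This is the textbook form and is an assumption about what the equations referred to in the text contain; the numbering used in the original may differ.

```latex
\begin{aligned}
G_k &= A_k \cdot B_k \\
P_k &= A_k \oplus B_k \\
S_k &= P_k \oplus C_{in,k} \\
C_{out,k} &= G_k + P_k \cdot C_{in,k}, \qquad C_{in,k} = C_{out,k-1} \\[4pt]
C_{out,k} &= G_k + P_k G_{k-1} + P_k P_{k-1} G_{k-2} + \dots
             + P_k P_{k-1} \cdots P_0 \, C_{in,0}
\end{aligned}
```

The last line is the recursion unrolled down to bit 0. The number of product terms grows with k, which is why the look-ahead expansion is normally limited to small slices.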
Because the architecture described in formula (5) is effective only for N less than or equal to 4 [7], we design the 8-bit adder of the ALU with a scheme that uses internal carry look-ahead within each 4-bit slice and ripple carry across the slices. Figure 2 shows the 8-bit carry look-ahead adder.
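The organization described here, 4-bit look-ahead slices with the carry rippling between them, can be sketched in Verilog as follows. It is a minimal illustration under assumed names (cla4, adder8, c4) and is not the netlist shown in Figure 2.

```verilog
// 4-bit carry look-ahead slice (illustrative names, not from the report).
module cla4 (
    input  wire [3:0] a,
    input  wire [3:0] b,
    input  wire       cin,
    output wire [3:0] sum,
    output wire       cout
);
    wire [3:0] g = a & b;   // carry generate per bit
    wire [3:0] p = a ^ b;   // carry propagate per bit
    wire [4:0] c;           // c[k] is the carry into bit k

    // Every carry is computed directly from g, p and cin (no ripple).
    assign c[0] = cin;
    assign c[1] = g[0] | (p[0] & c[0]);
    assign c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0]);
    assign c[3] = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                       | (p[2] & p[1] & p[0] & c[0]);
    assign c[4] = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                       | (p[3] & p[2] & p[1] & g[0])
                       | (p[3] & p[2] & p[1] & p[0] & c[0]);

    assign sum  = p ^ c[3:0];
    assign cout = c[4];
endmodule

// 8-bit adder: look-ahead inside each slice, ripple carry between slices.
module adder8 (
    input  wire [7:0] a,
    input  wire [7:0] b,
    input  wire       cin,
    output wire [7:0] sum,
    output wire       cout
);
    wire c4;  // carry rippling from the low slice into the high slice

    cla4 lo (.a(a[3:0]), .b(b[3:0]), .cin(cin), .sum(sum[3:0]), .cout(c4));
    cla4 hi (.a(a[7:4]), .b(b[7:4]), .cin(c4),  .sum(sum[7:4]), .cout(cout));
endmodule
```

Keeping the look-ahead to four bits bounds the width of the widest carry term at five inputs, at the cost of one ripple step between the two slices.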
A. complex structure:
In the complex structure, the logic operation unit is obtained by modifying the P and G block of the adder. The modified P and G block of the complex structure is shown in Figure 3. Signal c decides whether the ALU performs an arithmetic or a logic operation. Because the 8-bit adder of the ALU uses 4-bit CLA internally and ripple carry across blocks, the 8-bit ALU is
In this structure, a separate block performs the logic operations and a CLA adder performs the arithmetic operations. The logic operation block is shown in Figure 5 and the adder-independent structure is shown in Figure 6. For low power, we adopt the operand isolation technique. Two AND gates are added after the two operands as the
C. Chain structure
According to [6], a chain-structure ALU has a smaller area and is potentially faster than a tree structure. In the chain structure, placing a functional component at different positions in the chain may result in different power consumption. We therefore apply chain structures with different functional component placements to the Dhrystone benchmark, and get the most power efficient
are added after the two operands as the broken line frame shown in Figure 7.
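The operand isolation idea mentioned for these structures, AND gates placed on the two operands, can be sketched as below. This is a minimal illustration under assumed names (isolated_adder, op_a, op_b, add_en); the actual gating signals of the ALUs in Figures 6 and 7 are not given in the text.

```verilog
// Minimal sketch of operand isolation in front of an adder.
// All names are illustrative assumptions, not taken from the report.
module isolated_adder (
    input  wire [7:0] op_a,
    input  wire [7:0] op_b,
    input  wire       add_en,  // 1 when the adder result is actually needed
    output wire [7:0] sum
);
    // The AND gates force the adder inputs to 0 while the adder is idle,
    // so its internal nodes stop toggling when op_a or op_b change.
    wire [7:0] a_iso = op_a & {8{add_en}};
    wire [7:0] b_iso = op_b & {8{add_en}};

    assign sum = a_iso + b_iso;
endmodule
```

When add_en is 0 the output is a constant 0 regardless of the operands, which is acceptable only because the result is not used in that case.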
CHAPTER 4
PROPOSED DESIGN
In this paper a clock divider circuit is designed with each of the effective clock gating techniques. A clock divider circuit is used to divide a clock. A sequential circuit needs a clock signal in order to operate. A clock divider takes an input clock and produces an output clock whose frequency is the input frequency divided by an integer. The clock divider is also known as a pulse divider circuit. After receiving pulses at its input, the circuit passes only a fraction of the pulses
In this paper the clock divider circuit is implemented with clock gating techniques for low power design, using several effective clock-gated designs. The experimental setup uses the ModelSim simulator and the Logisim simulator, and the results of each technique are compared. The RTL schematics and circuit diagrams of each circuit are shown below. The simulated results of the clock divider with each clock-gated design are presented below as output waveforms for the corresponding input signals. A clock divider circuit without clock gating is shown in the figure below. It has two inputs, reset and clock, and one output, clock_out. Two signal count and temp is
After simulation, the results are analyzed in terms of waveforms, which are shown in the figure below. The waveforms show the response of the clock divider circuit to the corresponding input signals.
Fig.3 Simulation Result of Clock Divider Circuit
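A clock divider of the kind described, with inputs reset and clock, output clock_out, and internal signals count and temp, might be written along the following lines. The port and signal names follow the text; the division ratio, the HALF_PERIOD parameter, and the asynchronous reset are assumptions, since the original source code is not reproduced here.

```verilog
// Minimal sketch of an ungated clock divider.
// HALF_PERIOD and the reset style are assumptions, not taken from the report.
module clock_divider #(
    parameter HALF_PERIOD = 4  // clock_out toggles every 4 input cycles (divide by 8)
) (
    input  wire clock,
    input  wire reset,
    output wire clock_out
);
    reg [31:0] count;  // counts rising edges of the input clock
    reg        temp;   // holds the divided clock

    always @(posedge clock or posedge reset) begin
        if (reset) begin
            count <= 32'd0;
            temp  <= 1'b0;
        end else if (count == HALF_PERIOD - 1) begin
            count <= 32'd0;
            temp  <= ~temp;   // toggle once per half period
        end else begin
            count <= count + 32'd1;
        end
    end

    assign clock_out = temp;
endmodule
```

In the gated variants that follow, the same divider would simply receive a gated clock on its clock input instead of the free-running one.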
Now we implement the circuit using clock-gated designs. The latch-based clock gating design uses external clock gating circuitry consisting of a D latch and an AND gate to generate the gated signal. This gated signal is provided to the clock divider circuit for low power operation.
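The D latch plus AND gate arrangement described above can be sketched as follows. It is a minimal illustration of the common latch-based gating cell; the names (latch_clock_gate, en_latched) and the choice of a latch that is transparent while the clock is low are assumptions, not details confirmed by the report.

```verilog
// Minimal sketch of latch-based clock gating.
// All names are illustrative assumptions, not taken from the report.
module latch_clock_gate (
    input  wire clk,
    input  wire enable,
    output wire gclk
);
    reg en_latched;

    // Level-sensitive D latch: transparent while clk is low,
    // holds its value while clk is high.
    always @(clk or enable) begin
        if (!clk)
            en_latched <= enable;
    end

    // enable can only reach the AND gate while clk is low,
    // so a change on enable cannot chop a high clock pulse.
    assign gclk = clk & en_latched;
endmodule
```

The output gclk would drive the clock input of the divider in place of the free-running clock.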
In this section the flip-flop-based clock gating technique is implemented on the clock divider circuit. This design uses a D flip-flop to generate a gated clock, and the gated clock is provided to the clock divider circuit as its input signal to produce the output.
Fig.7 Clock Divider using Flip-flop based CG
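A flip-flop-based gating cell of this kind can be sketched as below. The text does not say which clock edge the D flip-flop uses; sampling the enable on the falling edge is an assumption made here so that the registered enable is stable while the clock is high. The names (ff_clock_gate, en_q) are also assumptions.

```verilog
// Minimal sketch of flip-flop based clock gating.
// Edge choice and names are assumptions, not taken from the report.
module ff_clock_gate (
    input  wire clk,
    input  wire enable,
    output wire gclk
);
    reg en_q;

    // Known starting value for simulation; the clock starts out blocked.
    initial en_q = 1'b0;

    // Sample enable on the falling edge so en_q only changes while clk is low.
    always @(negedge clk)
        en_q <= enable;

    assign gclk = clk & en_q;
endmodule
```

Compared with the latch-based cell, the enable must be valid before the falling edge, which gives it only about half a clock period of setup time.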
The result of flip-flop-based clock gating is shown below. The waveform result shows the
The gate-based clock gating technique is implemented on the clock divider circuit. An AND gate is used to generate the gated clock signal. The RTL schematic of the design is shown in the figure below.
The output signal for the corresponding input signal of the design is shown in the figure below.
Fig.12 Simulation results of Clock Divider with Gate based CG
AGCG is a newer clock gating implementation in which the gated signal is generated using master and slave flip-flops. There is implementation using AGCG scheme and RTL
The results of the AGCG technique are simulated with the corresponding input signals, as shown below. The waveform analysis has been done with the corresponding input signals.
The clock divider circuit is also implemented using the LACG technique, as shown in this section.
Fig.16 Clock Divider circuit using LACG
The simulation results are shown in the figure below. Simulation has been done with
XILINX Software
Xilinx Tools is a suite of software tools used for the design of digital circuits implemented using a Xilinx Field Programmable Gate Array (FPGA) or Complex Programmable Logic Device (CPLD). The design procedure consists of (a) design entry, (b) synthesis and implementation of the design, (c) functional simulation and (d) testing and verification. Digital designs can be entered in various ways using the above CAD tools: using a schematic entry tool, using a hardware description language (HDL) such as Verilog or VHDL, or a combination of both. In this lab we will only use the design flow that involves Verilog HDL.
The CAD tools enable you to design combinational and sequential circuits starting with Verilog HDL design specifications. The steps of this design procedure are listed below:
A Verilog input file in the Xilinx software environment consists of the following segments:
All your designs for this lab must be specified in the above Verilog input format. Note that the
state diagram segment does not exist for combinational logic designs.
In this lab digital designs will be implemented on the Basys2 board, which has a Xilinx Spartan3E XC3S250E FPGA in a CP132 package. This FPGA part belongs to the Spartan family of FPGAs. These devices come in a variety of packages. We will be using devices that are packaged in a 132-pin package with the following part number: XC3S250E-CP132. This FPGA is a device with about 250K system gates. Detailed information on this device is available at the Xilinx website.
3. Creating a New Project
Xilinx Tools can be started by clicking on the Project Navigator Icon on the Windows
desktop. This should open up the Project Navigator window on your screen. This window
shows (see Figure 1) the last accessed project.
Figure 1: Xilinx Project Navigator window (snapshot from Xilinx ISE software)
Select File->New Project to create a new project. This will bring up a new project window
(Figure 2) on the desktop. Fill up the necessary entries as follows:
Figure 2: New Project Initiation window (snapshot from Xilinx ISE software)
Project Location: The directory where you want to store the new project. (Note: DO NOT specify the project location as a folder on the Desktop or a folder in the Xilinx\bin directory. Your H: drive is the best place to put it. The project location path must NOT contain any spaces; e.g., C:\Nivash\TA\new lab\sample exercises\o_gate is NOT to be used.)
Figure 3: Device and Design Flow of Project (snapshot from Xilinx ISE software)
For each of the properties given below, click on the ‘value’ area and select from the list
of values that appear.
o Device Family: Family of the FPGA/CPLD used. In this laboratory we will be using the Spartan3E FPGAs.
o Device: The number of the actual device. For this lab you may enter XC3S250E (this can be found on the attached prototyping board).
o Package: The type of package with the number of pins. The Spartan FPGA used in this lab is packaged in a CP132 package.
o Speed Grade: The speed grade is “-4”.
o Synthesis Tool: XST [VHDL/Verilog]
o Simulator: The tool used to simulate and verify the functionality of the design. The Modelsim simulator is integrated in the Xilinx ISE. Hence choose “Modelsim-XE Verilog” as the simulator; the Xilinx ISE Simulator can also be used.
o Then click on NEXT to save the entries.
All project files such as schematics, netlists, Verilog files, VHDL files, etc., will be stored in
a subdirectory with the project name. A project can only have one top level HDL source file
(or schematic). Modules can be added to the project to create a modular, hierarchical design
(see Section 9).
In order to open an existing project in Xilinx Tools, select File->Open Project to show the
list of projects on the machine. Choose the project you want and click OK.
In this lab we will enter a design using a structural or RTL description using the Verilog
HDL. You can create a Verilog HDL input file (.v file) using the HDL Editor available in the
Xilinx ISE Tools (or any text editor).
Select Verilog Module and in the “File Name:” area, enter the name of the Verilog source
file you are going to create. Also make sure that the option Add to project is selected so that
the source need not be added to the project again. Then click on Next to accept the entries.
This pops up the following window (Figure 5).
Figure 6: Define Verilog Source window (snapshot from Xilinx ISE software)
In the Port Name column, enter the names of all input and output pins and specify the Direction accordingly. A Vector/Bus can be defined by entering appropriate bit numbers in the MSB/LSB columns. Then click on Next> to get a window showing all the new source information (Figure 6). If any changes are to be made, just click on <Back to go back and make changes. If everything is acceptable, click on Finish > Next > Next > Finish to continue.
Figure 7: New Project Information window (snapshot from Xilinx ISE software)
Once you click on Finish, the source file will be displayed in the sources window in the
Project Navigator (Figure 1).
If a source has to be removed, just right click on the source file in the Sources in Project window in the Project Navigator and select Remove. Then select Project -> Delete Implementation Data from the Project Navigator menu bar to remove any related files.
The source file will now be displayed in the Project Navigator window (Figure 8). The source file window can be used as a text editor to make any necessary changes to the source file. All the input/output pins will be displayed. Save your Verilog program periodically by selecting File->Save from the menu. You can also edit Verilog programs in any text editor and add them to the project directory using “Add Copy Source”.
Figure 8: Verilog Source code editor window in the Project Navigator (from Xilinx ISE
software)
A brief Verilog tutorial is available in Appendix-A. Refer to it for the language syntax and the construction of logic equations.
The generated Verilog source code template shows the module name, the list of ports, and the declaration (input/output) of each port. Combinational logic code can be added to the Verilog code after the declarations and before the endmodule line.
For example, an output z in an OR gate with inputs a and b can be described as,
assign z = a | b;
Remember that the names are case sensitive.
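Placed in a complete module, the assign statement above might look like the following sketch. The module name or_gate_assign is a hypothetical choice made for this illustration; the port names a, b and z follow the text.

```verilog
// Complete module built around the assign statement above.
// The module name or_gate_assign is a hypothetical choice.
module or_gate_assign (
    input  wire a,
    input  wire b,
    output wire z
);
    // Continuous assignment: z follows a OR b at all times.
    assign z = a | b;
endmodule
```

Because identifiers are case sensitive, writing Z or A in the assignment would refer to different, undeclared names.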
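A given logic function can be modeled in many ways in Verilog. Here is another example, in which the logic function is implemented as a truth table using a case statement:
module or_gate(a, b, z);
    input a;
    input b;
    output z;
    reg z;
    // Re-evaluate z whenever either input changes
    always @(a or b)
    begin
        // Sized binary literals are required here: a bare 10 or 11 is a
        // decimal number and would never match the 2-bit value {a,b}
        case ({a,b})
            2'b00: z = 1'b0;
            2'b01: z = 1'b1;
            2'b10: z = 1'b1;
            2'b11: z = 1'b1;
        endcase
    end
endmodule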
Suppose we want to describe an OR gate. It can be done using the logic equation as shown in
Figure 9a or using the case statement (describing the truth table) as shown in Figure 9b.
These are just two example constructs to design a logic function. Verilog offers numerous
such constructs to efficiently model designs. A brief tutorial of Verilog is available in
Appendix-A.
Figure 9: OR gate description using assign statement (snapshot from Xilinx
ISE software)
Figure 10: OR gate description using case statement (from Xilinx ISE software)
The design has to be synthesized and implemented before it can be checked for correctness by running functional simulation or downloaded onto the prototyping board. With the top-level Verilog file opened (by double-clicking that file) in the HDL editor window in the right half of the Project Navigator, and the project in Module view, the Implement Design option can be seen in the process view. The Design Entry Utilities and Generate Programming File options can also be seen in the process view. The former can be used to include user constraints, if any, and the latter will be discussed later.
To synthesize the design, double click on the Synthesize Design option in the Processes
window.
To implement the design, double click the Implement Design option in the Processes window. It will go through steps like Translate, Map and Place & Route. If any of these steps could not be done or was done with errors, an X mark is placed in front of it; otherwise a tick mark is placed after each of them to indicate successful completion. If everything is done successfully, a tick mark is placed before the Implement Design option. If there are warnings, a warning mark appears in front of the option. One can look at the warnings or errors in the Console window at the bottom of the Navigator window. Every time the design file is saved, all these marks disappear, asking for a fresh compilation.
Figure 11: Implementing the Design (snapshot from Xilinx ISE software)
The schematic diagram of the synthesized Verilog code can be viewed by double clicking View RTL Schematic under the Synthesize-XST menu in the Process window. This is a handy way to debug the code if the output does not meet our specifications on the prototype board.
By double clicking it opens the top level module showing only input(s) and output(s) as
shown below.
Figure 12: Top Level Hierarchy of the design
To check the functionality of a design, we have to apply test vectors and simulate the circuit. In order to apply test vectors, a test bench file is written. Essentially it supplies all the inputs to the module under test and checks the outputs of the module. Example: For the 2-input OR gate, the steps to generate the test bench are as follows:
In the Sources window (top left corner) right click on the file that you want to generate the test bench for and select ‘New Source’.
Provide a name for the test bench in the file name text box and select ‘Verilog test fixture’ among the file types in the list on the right side as shown in figure 11.
Figure 14: Adding test vectors to the design (snapshot from Xilinx ISE software)
Click on ‘Next’ to proceed. In the next window select the source file with which you
want to associate the test bench.
Figure 15: Associating a module to a testbench (snapshot from Xilinx ISE software)
Click on Next to proceed. In the next window click on Finish. You will now be provided with a template for your test bench. If it does not open automatically, click the radio button next to Simulation.
You should now be able to view your test bench template. The code generated would be
something like this:
module o_gate_tb_v;
    // Inputs
    reg a;
    reg b;
    // Outputs
    wire z;
    // Instantiate the Unit Under Test (UUT)
    o_gate uut (
        .a(a),
        .b(b),
        .z(z)
    );
    initial begin
        // Initialize Inputs
        a = 0;
        b = 0;
    end
endmodule
The Xilinx tool detects the inputs and outputs of the module that you are going to test and assigns them initial values. In order to test the gate completely we shall provide all the different input combinations. ‘#100’ is the time delay for which the input has to maintain the current value. After 100 units of time have elapsed, the next set of values can be assigned to the inputs. Complete the test bench as shown below:
module o_gate_tb_v;
    // Inputs
    reg a;
    reg b;
    // Outputs
    wire z;
    // Instantiate the Unit Under Test (UUT)
    o_gate uut (
        .a(a),
        .b(b),
        .z(z)
    );
    initial begin
        // Initialize Inputs
        a = 0;
        b = 0;
        // Wait 100 ns for global reset to finish
        #100;
        a = 0;
        b = 1;
        // Hold each input combination for 100 time units before the next
        #100;
        a = 1;
        b = 0;
        #100;
        a = 1;
        b = 1;
        #100;
    end
endmodule
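The test bench above only applies the stimulus; the waveform still has to be inspected by hand. A self-checking variant is sketched below as one possible extension. The o_gate definition is an assumption included so that the sketch stands alone, and the test bench name o_gate_selfcheck_tb is hypothetical.

```verilog
`timescale 1ns / 1ps

// Assumed definition of the unit under test, included so the sketch stands alone.
module o_gate (input wire a, input wire b, output wire z);
    assign z = a | b;
endmodule

// Hypothetical self-checking test bench for the 2-input OR gate.
module o_gate_selfcheck_tb;
    reg a;
    reg b;
    wire z;
    integer i;       // loops over the four input combinations
    integer errors;  // number of mismatches seen

    o_gate uut (.a(a), .b(b), .z(z));

    initial begin
        errors = 0;
        for (i = 0; i < 4; i = i + 1) begin
            {a, b} = i[1:0];   // apply the next input combination
            #100;              // hold it for 100 time units
            if (z !== (a | b)) begin
                errors = errors + 1;
                $display("FAIL: a=%b b=%b z=%b", a, b, z);
            end
        end
        if (errors == 0)
            $display("All 4 input combinations passed");
        $finish;
    end
endmodule
```

The comparison uses !== rather than != so that an unknown (x) value on z is reported as a failure instead of being silently skipped.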
Now under the Processes window (making sure that the testbench file in the Sources window is selected) expand the ModelSim simulator tab by clicking on the plus sign next to it. Double click on Simulate Behavioral Model. You will probably receive a compiler error. This is nothing to worry about; answer “No” when asked if you wish to abort simulation. This should cause ModelSim to open. Wait for it to complete execution. If you wish to not receive the compiler error, right click on Simulate Behavioral Model and select process properties. Mark the
Figure 16: Simulating the design (snapshot from Xilinx ISE software)
To save the simulation results, Go to the waveform window of the Modelsim simulator,
Click on File -> Print to Postscript -> give desired filename and location.
Else a normal print screen option can be used on the waveform window and subsequently
stored in Paint.
Figure 17: Behavioral Simulation output Waveform (Snapshot from
ModelSim)
For taking printouts for the lab reports, convert the black background to white in Tools ->
Edit Preferences. Then click Wave Windows -> Wave Background attribute.
CONCLUSION
The result analysis shows that the latch-based CG and LACG implementations are the most efficient for power optimization among the CG implementations considered, whereas the gate-based clock gating implementation gives better results for area and delay optimization. The gate-based CG scheme also reduces the complexity and cost of the design.
REFERENCES
[1]. Paliwal, P., Sharma, J. B., & Nath, V. (2019). Comparative study on FFA architectures
[2]. S. Daboul, N. Hahnle, S. Held, and U. Schorr, “Provably Fast and Near Optimum Gate
Sizing,” in IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol.
[3]. Barman, J., & Kumar, V. (2018, May). Approximate Carry Look Ahead Adder (CLA)
[5]. Khushbu Chandrakar, Dr. Suchismita Roy, (2017), “A SAT-based Methodology for
[6]. Pritam Bhattacharjee, Alak Majumder, Tushar Dhabal Das, (2016), “A 90 nm Leakage
Control Transistor Based Clock Gating for Low Power Flip Flop Applications”, in 59th
[7]. R Keerthi Kiran, Dr. A B Kalpana, (2015), “Low Power 8, 16 & 32 bit ALU Design
Using Clock Gating”, in International Journal of Scientific & Engineering Research, Volume