Mos Vlsi Report Final
• Algorithm used
This method optimizes the conventional multiplier by recoding the two's-complement multiplier,
reducing the number of partial products for faster operation and lower hardware usage.
The modified Booth (radix-4) algorithm segments the multiplier into overlapping 3-bit groups,
each decoded to generate one partial product. Let Y be the multiplier and X the
multiplicand.
In two's complement, the multiplier Y is written, following [2], as:
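For reference, the textbook two's-complement expansion of an n-bit multiplier (n even, bits y_{n-1} ... y_0) and its regrouping into radix-4 digits can be sketched as follows. This is the standard form and is assumed, not confirmed, to match the notation used in [2]; the appended bit y_{-1} = 0 is the usual convention.

```latex
Y \;=\; -\,y_{n-1}\,2^{\,n-1} \;+\; \sum_{i=0}^{n-2} y_i\,2^{\,i}
  \;=\; \sum_{i=0}^{n/2-1} \bigl(-2\,y_{2i+1} + y_{2i} + y_{2i-1}\bigr)\,4^{\,i},
\qquad y_{-1} = 0
```

Each bracketed term takes one of the values -2, -1, 0, +1, +2, which is where the five recoded digits come from.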
Using the modified Booth algorithm, the recoded version of Y produces a set of five signed
digits: -2, -1, 0, +1, and +2. Each of these recoded digits, as shown in Table 2.1, plays a
distinct role in the multiplication process with X.
Y2i+1   Y2i   Y2i-1   Recoded digit   Operation on X
0       0     0        0               0*X
0       0     1       +1              +1*X
0       1     0       +1              +1*X
0       1     1       +2              +2*X
1       0     0       -2              -2*X
1       0     1       -1              -1*X
1       1     0       -1              -1*X
1       1     1        0               0*X
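The recoding can be sketched in a few lines of Python. This is a minimal behavioural model, not the report's circuit: the names BOOTH_DIGIT and booth_recode are hypothetical, and the table uses the standard radix-4 assignment in which the group 100 maps to -2 and 110 maps to -1.

```python
# Standard radix-4 Booth recoding: (y_{2i+1}, y_{2i}, y_{2i-1}) -> signed digit.
BOOTH_DIGIT = {
    (0, 0, 0): 0,
    (0, 0, 1): 1,
    (0, 1, 0): 1,
    (0, 1, 1): 2,
    (1, 0, 0): -2,
    (1, 0, 1): -1,
    (1, 1, 0): -1,
    (1, 1, 1): 0,
}


def booth_recode(y, n_bits=8):
    """Return the Booth digits of y, least significant digit first.

    y is read as an n_bits-wide two's-complement value; n_bits must be even.
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even")
    bits = [(y >> k) & 1 for k in range(n_bits)]  # y_0 .. y_{n-1}
    digits = []
    prev = 0  # y_{-1}: the implicit 0 to the right of the LSB
    for i in range(0, n_bits, 2):
        digits.append(BOOTH_DIGIT[(bits[i + 1], bits[i], prev)])
        prev = bits[i + 1]  # overlap: top bit of this group is the bottom bit of the next
    return digits
```

For an 8-bit multiplier this yields four digits, so four partial products replace the eight of a conventional array multiplier.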
Multiplier bits Y are grouped into overlapping 3-bit sets, each facilitating the calculation of a
specific partial product. These five multiples of the multiplicand are generated as shown in
Table 2.2 from the recoding of the multiplier Y. The process of generating the partial product
is shown below.
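As an arithmetic illustration of partial product generation, the sketch below forms one shifted partial product per overlapping 3-bit group and sums them. It models values only, not the n + 1 bit hardware representation, and the function names are hypothetical.

```python
def booth_partial_products(x, y, n_bits=8):
    """Return the shifted partial products d_i * X * 4**i for signed x and y."""
    products = []
    prev = 0  # y_{-1}
    for i in range(n_bits // 2):
        b0 = (y >> (2 * i)) & 1        # y_{2i}
        b1 = (y >> (2 * i + 1)) & 1    # y_{2i+1}
        digit = -2 * b1 + b0 + prev    # one of -2, -1, 0, +1, +2
        products.append(digit * x * 4 ** i)
        prev = b1
    return products


def booth_multiply(x, y, n_bits=8):
    """Multiply two signed n_bits-wide integers by summing Booth partial products."""
    return sum(booth_partial_products(x, y, n_bits))
```

For example, with X = 3 and Y = 105 the digits are +1, -2, -1, +2, giving partial products 3, -24, -48, and 384, which sum to 315.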
Table 2.3 shows the relationship between the generated partial product (Pi) and the
multiplicand, with Pn representing the sign bit, where Pn = Pn-1 when no partial product shifting
occurs. Notably, the partial product is represented with n + 1 bits.
For the second partial product, the given expression applies if the first partial product is
positive; otherwise, two different cases must be considered, as shown in [2].
To indicate sign propagation efficiently, a flag bit F signals whether a previous partial
product had a negative sign bit that must be propagated. In the example of Figure 2.1, F0 = 0
(there is no partial product before the first one) and F1 = F2 = F3 = 1, indicating sign
propagation from the first partial product to all subsequent ones. This flag can be expressed
by the Boolean equation given in [2], written in terms of the sign bit of the jth partial
product.
• DVDD – 1.2 V
• DGND – 0 V
2.3 Architecture
• The circuit and description of the above-mentioned blocks are shown below:
1. Booth Encoder
The Booth encoder is an essential component of the Booth algorithm used in binary
multiplication. It encodes three bits of the multiplier, handles sign extension, and generates
control signals to guide the multiplier operation. A high-level description of the Booth
encoder's logic and implementation is discussed below.
• Input Bits:
𝑌2𝑖−1 : The least significant bit of the current group of three bits.
𝑌2𝑖 : The middle bit of the current group of three bits.
𝑌2𝑖+1 : The most significant bit of the current group of three bits.
• Complement Bits:
𝑌2𝑖−1′, 𝑌2𝑖′, 𝑌2𝑖+1′ : The complements of the input bits.
• Sign Extension:
Sign extension logic ensures that the Booth encoding considers the signs of the bits. Depending
on the encoding scheme (2's complement or others), the sign bits may need to be extended.
This is typically done by copying the most significant bit (𝑌2𝑖+1 ) to the left to fill the sign bits.
• Encoding Logic:
The Booth encoder's primary function is to generate control signals that guide the multiplier
operation, such as selecting 0, X, or 2X, or the complement of X or 2X, as the partial product.
The encoding logic examines the three input bits and generates one of several possible
encoding values based on the Booth algorithm's rules.
For example, it can output 00, 01, 10, or 11 to represent different multiplication operations.
The encoding is often implemented using a truth table or a combination of logic gates,
depending on the specific implementation requirements.
Figure 2.3: Booth Encoder
• 0x
• 1x
• -1x
• -2x
• 𝑷𝑷𝒏+𝟏
2. PP-MUX
The basic function of the PP-MUX is to select one of 𝑋𝑖−1′, 𝑋𝑖′, 𝑋𝑖, or 𝑋𝑖−1 depending on the
control signal provided (0x, 1x, 2x, -1x, or -2x). These signals are generated from the
recoded bits of multiplier Y, as described above in the encoder section.
The PP-MUX chooses the appropriate input bit based on the current recoded multiplier bits, so
that the correct partial product is generated for each stage of the multiplication.
Figure 2.9 shows the circuit implementation of the PP-MUX, including the logic and control
signals used to make the selection.
Figure 2.11: PP-MUX Architecture
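The bit selection performed by the PP-MUX can be sketched behaviourally as follows. Several points here are assumptions made for illustration and are not taken from the report's schematic: the mapping 1x to X_i, 2x to X_{i-1}, -1x to X_i', and -2x to X_{i-1}' (which follows from a one-bit left shift), sign extension of X to n + 1 bits, X_{-1} = 0, and the hypothetical names pp_mux_bit and pp_row. The "+1" needed to complete the two's complement of a negative multiple is returned separately, in line with the role of the Add Cell.

```python
def pp_mux_bit(x_i, x_im1, digit):
    """Select one partial-product bit from X_i, X_{i-1}, or their complements."""
    if digit == 0:
        return 0
    if digit == 1:
        return x_i
    if digit == 2:
        return x_im1
    if digit == -1:
        return 1 - x_i
    if digit == -2:
        return 1 - x_im1
    raise ValueError("digit must be one of -2, -1, 0, 1, 2")


def pp_row(x, digit, n_bits=8):
    """Return (row, plus_one) for one partial product.

    row is the (n_bits + 1)-bit pattern chosen by the mux; plus_one is the
    correction bit that must still be added when the digit is negative.
    """
    width = n_bits + 1
    xb = [(x >> k) & 1 for k in range(width)]  # X sign-extended to n_bits + 1 bits
    row = 0
    for i in range(width):
        x_im1 = xb[i - 1] if i > 0 else 0      # assumed X_{-1} = 0
        row |= pp_mux_bit(xb[i], x_im1, digit) << i
    plus_one = 1 if digit < 0 else 0
    return row, plus_one
```

Adding plus_one to row reproduces digit * X modulo 2^(n+1), which is the sense in which the inverted bits plus the added one form the negative multiple.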
3. Multiplier Cell (PP-FA)
The primary responsibility of a multiplier cell is to compute and produce a single bit of the
partial product.
The generated partial product bit is added to the cumulative sum, which has propagated from
preceding cells. This addition operation ensures the progressive accumulation of the product
bits towards the final result.
The multiplier cell comprises two integral components:
• Partial Product Multiplexer (PP-MUX): The PP-MUX plays a pivotal role in the
generation of the partial product bit. It selects the appropriate input bit, either from the
multiplicand or its complement, based on the control signals provided by the multiplier
algorithm. The chosen input is then subjected to the multiplication operation, and the
resulting bit is directed towards further processing.
• Adder Unit (Full Adder or Half Adder): Depending on the specific design and
operational requirements, the multiplier cell is equipped with either a Full Adder (FA) or a
Half Adder (HA). This component is responsible for the addition of the generated partial
product bit to the cumulative sum, effectively updating the sum for subsequent stages of
multiplication.
In the initial row of the multiplier, which corresponds to the least significant bits of the
multiplication, only the generation of partial products is necessary. Consequently, each cell
in this row consists of a PP-MUX circuit alone, with no adder component.
Figure 2.10 provides a block diagram of the architecture and interconnection of components
within the multiplier cell. The cell generates the partial product bits and accumulates them
systematically to produce the final product.
4. 8-bit Adder
• The 8-bit Adder block uses a ripple-carry architecture, with specific optimizations applied
to the individual Full Adder blocks.
• Each Full Adder is built with mGDI (Modified Gate Diffusion Input) logic, which is used to
implement the XOR gates and the mGDI basic cell that generate the sum and carry signals.
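A behavioural sketch of the ripple-carry structure is given below. It models only the logic (two XORs for the sum and a carry term per bit) and says nothing about the transistor-level mGDI implementation; the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder built around two XOR operations."""
    p = a ^ b                   # first XOR: propagate term
    s = p ^ cin                 # second XOR: sum bit
    cout = (a & b) | (p & cin)  # carry out
    return s, cout


def ripple_carry_add8(a, b, cin=0):
    """Add two 8-bit values bit by bit; returns (8-bit sum, carry out)."""
    total = 0
    carry = cin
    for k in range(8):
        s, carry = full_adder((a >> k) & 1, (b >> k) & 1, carry)
        total |= s << k         # the carry ripples into the next bit position
    return total, carry
```

Two such adders can be chained into a 16-bit adder by feeding the carry out of the low byte into the cin of the high byte. The carry must ripple through every stage, which is why the adder contributes to the critical path.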
mGDI method
• The mGDI (Modified Gate Diffusion Input) method is based on a fundamental cell, as
depicted in Figure 2.11. At first glance, this basic cell resembles the standard CMOS
inverter, but it differs from it in several important respects.
• The mGDI basic cell can implement a wide range of functions using only two transistors,
which gives this technique an edge over a normal static CMOS implementation.
• The individual blocks involved in the construction of the 8-bit Adder are shown below:
5. Add Cell
To obtain the two's complement of a binary number, a systematic procedure is followed.
Initially, the binary number is inverted, and subsequently, a logical "1" is added to the inverted
result. The Add Cell is responsible for generating the required "1" to be added, as and when
necessary. This process adheres to established principles of binary arithmetic and is commonly
employed in digital computation and representation. Figure 2.18 shows the architecture.
6. Tapered Buffer
• A tapered buffer, also known as a taper buffer or simply a buffer, is an electronic circuit
commonly used in integrated circuits (ICs) and semiconductor devices. Its primary function
is to condition the electrical signals passing through it. Key aspects of tapered buffers
are:
• Signal Conditioning:
Tapered buffers are used to shape or condition electrical signals. They can serve various
purposes, such as impedance matching, signal level adjustment, or driving a signal through
a long transmission line without significant loss.
• Signal Amplification:
In some cases, tapered buffers may amplify signals, ensuring they have enough strength to
be reliably processed by subsequent circuitry.
• Signal Isolation:
Tapered buffers provide isolation between different sections of a circuit, preventing
undesired interactions or interference.
3.3 Reset
N/A
4 Verification Strategy
4.1 Objectives
• Design:
The primary objective of this design is to implement an 8x8 Booth Multiplier, a crucial
component in digital arithmetic, capable of multiplying two sets of 8-bit binary numbers
efficiently. This design leverages the Booth algorithm to optimize the multiplication
process, reducing the number of partial products generated and improving computational
speed.
• Objective of Verification:
The main goal of verification is to test the functionality of each block within an integrated
circuit.
• Functionality Testing:
To verify functionality, the first step involves performing a DC (Direct Current) analysis.
This analysis aims to determine if the block generates the required output under steady-
state conditions.
• Critical Path Identification:
After confirming functionality, the next step is to identify the critical path within the block.
The critical path represents the longest delay path through the circuit, which determines the
overall circuit speed.
• Delay Calculation:
To calculate the critical path delay, a transient analysis is performed. A transient pulse
(input signal change) is applied to one of the inputs while keeping the other inputs constant.
The time difference between the generation of the output and the input transition is
measured, giving the critical path delay.
• PVT Analysis:
The same analysis process is repeated across different Process, Voltage, and Temperature
(PVT) variations. These variations include different Process corners (TT, FF, FS, SF, SS)
and multiple Voltage and Temperature variations. This comprehensive analysis across
various conditions is referred to as PVT analysis.
• Robustness Check:
PVT analysis is essential to assess the robustness of the circuit. It helps ensure that the
circuit functions correctly under different operating conditions, accounting for
manufacturing process variations, voltage fluctuations, and temperature changes.
• Post-Layout Analysis:
The entire verification and PVT analysis process is repeated after the layout of the
integrated circuit has been finalized. This is crucial because layout can impact signal
propagation delays and other factors that affect circuit performance.
In summary, this process is critical in the design and verification of digital integrated
…..circuits to ensure that they meet their specified functionality and performance criteria under
…..a range of real-world operating conditions, including variations in process technology,
…..voltage levels, and temperature.
• Block-Level Verification
This verification procedure is applied to all individual blocks within the integrated circuit. Each
block's output is connected to a load capacitor of 0.05pF to evaluate its ability to drive a specific
load. This step confirms both the functionality and output driving capability of the blocks.
• Post-Layout Extraction
The entire verification and analysis process is repeated after the layout phase, where the
physical placement and routing of components are finalized. This step is essential as layout
changes can impact signal propagation delays and other critical parameters.
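The delay calculation step described above can be sketched as a small post-processing routine on exported transient waveforms. The 50% threshold is an assumption (a common convention, not stated in the report), the 1.2 V default comes from the DVDD value, and the function names and sample format (parallel lists of times and voltages) are hypothetical.

```python
def crossing_time(times, values, threshold, rising=True):
    """Return the first time the waveform crosses threshold (linear interpolation)."""
    for k in range(1, len(times)):
        v0, v1 = values[k - 1], values[k]
        if rising:
            crossed = v0 < threshold <= v1
        else:
            crossed = v0 > threshold >= v1
        if crossed:
            frac = (threshold - v0) / (v1 - v0)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    raise ValueError("waveform never crosses the threshold")


def propagation_delay(times, vin, vout, vdd=1.2, out_rising=True):
    """Delay from the input's rising 50% crossing to the output's 50% crossing."""
    half = vdd / 2  # ASSUMPTION: delay is measured at 50% of the supply
    t_in = crossing_time(times, vin, half, rising=True)
    t_out = crossing_time(times, vout, half, rising=out_rising)
    return t_out - t_in
```

Running the same measurement on the waveforms from each corner, before and after layout, gives the figures needed for the PVT comparison.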
5 Functional Checklist
In a Booth multiplier design, Booth encoders and array cells are essential for efficient
partial product generation. Booth encoders handle encoding, decoding, and sign-extension
propagation logic. Array cells generate the partial product bits that are added to the
cumulative sum. In two's complement, negating a number involves flipping its bits and adding
one; the "ADD" cell manages the added one. The final stage is a 16-bit adder, built from two
8-bit adders, which generates the 16-bit output of the multiplier.
In summary, Booth encoders, array cells, and the associated components handle encoding,
decoding, bit generation, and addition, and the final stage produces the multiplication
result. To verify the functionality of a block, DC analysis is performed first, followed by
identification of the critical path and calculation of the critical path delay, which
determines the operating frequency of the multiplier.
6 Testbench
6.1 Overview
The verification and analysis process for digital integrated circuits is a critical phase in ensuring
the functionality, speed, and reliability of the designed circuit. This process involves a
systematic approach to validate each block and the top-level design. Here is an overview of the
key steps involved:
• Post-Layout Extraction
The entire verification and analysis process is repeated after the layout phase, where the
physical placement and routing of components are finalized. This step is essential as layout
changes can impact signal propagation delays and other critical parameters.
• Critical Path of the Top Level
Figure 6.1 provides a visualization of the critical path within the top-level structure of the 8x8
Booth Multiplier Architecture. This critical path determination involves a rigorous analysis
aimed at identifying the specific pathway that yields the most prolonged propagation delay,
consequently representing the worst-case scenario.
6.2 Architecture
• The evaluation of the circuit begins with a DC analysis, which checks whether the circuit
functions as intended or whether any issues arise.
• Next, the critical path delay is calculated; this is the worst-case propagation delay
within the circuit. The path is the one identified in Figure 6.1. To measure the delay, a
pulse is applied at the input that starts the critical path, and the delay is taken as the
time between the rising edge of the input signal and the resulting change at the circuit's
output.
• Figure 6.2 shows the testbench setup used for both the DC analysis and the
propagation-delay calculation.
• ADE_L Window for running DC Simulations for the top-level is shown in Figure 6.3
Figure 6.4: ADE_L Window for Critical Path Delay Pre Layout (Schematic)
• The ADE_L window for running transient simulations to calculate the pre-layout critical
path delay is shown in Figure 6.4.
Figure 6.5: Corner Setup
• The same analysis is repeated post-layout. After parasitic extraction of the layout, the
Calibre file of the top level is exported and the simulations are run using it.
Figure 6.7: ADE_L Window for Critical Path Estimation Post Layout Extraction
• The ADE_L window for running transient simulations to calculate the critical path delay
after post-layout extraction is shown in Figure 6.7.
Figure 6.8: ADE_XL Window for PVT Analysis Post-Layout Extraction
• Figure 6.8 shows the ADE_XL window containing the 45-corner analysis of the critical path
delay post-layout.
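A corner list of this size can be enumerated programmatically. The sketch below is illustrative only: the five process corners are those named in the report, but the voltage and temperature values are hypothetical, and the 5 x 3 x 3 split is merely one decomposition consistent with 45 corners.

```python
from itertools import product

PROCESS_CORNERS = ["TT", "FF", "FS", "SF", "SS"]  # process corners named in the report
SUPPLY_VOLTAGES = [1.08, 1.20, 1.32]  # ASSUMPTION: 1.2 V nominal with a +/-10% spread
TEMPERATURES_C = [-40, 27, 125]       # ASSUMPTION: illustrative temperatures


def pvt_corners():
    """Enumerate every (process, voltage, temperature) combination as a tuple."""
    return list(product(PROCESS_CORNERS, SUPPLY_VOLTAGES, TEMPERATURES_C))


def worst_case(delays_by_corner):
    """Return the (corner, delay) pair with the largest measured delay."""
    return max(delays_by_corner.items(), key=lambda item: item[1])
```

Given a dictionary of measured critical path delays keyed by corner tuple, worst_case picks the corner that limits the operating frequency.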
7 Tests Specifications
• All test suites and functional tests are described in detail in the previous section.
8 Design Microarchitecture
• No separate block is used for the PP-HA; instead, the Cin input of the PP-FA is grounded.
8.2 Sub-Block Description
• The following blocks are implemented in order for the booth multiplier to function:
• Booth Encoder
• PP-MUX
• PP-FA
• 8-bit Adder
• Add Cell
• Tapered Buffer
• The circuits and detailed descriptions of the above-mentioned blocks are given in
Section 2.4.
9.1 Floorplanning
Figure 9.1 provides the complete Layout view with major pins marked using TEXT layer.
A. Placement Plan:
B. Routing Plan:
• Standard cells such as the inverter, XOR, and buffer have a fixed height of 1.2um to ease
abutment in the X direction at the top level.
• PP/NP implant removed at PCELL level and custom drawn to compress standard cell
height.
• Each cell, irrespective of it being a standard cell or a complex block, has a standard
power/GND rail with 0.4um width in M4 at equal intervals.
• This creates uniformity in the top-level power/GND rail.
• Half DRC for abutting blocks maintained.
• Custom TAP cells are used instead of strips of MPPs, as substrate biasing is needed only
every 30um to avoid latch-up.
• In blocks that carry globally routed signals, those signals are left floating inside the
block so that a distinct track forms when multiple such blocks are instantiated at the top
level.
• This section aims to provide an insight into the performance of the 8x8 Booth Multiplier
concerning speed across diverse process variations, voltage levels, and temperature
conditions.
• Additionally, it will offer an overview of the utilized area, both at the individual block level
and within the top-level layout.
• This comprehensive analysis forms a critical part of the assessment, enabling a thorough
understanding of the multiplier's behaviour under various operational scenarios and its
corresponding physical footprint.
10.1 Area
• In this section, a detailed examination of the critical path delay and the resulting worst-case
operating frequency is presented.
• The evaluations are conducted at both the typical corner (TT) and the most challenging
corner (SS) stages, both pre and post layout implementation.
• This comprehensive assessment offers valuable insights into the system's performance
under varying conditions, facilitating a comprehensive understanding of its operational
capabilities and potential optimizations.
• This analysis encompasses all individual blocks within the architecture and extends to the
entire system as shown below:
1. Booth Encoder
Load Capacitance used: 50fF
2. Multiplexer
Load Capacitance used: 30fF
3. Add Cell
Load Capacitance used: 20fF
6. Tapered Buffer
Load Capacitance used: 100fF
Table 10.8: Critical Path Delay for 8-bit Booth Multiplier Block
Conclusion
• The data presented in the table above reveals noteworthy performance characteristics.
Specifically, at the typical corner (TT), the pre-layout speed attains a commendable 3.418
GHz, while at the slow corner (SS), it registers at 1.935 GHz, with an associated output
load of 0.1pF.
• Furthermore, the post-layout assessment indicates that at the typical corner (TT), the speed
stands at 1.524 GHz, while at the slow corner (SS), it operates at 0.886 GHz under the same
0.1pF output load condition.
• The project specification sets the minimum required operating frequency at 500MHz. All
measured speeds, including the worst case of 0.886 GHz at the post-layout slow corner (SS),
exceed this requirement, so the design and layout meet the project's performance target.
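The comparison against the specification can be checked with a few lines of arithmetic. The frequencies and the 500 MHz limit are the values reported above; treating the operating frequency as the reciprocal of the critical path delay is an assumption of this sketch, and the names are hypothetical.

```python
SPEC_MIN_HZ = 500e6  # minimum operating frequency from the project specification

# Reported maximum operating frequencies in Hz, at a 0.1 pF output load.
REPORTED_HZ = {
    ("pre-layout", "TT"): 3.418e9,
    ("pre-layout", "SS"): 1.935e9,
    ("post-layout", "TT"): 1.524e9,
    ("post-layout", "SS"): 0.886e9,
}


def period_ps(freq_hz):
    """Clock period in picoseconds, assuming f = 1 / critical path delay."""
    return 1e12 / freq_hz


def margin(freq_hz, spec_hz=SPEC_MIN_HZ):
    """Ratio of achieved frequency to the required minimum (above 1 means the spec is met)."""
    return freq_hz / spec_hz


if __name__ == "__main__":
    for (stage, corner), f in REPORTED_HZ.items():
        print(f"{stage:11s} {corner}: {period_ps(f):8.1f} ps, margin {margin(f):.2f}x")
```

Even the slowest case, post-layout at the SS corner, keeps a margin of about 1.77 times the required frequency.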