CHAPTER 1
INTRODUCTION
The APB is part of the AMBA 3 protocol family. It provides a low-cost interface
that is optimized for minimal power consumption and reduced interface complexity.
The APB interfaces to any peripherals that are low-bandwidth and do not require
the high performance of a pipelined bus interface.
The APB has an unpipelined protocol. All signal transitions are related only to the
rising edge of the clock, which makes it easy to integrate APB peripherals into any
design flow. Every transfer takes at least two cycles.
The APB can interface with the AMBA Advanced High-performance Bus Lite
(AHB-Lite) and AMBA Advanced Extensible Interface (AXI). You can use it to
provide access to the programmable control registers of peripheral devices.
1.2 OBJECTIVE
Most students of Electronics Engineering are exposed to Integrated
Circuits (ICs) at a very basic level, involving SSI (small scale integration) circuits like
logic gates or MSI (medium scale integration) circuits like multiplexers, parity encoders,
etc. But there is a much bigger world out there, involving miniaturization at levels so great
that a micrometer and a microsecond are literally considered huge! This is the world of
VLSI - Very Large Scale Integration. This article aims to introduce Electronics
Engineering students to the possibilities and the work involved in this field.
VLSI stands for "Very Large Scale Integration". This is the field which involves
packing more and more logic devices into smaller and smaller areas. Thanks to VLSI,
circuits that would once have filled entire boards can now be put into a space only a
few millimetres across! This has opened up a big opportunity to do things that were not
possible before. VLSI circuits are everywhere: your computer, your car, your brand
new state-of-the-art digital camera, your cell phone, and so on. All this
involves a lot of expertise on many fronts within the same field, which we will look at
in later sections. VLSI has been around for a long time and there is nothing new about it,
but as a side effect of advances in the world of computers, there has been a dramatic
proliferation of tools that can be used to design VLSI circuits. Alongside this, obeying
Moore's law, the capability of an IC has increased exponentially over the years in terms
of computation power, utilization of available area, and yield. The combined effect of these
two advances is that people can now put diverse functionality into ICs, opening up
new frontiers. Examples are embedded systems, where intelligent devices are put inside
everyday objects, and ubiquitous computing, where small computing devices proliferate
to such an extent that even the shoes you wear may actually do something useful, like
monitoring your heartbeat! These two fields are closely related, and describing them
could easily fill another article.
1. Circuit Delays. Large, complicated circuits running at very high frequencies have one
big problem to tackle: the delay in propagation of signals through gates
and wires, even for areas a few micrometers across. The operating speed is so high that,
as the delays add up, they can actually become comparable to the clock period.
2. Power. Another effect of high operating frequencies is increased power
consumption. This has a two-fold effect: devices drain batteries faster, and heat dissipation
increases. Coupled with the fact that surface areas have decreased, heat poses a major
threat to the stability of the circuit itself.
3. Layout. Laying out the circuit components is a task common to all branches of
electronics. What is so special in our case is that there are many possible ways to do this;
there can be multiple layers of different materials on the same silicon, there can be
different arrangements of the smaller parts for the same component, and so on.
The power dissipation and speed in a circuit present a trade-off; if we try to optimize
one, the other is affected. The choice between the two is determined by the way we
choose to lay out the circuit components. Layout can also affect the fabrication of VLSI
chips, making it either easy or difficult to implement the components on the silicon.
breaking established design rules to extract the last little bit of performance by
trading off stability).
What is VLSI?
• An integrated circuit (IC) may contain millions of transistors on a chip only a few mm
in size
• Applications wide ranging: most electronic logic devices
Advantages of ICs over discrete components
While we will concentrate on integrated circuits, the properties of integrated
circuits (what we can and cannot efficiently put in an integrated circuit) largely determine
the architecture of the complete system. Integrated circuits improve system
characteristics in several critical ways. ICs have three key advantages over
digital circuits built from discrete components:
Size. Integrated circuits are much smaller: both transistors and wires are
shrunk to micrometer sizes, compared to the millimeter or centimeter scales of
discrete components. Small size leads to advantages in speed and power consumption,
since smaller components have smaller parasitic resistances,
capacitances, and inductances.
Speed. Signals can be switched between logic 0 and logic 1
much faster within a chip than they can between chips. Communication
within a chip can occur thousands of times faster than communication between chips on a
printed circuit board. The high speed of on-chip circuits is due to their small
size: smaller components and wires have smaller parasitic capacitances to slow
down the signal.
Power consumption. Logic operations within a chip
also take much less power. Once again, lower power consumption is largely
due to the small size of circuits on the chip: smaller parasitic capacitances and
resistances require less power to drive them.
1.5.1 ASIC
An Application-Specific Integrated Circuit (ASIC) is an integrated circuit (IC)
customized for a particular use, rather than intended for general-purpose use.
For example, a chip designed solely to run a cell phone is an ASIC. Intermediate
between ASICs and industry standard integrated circuits, like the 7400 or the 4000 series,
are application specific standard products (ASSPs).
As feature sizes have shrunk and design tools have improved over the
years, the maximum complexity (and hence functionality) possible in an ASIC has
grown from 5,000 gates to over one hundred million. Modern ASICs often
include entire 32-bit processors, memory blocks including ROM,
RAM, EEPROM and Flash, and other large building blocks. Such an ASIC is often
termed a SoC (system-on-a-chip). Designers of digital ASICs use a hardware
description language (HDL), such as Verilog or VHDL, to describe the
functionality of ASICs.
Field-programmable gate arrays (FPGAs) are the modern-day
technology for building a breadboard or prototype from standard parts;
programmable logic blocks and programmable interconnects allow the
same FPGA to be used in many different applications. For smaller designs
and/or lower production volumes, FPGAs may be more cost effective than an ASIC
design even in production.
• An application-specific integrated circuit (ASIC) is an integrated circuit (IC) customized
for a particular use, rather than intended for general-purpose use.
• A structured ASIC falls between an FPGA and a standard cell-based ASIC
• Structured ASICs are used mainly for mid-volume level designs
• The design task for structured ASICs is to map the circuit into a fixed arrangement
of known cells.
Among different arithmetic blocks, the multiplier is one of the main blocks,
which is widely used in different applications especially signal processing applications.
There are two general architectures for the multipliers, which are sequential and
parallel. While sequential architectures are low power, their latency is very large. On
the other hand, parallel architectures (such as Wallace tree and Dadda) are fast while
having high-power consumptions. The parallel multipliers are used in high-
performance applications where their large power consumptions may create hot-spot
locations on the die. Since the power consumption and speed are critical parameters in
the design of digital circuits, the optimizations of these parameters for multipliers
become critically important. Very often, the optimization of one parameter is performed
considering a constraint for the other parameter. Specifically, achieving the desired
performance (speed) considering the limited power budget of portable systems is
a challenging task. In addition, having a given level of reliability may be another obstacle
in reaching the system target performance.
To meet the power and speed specifications, a variety of methods at different
design abstraction levels have been suggested. Approximate computing approaches are
based on achieving the target specifications at the cost of reducing the computation
accuracy. The approach may be used for applications where there is not a unique answer
and/or a set of answers near the accurate result can be considered acceptable. These
applications include multimedia processing, machine learning, signal processing, and
other error resilient computations. Approximate arithmetic units are mainly based on
the simplification of the arithmetic units circuits. There are many prior works focusing
on approximate multipliers which provide higher speeds and lower power
consumptions at the cost of lower accuracies. Almost all of the proposed approximate
multipliers are based on having a fixed level of accuracy during the runtime.
Runtime accuracy reconfigurability, however, is considered a useful feature for
providing different levels of quality of service during the system operation. Here, by
reducing the quality (accuracy), the delay and/or power consumption of the unit may
be reduced. In addition, some digital systems, such as general purpose processors, may
be utilized for both approximate and exact computation modes. An approach for
achieving this feature is to use an approximate unit along with a corresponding
correction unit. The correction unit, however, increases the delay, power, and area
overhead of the circuit. Also, the error correction procedure may require more than one
clock cycle, which could, in turn, slow down the processing further.
In this paper, we present four dual-quality reconfigurable approximate 4:2
compressors, which provide the ability of switching between the exact and approximate
operating modes during the runtime. The compressors may be utilized in the
architectures of dynamic quality configurable parallel multipliers. The basic structures
of the proposed compressors consist of two parts: approximate and supplementary.
In the approximate mode, only the approximate part is active whereas in the exact
operating mode, the supplementary part along with some components of the
approximate part is invoked.
using sophisticated compilers. Fault partitioning focuses on increasing the fault
coverage. Later, parallel ATPG was proposed for the on-chip multi-core
era, in which the ATPG components are distributed among multiple processing units on the same
chip. Subsequently, various parallel ATPG methods, such as circular pipeline parallel and
GPU-based ATPG, came into existence, but they have disadvantages: the same
test sets are regenerated each time, which limits the speed-up scalability.
In this work we propose a parallel test pattern generation methodology which
uses shared-memory multi-core systems and is geared towards high speed. To generate
the pseudo-random test patterns, a Low Power LFSR is used, which consumes less power
and has a lower delay. In the Low Power LFSR the patterns are in Gray code, so the switching
activity is lower than in a conventional LFSR.
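The link between Gray-coded patterns and low switching activity can be illustrated with a minimal sketch. This is not the Low Power LFSR itself, whose structure the text does not give. It assumes a hypothetical module, gray_pattern_gen, in which a plain binary counter stands in for the pattern sequencer and its value is converted to Gray code, so that successive patterns differ in exactly one bit.

```verilog
// Hypothetical illustration of a Gray-coded pattern source.
// Assumption: a binary counter stands in for the pattern sequencer;
// the real Low Power LFSR structure is not specified in the text.
module gray_pattern_gen #(parameter WIDTH = 4) (
    input  wire             clk,
    input  wire             rst,      // synchronous reset, active HIGH
    output wire [WIDTH-1:0] pattern   // Gray-coded test pattern
);
    reg [WIDTH-1:0] count;

    always @(posedge clk) begin
        if (rst)
            count <= {WIDTH{1'b0}};
        else
            count <= count + 1'b1;
    end

    // Binary-to-Gray conversion: g = b ^ (b >> 1).
    // Consecutive counter values map to codes that differ in one bit,
    // so only one pattern bit toggles per clock.
    assign pattern = count ^ (count >> 1);
endmodule
```

Because only one output bit changes per clock, the nets fed by the pattern toggle less often than with a pattern source whose successive values differ in many bits at once.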
Digital signal processing and analog signal processing are subfields of signal
processing. DSP applications include audio and speech processing, sonar, radar and
other sensor array processing, spectral density estimation, statistical signal processing,
digital image processing, signal processing for telecommunications, control systems,
biomedical engineering, seismology, among others.
The application of digital computation to signal processing allows for many advantages
over analog processing in many applications, such as error detection and correction in
transmission as well as data compression.[2] DSP is applicable to both streaming data
and static (stored) data.
CHAPTER 2
XILINX SOFTWARE
A PC running Windows 7, Windows 8, or Windows 10 can be used to install the
software used to implement the project. We implemented our project on
64-bit Windows 8 and Windows 10.
Xilinx ISE is a design environment for FPGA products from Xilinx. It is
tightly coupled to the architecture of such chips and cannot be used with FPGA
products from other vendors. The Xilinx ISE is primarily used for circuit synthesis and
design, while ISIM or the ModelSim logic simulator is used for system-level
testing. Other components shipped with the Xilinx ISE include the Embedded
Development Kit (EDK), a Software Development Kit (SDK) and ChipScope Pro.
Since 2012, Xilinx ISE has been discontinued in favor of Vivado Design Suite, which
serves the same roles as ISE with additional features for system on a chip development.
Xilinx released the last version of ISE in October 2013 (version 14.7), and states that
"ISE has moved into the sustaining phase of its product life cycle, and there are no more
planned ISE releases."
Verilog's concept of 'wire' consists of both signal values (4-state: "1, 0, floating,
undefined") and signal strengths (strong, weak, etc.). This system allows abstract
modeling of shared signal lines, where multiple sources drive a common net. When a
wire has multiple drivers, the wire's (readable) value is resolved by a function of the
source drivers and their strengths.
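As a small illustration of strength-based resolution, the sketch below places a strong driver and a weak driver on one net. The module and signal names are hypothetical; the drive-strength syntax on continuous assignments is standard Verilog.

```verilog
// Hypothetical demo: two drivers of different strength on one net.
module wire_resolution_demo;
    reg  a, b;
    wire shared;

    // A strong driver and a weak driver share the same net.
    assign (strong1, strong0) shared = a;
    assign (weak1,   weak0)   shared = b;

    initial begin
        // Both drivers active with opposite values:
        // the stronger driver determines the net value.
        a = 1'b1; b = 1'b0;
        #1 $display("a=%b b=%b shared=%b", a, b, shared);

        // The strong driver releases the net (high impedance):
        // the weak driver now determines the net value.
        a = 1'bz; b = 1'b0;
        #1 $display("a=%b b=%b shared=%b", a, b, shared);
    end
endmodule
```

If both assignments used the same strength and drove opposite values, the net would resolve to the undefined value x.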
2.3.1 Beginning
2.3.2 Verilog-95
With the increasing success of VHDL at the time, Cadence decided to make the
language available for open standardization. Cadence transferred Verilog into the
public domain under the Open Verilog International (OVI) (now known as Accellera)
organization. Verilog was later submitted to IEEE and became IEEE Standard 1364-
1995, commonly referred to as Verilog-95.
In the same time frame Cadence initiated the creation of Verilog-A to put
standards support behind its analog simulator Spectre. Verilog-A was never intended
to be a standalone language and is a subset of Verilog-AMS which encompassed
Verilog-95.
2.3.3 Verilog 2001
Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005)
consists of minor corrections, spec clarifications, and a few new language
features (such as the uwire keyword).
CHAPTER 3
BLOCK DIAGRAM AND ITS TRANSFER STATES
Keypad, Timer and PIO (Peripheral Input Output) devices are connected to the APB.
The bridge connects the high performance AHB or ASB bus to the APB bus[4]. So,
for the APB the bridge acts as the master and all the devices connected on the APB bus
act as slaves. The component on the high performance bus initiates the
transactions and transfers them to the peripherals connected on the APB. The bridge
is therefore used for all communication between the high performance bus and the
peripheral devices.
3.2 TRANSFERS
This section describes typical AMBA 3 APB write and read transfers, and the error
response. It contains the following sections:
• Write transfers
• Read transfers
• Error response
The write transfer starts with the address, write data, write signal and select signal all
changing after the rising edge of the clock. The first clock cycle of the transfer is called
the Setup phase. After the following clock edge the enable signal, PENABLE, is
asserted, and this indicates that the Access phase is taking place. The address, data
and control signals all remain valid throughout the Access phase. The transfer
completes at the end of this cycle.
The enable signal, PENABLE, is deasserted at the end of the transfer. The select signal,
PSELx, also goes LOW unless the transfer is to be followed immediately by another
transfer to the same peripheral.
It is recommended that the address and write signals are not changed immediately after
a transfer but remain stable until another access occurs. This reduces power
consumption.
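On the peripheral side, the write transfer described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete peripheral: the module name apb_reg_slave and the register ctrl_reg are hypothetical, there is a single 32-bit register that ignores PADDR, no wait states are inserted (PREADY is tied HIGH), and no error response is generated.

```verilog
// Hypothetical APB slave with one 32-bit register (sketch only).
// Assumptions: no wait states, no error response, PADDR not decoded.
module apb_reg_slave (
    input  wire        PCLK,
    input  wire        PRESETn,
    input  wire        PSEL,
    input  wire        PENABLE,
    input  wire        PWRITE,
    input  wire [31:0] PADDR,    // unused: only one register exists
    input  wire [31:0] PWDATA,
    output wire [31:0] PRDATA,
    output wire        PREADY,
    output wire        PSLVERR
);
    reg [31:0] ctrl_reg;

    // The write is committed in the Access phase, when PSEL, PENABLE
    // and PWRITE are all HIGH. PWDATA has been stable since Setup.
    always @(posedge PCLK or negedge PRESETn) begin
        if (!PRESETn)
            ctrl_reg <= 32'b0;
        else if (PSEL && PENABLE && PWRITE)
            ctrl_reg <= PWDATA;
    end

    assign PRDATA  = ctrl_reg;  // read data is the register value
    assign PREADY  = 1'b1;      // never extends a transfer
    assign PSLVERR = 1'b0;      // never signals an error
endmodule
```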
CHAPTER 4
OPERATING STATES
4.1 LIST OF APB SIGNALS
Signal | Source | Description
PCLK | Clock source | Clock. The rising edge of PCLK times all transfers on the APB.
PRESETn | System bus equivalent | Reset. The APB reset signal is active LOW. This signal is normally connected directly to the system bus reset signal.
PADDR | APB bridge | Address. This is the APB address bus. It can be up to 32 bits wide and is driven by the peripheral bus bridge unit.
PSELx | APB bridge | Select. The APB bridge unit generates this signal to each peripheral bus slave. It indicates that the slave device is selected and that a data transfer is required. There is a PSELx signal for each slave.
PENABLE | APB bridge | Enable. This signal indicates the second and subsequent cycles of an APB transfer.
PWRITE | APB bridge | Direction. This signal indicates an APB write access when HIGH and an APB read access when LOW.
PWDATA | APB bridge | Write data. This bus is driven by the peripheral bus bridge unit during write cycles when PWRITE is HIGH. This bus can be up to 32 bits wide.
PREADY | Slave interface | Ready. The slave uses this signal to extend an APB transfer.
PRDATA | Slave interface | Read data. The selected slave drives this bus during read cycles when PWRITE is LOW. This bus can be up to 32 bits wide.
PSLVERR | Slave interface | This signal indicates a transfer failure. APB peripherals are not required to support the PSLVERR pin. This is true for both existing and new APB peripheral designs. Where a peripheral does not include this pin, the appropriate input to the APB bridge is tied LOW.
4.2 OPERATING STATES
The basic state machine represents the operation of the peripheral bus. There are
three states, namely IDLE, SETUP and ACCESS.
IDLE state is the default state in which no operation is being performed. The
assertion of the PSEL signal indicates the beginning of the SETUP phase. The bus
enters into the SETUP phase when the data transfer is required. The PWRITE, PADDR
and PWDATA are also provided during this phase. The bus remains in the SETUP
phase for one clock cycle and on the next rising edge of the clock, the bus will move to
the ACCESS state.
The assertion of the PENABLE signal indicates the start of the ACCESS phase.
All the control signals, address, and data signals remain stable during the transition
from the SETUP phase to the ACCESS phase. In a read operation, PRDATA
is present on the bus during this phase. The PENABLE signal remains HIGH for one clock
cycle, unless the slave extends the transfer by holding PREADY LOW. If no further data
transfer is required, the bus moves to the IDLE state. But if
further data transfer is required, the bus moves to the SETUP phase.
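The three states described above can be sketched as a small state machine on the bridge side. This is an illustrative sketch rather than the project's design: the module name apb_fsm and the request input transfer are hypothetical, and the wait-state branch on PREADY follows the AMBA 3 behaviour in which a slave may extend the ACCESS phase.

```verilog
// Sketch of the IDLE / SETUP / ACCESS state machine (bridge side).
// Assumption: 'transfer' is a hypothetical request from the
// high-performance bus side; PREADY comes from the selected slave.
module apb_fsm (
    input  wire PCLK,
    input  wire PRESETn,
    input  wire transfer,
    input  wire PREADY,
    output wire PSEL,
    output wire PENABLE
);
    localparam [1:0] IDLE = 2'b00, SETUP = 2'b01, ACCESS = 2'b10;

    reg [1:0] state, next_state;

    // State register with asynchronous active-LOW reset.
    always @(posedge PCLK or negedge PRESETn) begin
        if (!PRESETn) state <= IDLE;
        else          state <= next_state;
    end

    // Next-state logic.
    always @(*) begin
        case (state)
            IDLE:    next_state = transfer ? SETUP : IDLE;
            SETUP:   next_state = ACCESS;       // SETUP lasts one cycle
            ACCESS: begin
                if (!PREADY)       next_state = ACCESS; // wait state
                else if (transfer) next_state = SETUP;  // back-to-back
                else               next_state = IDLE;
            end
            default: next_state = IDLE;
        endcase
    end

    // PSEL is HIGH in SETUP and ACCESS; PENABLE only in ACCESS.
    assign PSEL    = (state == SETUP) || (state == ACCESS);
    assign PENABLE = (state == ACCESS);
endmodule
```

With PREADY held HIGH, every transfer takes exactly two cycles, one in SETUP and one in ACCESS, which matches the minimum transfer length stated in the introduction.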
During the write transfer operation, the PSEL, PWRITE, PADDR and
PWDATA signals are asserted at the T1 clock edge which is called the SETUP cycle.
At the next rising edge of the clock T2, the PENABLE signal and PREADY signal are
asserted. This is called the ACCESS cycle. At the clock edge T3, the PENABLE signal is
deasserted and, if further data transfer is required, a high to low transition occurs on the
PREADY signal.
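The T1, T2 and T3 sequence of a write can be reproduced with a short stimulus-only testbench. This is a sketch under stated assumptions: the module name, the address 32'h10 and the data 32'hAB are hypothetical values chosen for illustration, and PREADY is omitted because it is driven by the slave, which is not instantiated here.

```verilog
// Stimulus-only sketch of the write timing at T1, T2 and T3.
// Assumptions: address and data values are arbitrary examples;
// no slave is attached, so PREADY is not modelled.
module tb_apb_write_timing;
    reg        PCLK = 1'b0;
    reg        PSEL, PENABLE, PWRITE;
    reg [31:0] PADDR, PWDATA;

    always #5 PCLK = ~PCLK;   // 10 time-unit clock period

    initial begin
        // IDLE: nothing selected.
        PSEL = 0; PENABLE = 0; PWRITE = 0; PADDR = 0; PWDATA = 0;

        @(posedge PCLK);      // T1: SETUP cycle begins
        PSEL   <= 1'b1;
        PWRITE <= 1'b1;
        PADDR  <= 32'h10;
        PWDATA <= 32'hAB;

        @(posedge PCLK);      // T2: ACCESS cycle begins
        PENABLE <= 1'b1;

        @(posedge PCLK);      // T3: transfer ends
        PENABLE <= 1'b0;
        PSEL    <= 1'b0;

        #10 $finish;
    end
endmodule
```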
CHAPTER 5
SOFTWARE REQUIREMENTS
5.1 XILINX ISE
XILINX 14.7
Xilinx software is required by both VHDL and Verilog designers to perform
the synthesis operation. Any simulated code can be synthesized and configured on an FPGA. The
process of converting HDL code into a gate level netlist is called synthesis. It is the main
part of current design flows.
Select the Verilog source and specify the required inputs, outputs and buffers; a
window is then displayed in which the Verilog code is written and synthesized.
Figure 5.3 shows how to create a new source: open the Project menu and
then select New Source. The new source is created according to the given
conditions and requirements.
• Select source type as Verilog module
Figure 5.4 shows the source type selection: select Verilog
module as the source type and enter the file name; the location of the file on the drive is shown.
Select Add to Project and click Next.
Verilog synthesis tools can create logic circuit structures directly from a Verilog
behavioral description and target them to a selected technology for realization (i.e.,
convert Verilog to actual hardware).
Using Verilog, design, simulation and synthesis can be performed for anything from a simple
combinational circuit to a complete microprocessor on a chip.
Verilog HDL is a standard hardware description language with many
useful features for hardware design.
Verilog HDL is a general purpose hardware description language that is easy to learn and
use. Its syntax is similar to that of the C programming language, so designers
with experience in C programming can learn Verilog HDL easily.
It allows different levels of abstraction to be mixed in the same model. Hence, a designer
can define a hardware model in terms of switches, gates, RTL, or behavioral code. Also, a designer
needs to learn only one language for stimulus and hierarchical design.
Most popular logic synthesis tools support Verilog HDL. This makes it a natural
language of choice for designers.
The Programming Language Interface (PLI) is a feature that allows C code to be written to
interact with Verilog data structures.
5.1.4 SYNTHESIS
Synthesis is the process of building a gate level netlist from a register-transfer level
model of a circuit described in Verilog HDL. As an intermediate step, a synthesis system may
produce a netlist comprising register-transfer level blocks such as flip-flops, arithmetic and
logic units, and multiplexers, interconnected by wires. In this case, a second program,
called the RTL module builder, is necessary. Its purpose is to build, or acquire from a
library of predefined components, each of the required RTL blocks in the user-specified
target technology.
The synthesis results report parameters
such as power consumption, delay, area and memory usage. The RTL view
gives an overview of the project in the form of a schematic figure.
Having produced a gate level netlist, a logic optimizer reads the netlist and optimizes
the circuit for the specified area and timing constraints. These constraints may
also be used by the module builder for appropriate selection or generation of RTL
blocks. Here, we assume that the target netlist is at the gate level, built from
logic gates.
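As a small illustration of the RTL blocks mentioned above, the sketch below is the kind of description from which a synthesis tool would typically infer a 2:1 multiplexer feeding a bank of flip-flops. The module name mux_reg and its ports are hypothetical, and the exact cells chosen depend on the tool and target technology.

```verilog
// Hypothetical RTL example: a registered 2:1 multiplexer.
// A synthesis tool would typically map this to an 8-bit 2:1 mux
// followed by eight D flip-flops clocked by clk.
module mux_reg (
    input  wire       clk,
    input  wire       sel,
    input  wire [7:0] a,
    input  wire [7:0] b,
    output reg  [7:0] q
);
    always @(posedge clk)
        q <= sel ? a : b;   // select a when sel is HIGH, else b
endmodule
```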
CHAPTER 6
SIMULATION & SYNTHESIS RESULTS
CONCLUSION
This paper gives an overview of the AMBA bus architecture and discusses the APB bus
in detail. The APB bus is designed using Verilog HDL according to the specification
and is verified using the Universal Verification Methodology (UVM). The simulation results show
that the data read from a particular memory location is the same as the data written to
that memory location. Hence, the design is functionally correct. The UVM report
summary also confirms the functional correctness of the design.
An electronic system level (ESL) model of the same design will be created in the future, since
increasing design complexity makes ESL a requirement for future designs. The ESL
model of the APB design will be created using SystemC. The design will then be
verified using a UVM testbench. The results obtained from that simulation will be
compared with the results obtained in this paper.
REFERENCES
[1] ARM, “AMBA Specification Overview,” http://www.arm.com/.
[3] Akhilesh Kumar, Richa Sinha, “Design and Verification analysis of APB3
Protocol with Coverage,” IJAET, Nov 2011.
[6] Samir Palnitkar, “Verilog HDL: A guide to Digital Design and Synthesis (2nd
Edition),” Pearson, 2008.
[7] Chris Spear, “SystemVerilog for verification (2nd Edition): A guide to learning
the testbench features,” Springer, 2008.
[8] URL: http://www.testbench.com.
[10] Vanessa R. Cooper, “Getting Started with UVM: A Beginner’s Guide,” Verilab,
2013.
APPENDIX
READ TRANSFER MODULE:
module read_transfer(clk,addr,write,sel,enable,wdata,rdata,ready);
  input clk;
  input [7:0] addr;
  input write;
  input sel;
  input enable;
  input [7:0] wdata;
  output reg [7:0] rdata;
  output reg ready;

  reg [7:0] ram [0:255];   // 256 x 8-bit memory

  // Nonblocking assignments are used because this is clocked logic.
  always @(posedge clk)
  begin
    if (write && !sel)
    begin
      // Store wdata at addr and clear the outputs.
      rdata <= 8'b0;
      ready <= 1'b0;
      ram[addr] <= wdata;
    end
    else if (!write && sel)
      rdata <= ram[addr];   // Read the addressed location.
    else if (enable)
    begin
      ready <= 1'b1;
    end
  end
endmodule
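module tb_rd_trtansfer;
  // Inputs
  reg clk;
  reg [7:0] addr;
  reg write;
  reg sel;
  reg enable;
  reg [7:0] wdata;
  // Outputs
  wire [7:0] rdata;
  wire ready;

  // Instantiate the module under test.
  read_transfer uut (.clk(clk), .addr(addr), .write(write), .sel(sel),
                     .enable(enable), .wdata(wdata), .rdata(rdata), .ready(ready));

  // Clock with a period of 100 time units.
  always
    #50 clk = ~clk;

  initial begin
    // Initialize Inputs: store 8'hab at address 8'h45.
    clk = 0;
    addr = 8'h45;
    write = 1;
    sel = 0;
    enable = 0;
    wdata = 8'hab;
    #100
    enable = 1;
    // Read the stored value back from the same address.
    #100
    write = 0;
    sel = 1;
    #100
    $finish;
  end
endmodule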
    begin
      wdata <= 8'b0;
      ready <= 1'b0;
      ram[addr] <= data_in;
    end
    else if (write && sel)
      wdata <= ram[addr];
    else if (enable)
      ready <= 1'b1;
  end
module tb_wr_trtansfer;
  // Inputs
  reg clk;
  reg [7:0] addr;
  reg write;
  reg sel;
  reg enable;
  reg [7:0] data_in;
  // Outputs
  wire [7:0] wdata;
  wire ready;

  always
    #50 clk = ~clk;

  initial begin
    // Initialize Inputs
    clk = 0;
    addr = 8'h45;
    write = 0;
    sel = 0;
    enable = 0;
    data_in = 8'hab;
    write = 1;
    sel = 1;
    #100
    enable = 1;
  end
endmodule