
Digital Circuit Design 2

10636321
Dr. Ashraf Armoush

© 2023 Dr. Ashraf Armoush

Field Programmable Gate Array (FPGA)



Outline
• PLD vs. ASIC
• FPGA
• FPGA Structure
• Programming Technologies
• FPGA Architecture
• Embedded RAM, Multipliers, Adders and MACs
• Embedded Processor Cores
• Clock Trees and Clock Managers
• Programming (Configuring) an FPGA
• JTAG Port
© 2023 Dr. Ashraf Armoush , An-Najah National University 3

Programmable Logic Devices (PLDs)

 Highly configurable.
 Fast design and modification times.
 Couldn’t support large or complex functions.



ASIC: Application-Specific Integrated Circuit
An ASIC is an integrated circuit designed specifically for one particular
purpose or application.
 This also implies that an ASIC is built for one and only one customer
(e.g. an IC designed for a specific line of mobile phones from one
company).

 Support extremely large and complex functions.


 Painfully expensive and time-consuming to design.


The Gap between PLDs and ASICs

• To bridge the gap between PLDs and ASICs, Xilinx developed a new class
of IC called the field-programmable gate array (FPGA). [1985]
 An FPGA can be customized in the field, like a PLD.
 An FPGA can contain millions of logic gates and implement extremely
complex functions that previously could be realized only using ASICs.
 The cost of an FPGA design is much lower than that of an ASIC.
 Implementing design changes is much easier in FPGAs, and the time to
market for such designs is much shorter.



Technology Timeline


FPGA
• An FPGA is a digital integrated circuit (IC) that contains configurable
blocks of logic along with configurable interconnects between these
blocks.
– “Field programmable” refers to the fact that its programming takes
place “in the field” (as opposed to devices whose internal functionality is
hardwired by the manufacturer).

– In-system programmable (ISP): a device is ISP if it can be programmed
while remaining resident in a higher-level system.

– FPGAs can be programmed at a higher level using various hardware
description languages (HDLs).

– The translation to gate level is done automatically by tools.



FPGA Applications
• Today’s FPGAs contain millions of gates and can be used in many
applications, such as digital signal processing, software-defined radio,
aerospace and defense systems, ASIC prototyping, medical imaging, computer
vision, speech recognition, cryptography, computer hardware emulation, radio
astronomy, and metal detection, as well as a growing range of other areas.

• In general, FPGA applications can be grouped into five major segments:

1. ASIC and custom silicon: FPGAs implement a variety of designs that could
previously be realized only using ASICs and custom silicon.
2. Digital signal processing (DSP): Today’s FPGAs can contain embedded
multipliers, dedicated arithmetic routing, and large amounts of on-chip RAM
to facilitate high-speed DSP.
3. Embedded microcontrollers: FPGAs are becoming attractive for embedded
control applications due to their falling price and the ability to
implement a soft processor core.
4. Physical layer communication: FPGAs implement the interfaces between the
physical layer communication chip and higher-level network protocol layers.
5. Reconfigurable computing: The inherent parallelism and reconfigurability
of FPGAs make it possible to change the data path itself substantially, in
addition to the control flow, during runtime.

FPGA Structure
• The most common architecture consists of:
1. Configurable Logic Block (CLB) or Logic Array Block (LAB)
2. Configurable I/O Block (IOB)
3. Programmable Interconnect



General Structure of an FPGA


Programmable (Configurable) Logic Block


• A simple programmable logic block consists of:
1. Lookup table (LUT): By means of SRAM programming cells, every logic block can
be configured to perform a different function.
2. A register that can act as a flip-flop or a latch: If the flip-flop option is selected,
the register can be configured to be triggered by a positive- or negative-going clock
(the clock is common to all of the logic blocks).
3. Multiplexer: Can be configured to pass either the output from the LUT or a separate
input to the register.

• The logic blocks in modern FPGAs can be significantly more complex.

• Each FPGA contains a large number of programmable logic blocks.
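
The three parts of the simple logic block listed above can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions: the class name `LogicBlock`, the `use_lut` configuration bit, and the `bypass` input are hypothetical names chosen for illustration, and the register is modelled as an edge-triggered flip-flop only.

```python
class LogicBlock:
    """Toy model of a simple programmable logic block: LUT -> MUX -> register."""

    def __init__(self, lut_bits, use_lut=True):
        self.lut_bits = list(lut_bits)  # configuration cells holding the LUT contents
        self.use_lut = use_lut          # configuration bit steering the multiplexer
        self.q = 0                      # register output

    def clock(self, inputs, bypass=0):
        """Model one active clock edge and return the new register output."""
        # The LUT inputs form an address into the configuration cells.
        index = 0
        for bit in inputs:
            index = (index << 1) | bit
        lut_out = self.lut_bits[index]
        # The multiplexer passes either the LUT output or the separate input.
        d = lut_out if self.use_lut else bypass
        self.q = d
        return self.q


# Configure the block as a registered 2-input AND gate.
and_block = LogicBlock([0, 0, 0, 1])
```

Calling `and_block.clock((1, 1))` latches a 1, while any other input pair latches a 0. Changing only the four configuration bits turns the same block into a different gate.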
Lookup Tables (LUTs)
• Assume that a LUT is required to perform the function y = (ab) + c’,
i.e. y = (a AND b) OR (NOT c).

Simplified LUT
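
As a rough illustration of how the function y = (ab) + c’ ends up in a LUT, the sketch below fills the eight cells of a 3-input LUT and then reads them back using the inputs as an address. The helper names `build_lut` and `lut_read` are hypothetical, and treating input a as the most significant address bit is an assumption made for this example.

```python
from itertools import product


def build_lut(func, n_inputs):
    """Return the 2**n_inputs cell values a LUT must hold to implement func."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]


def lut_read(cells, *inputs):
    """Use the inputs (first input = most significant bit) as the cell address."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return cells[index]


# y = (a AND b) OR (NOT c)
cells = build_lut(lambda a, b, c: (a & b) | (1 - c), 3)
```

The LUT never evaluates the logic expression at run time. It only returns the stored cell selected by a, b, and c, which is why any 3-input function costs the same eight cells.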

• Note: By means of its own SRAM cells, the interconnect can be programmed
such that the primary inputs are connected to the inputs of one or more
CLBs, and the outputs from any logic block can be used to drive the inputs
to any other logic block, the primary outputs from the device, or both.


SRAM-Based Devices
• The majority of FPGAs are based on SRAM configuration cells, which
can be configured over and over again.
• Advantages:
 New designs can be quickly implemented and tested.
 The FPGA can initially be programmed to perform a test (e.g. during
start-up) and then be reprogrammed to perform its main task.
 The SRAM cells are created using exactly the same CMOS technology as
the rest of the device (no special processing steps are required).

• Disadvantages:
 SRAM-based devices have to be reconfigured every time the system is
powered up (this requires a special external memory device).
 Security: It can be difficult to protect your intellectual property (IP),
because the configuration file is stored in some form of external memory.

 Some of today’s SRAM-based FPGAs support bitstream encryption, where
the final configuration data is encrypted before being stored in the
external memory.



Antifuse-Based Devices
• Unlike SRAM-based devices, which are programmed while resident in the
system, antifuse-based devices are programmed off-line using a special
device programmer.
• Advantages:
 Nonvolatile, which means that they are available as soon as power is
applied, without the need for external memory.
 Their interconnect structure is naturally “rad hard”, which means that
they are relatively immune to the effects of radiation (suitable for
military and aerospace applications).
 Lower power consumption than SRAM-based devices; they are also claimed
to be faster, although this is open to question.

• Disadvantages:
 The main disadvantage of antifuse-based devices is that they are OTP
(one-time programmable). This makes them a poor choice for use in a
development or prototyping environment.


EEPROM/Flash-Based Devices


• Once programmed, the data they contain is nonvolatile.
• Can be programmed off-line.
• Some versions are in-system programmable, but their programming time is
about 3 times that of an SRAM-based component.
• Protection:
– Some of these devices use a multibit key, which can range from 50 bits
to several hundred bits.
– Once you have programmed the device, you can load your user-defined key
to secure its configuration data.
– After the key has been loaded, the only way to read the data out of the
device, or to write new data, is to load a copy of your key via the JTAG port.
– At the current speed of the JTAG port (about 20 MHz), it would take billions
of years to crack a key at the longer end of this range by exhaustively trying
every possible value.
• Disadvantages:
– Require around 5 additional process steps on top of the standard CMOS
technology, which results in their lagging behind SRAM-based devices by
one generation.
– Have relatively high static power consumption.
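
The exhaustive-search estimate for the multibit key can be checked with a back-of-the-envelope calculation. The sketch below assumes a simplified model that is not stated in the slides: each attempt costs only the time needed to shift the key bits in serially, one bit per JTAG clock, and the attacker must try every possible value. The function name `brute_force_years` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(key_bits, jtag_hz=20e6):
    """Worst-case years to try every key, assuming one key bit shifted per clock."""
    seconds_per_attempt = key_bits / jtag_hz
    return (2 ** key_bits) * seconds_per_attempt / SECONDS_PER_YEAR


for bits in (50, 64, 128):
    print(f"{bits}-bit key: about {brute_force_years(bits):.3g} years")
```

Under this model a 50-bit key would fall in under a century, whereas a 128-bit key exceeds a billion years by many orders of magnitude. The "billions of years" figure therefore applies to the longer keys.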
Summary of programming technologies
Feature                      SRAM                    Antifuse                        E2PROM/Flash
---------------------------  ----------------------  ------------------------------  ------------------------------
Technology node              State-of-the-art        One or more generations behind  One or more generations behind
Reprogrammable               Yes (in-system)         No                              Yes (in-system or offline)
Reprogramming speed          Fast                    ---                             3x slower than SRAM
Volatile                     Yes                     No                              No
External configuration file  Yes                     No                              No
Good for prototyping         Yes (very good)         No                              Yes (reasonable)
Instant-on                   No                      Yes                             Yes
IP security                  Acceptable              Very good                       Very good
Size of configuration cell   Large (6 transistors)   Very small                      Medium-small (2 transistors)
Power consumption            Medium                  Low                             Medium
Rad hard                     No                      Yes                             Not really


Fine-, Medium-, Coarse-grained Architecture


• Based on the size of the logic blocks, it is common to categorize FPGA offerings
as being either:
 Fine-grained: Each logic block can be used to implement only a very simple
function (AND, OR, flip-flop, etc.).

 Coarse-grained: Each block contains a relatively large amount of logic
compared to the fine-grained architecture. (For example, a logic block might
contain four 4-input LUTs, four MUXs, four D flip-flops, and some fast carry logic.)

 A number of companies have recently started developing really coarse-grained
device architectures comprising arrays of nodes, where each node is a highly
complex processing element ranging from an algorithmic function to a
complete general-purpose microprocessor core. [Are these devices classed as
FPGAs?]
 Medium-grained: LUT-based FPGAs are now often classed as medium-grained
to leave the coarse-grained label free to be applied to these new node-based
devices.
MUX-based Logic Blocks
• The device can be programmed such that each input to the block is
presented with a logic 0, a logic 1, or the true or inverse of a signal
coming from another block or from a primary input to the device.
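
To make the MUX-based idea concrete, the sketch below builds one example function, y = (a AND b) OR c, from nothing but 2:1 multiplexers whose data inputs are tied to a logic 0, a logic 1, or another signal. The function is chosen here purely for illustration, and the names `mux2` and `mux_block` are hypothetical.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: return d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def mux_block(a, b, c):
    """y = (a AND b) OR c built only from 2:1 multiplexers and constants."""
    a_and_b = mux2(b, 0, a)      # b = 0 -> logic 0, b = 1 -> a
    return mux2(c, a_and_b, 1)   # c = 1 -> logic 1, c = 0 -> a AND b
```

Programming the device amounts to choosing what each multiplexer input is tied to. The multiplexers themselves are fixed.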


LUT-based Logic Blocks


• A group of input signals is used as an index to a lookup table.
• The contents of this table are arranged such that each cell contains the
output value for one combination of the input signals.
• The LUT is formed from SRAM cells (but it could also be formed using
antifuse, EEPROM, or Flash cells).



MUX-based vs. LUT-based
• When engineers handcrafted their circuits, prior to the advent of
today’s sophisticated CAD tools, it was reportedly possible to achieve
the best results using MUX-based architectures.

• During the 1990s, FPGAs were widely used in the telecommunications and
networking markets. Both areas involve pushing lots of data around, in
which case LUT-based architectures hold the high ground.

• As designs grew larger and synthesis technology increased in
sophistication, handcrafting circuits became a thing of the past.

 The end result is that the majority of today’s FPGA architectures
are LUT-based.


3-, 4-, 5-, or 6-input LUTs?


 Adding more inputs allows you to represent more complex
functions.
 Every time you add an input, you double the number of SRAM cells.

 The LUT size affects both the area (more inputs mean more wires) and
the speed, and therefore the performance of the FPGA.

 In the past, some devices were created using a mixture of different
LUT sizes (e.g. 3-input and 4-input LUTs).
 Many studies have been conducted on the effect of LUT size.
 Historically, the really successful architectures were based on
4-input LUTs; more recent device families also use larger (e.g. 6-input) LUTs.
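
The doubling rule mentioned above follows from the fact that an n-input LUT needs one SRAM cell per input combination, i.e. 2^n cells. A minimal sketch (the function name `lut_cells` is hypothetical):

```python
def lut_cells(n_inputs):
    """Number of SRAM cells needed by an n-input LUT (one per input combination)."""
    return 2 ** n_inputs


for n in (3, 4, 5, 6):
    print(f"{n}-input LUT: {lut_cells(n)} SRAM cells")
```

Going from a 4-input to a 6-input LUT therefore quadruples the storage per LUT, which is one side of the area versus capability trade-off.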
LUT vs. distributed RAM vs. Shift Register (SR)
The SRAM cells inside a LUT offer a number of interesting possibilities:

1. Their primary role is to act as a lookup table (LUT).

2. Some vendors allow the cells to be used as a small block of RAM
(16 × 1 RAM for a 4-input LUT). This is referred to as distributed RAM.

3. All of the FPGA’s configuration cells (including those forming the LUTs)
are effectively strung together in a long chain. Therefore, some vendors
allow the SRAM cells forming a LUT to be treated independently of the main
body of the chain and to be used as a shift register (SR).
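
The three roles of the same 16 cells can be sketched as three methods on one small object. This is a behavioural illustration only: the class name `Lut16` and its method names are hypothetical, and real devices differ in details such as which end of the chain receives the new bit.

```python
class Lut16:
    """Toy model of the 16 SRAM cells behind a 4-input LUT."""

    def __init__(self, cells=None):
        self.cells = list(cells) if cells is not None else [0] * 16

    def read(self, addr):
        """LUT / RAM read: the four inputs act as a 4-bit address."""
        return self.cells[addr & 0xF]

    def write(self, addr, bit):
        """Distributed-RAM mode (16 x 1): overwrite one cell at run time."""
        self.cells[addr & 0xF] = bit & 1

    def shift_in(self, bit):
        """Shift-register mode: push a bit in at cell 0, drop the bit in cell 15."""
        shifted_out = self.cells[-1]
        self.cells = [bit & 1] + self.cells[:-1]
        return shifted_out
```

In LUT mode only `read` is used after configuration. Distributed RAM adds `write`, and shift-register mode adds `shift_in`, all on the same storage.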

A Xilinx Logic Cell


• Each vendor has its own names for things.
• The core building block in a modern FPGA from Xilinx is called a logic
cell (LC).
• A logic cell (LC) contains:

1. A 4-input LUT (which can also act as a 16 × 1 RAM or a 16-bit shift register)
2. A multiplexer
3. A register (which acts as a flip-flop or as a latch)
4. Some special fast carry logic for use in arithmetic

• The equivalent core building block in an FPGA from Altera is called a
logic element (LE).
• There are a number of differences between a Xilinx LC and an Altera
LE, but the overall concepts are very similar.
Slicing
• The next step up the hierarchy is what Xilinx calls a slice.

• Altera and the other vendors have their own equivalent names.

• A slice contains two logic cells:

- Each logic cell has its own data inputs and outputs.
- The slice has one set of clock, clock enable, and set/reset
signals common to both logic cells.


CLB and LAB


• Moving one more level up the hierarchy, we come to the
configurable block.
• Xilinx calls such a block a configurable logic block (CLB), and Altera
refers to it as a logic array block (LAB).
• Some Xilinx FPGAs have two slices in each CLB, while others have four.



Embedded RAMs
• A lot of applications require the use of memory.
• FPGAs now include relatively large chunks of embedded RAM called
e-RAM or block RAM.
• Depending on the architecture of the component, these blocks might
be:
1. Positioned around the periphery of the device.
2. Scattered across the face of the chip in relative isolation.
3. Organized in columns.

• Each block of RAM can be used independently, or multiple blocks
can be combined to construct larger blocks.


Embedded Multipliers, Adders, MACs, etc.


• Some functions, like multipliers, are inherently slow if they are
implemented by connecting a large number of logic blocks together.
• Since these functions are required by a lot of applications, many FPGAs
incorporate special hardwired multiplier blocks, typically located near
the e-RAM.
• Similarly, some FPGAs offer dedicated adder blocks.
• One common operation in DSP applications is called multiply-and-accumulate
(MAC). This function multiplies two numbers and adds the result to a
running total stored in an accumulator.
• Some FPGAs provide entire MACs as embedded functions.
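
The MAC operation described above is simple enough to write down directly. The sketch below models a single MAC step and then uses it to compute a dot product, which is one typical DSP use. The function names `mac` and `dot_product` are hypothetical, and the model ignores word widths and overflow.

```python
def mac(acc, a, b):
    """One multiply-and-accumulate step: add a * b to the running total."""
    return acc + a * b


def dot_product(xs, ys):
    """Dot product computed as a chain of MAC steps, one per pair of samples."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc
```

For example, `dot_product([1, 2, 3], [4, 5, 6])` accumulates 4, then 14, then 32. An embedded MAC block performs each of these steps in dedicated hardware rather than in general-purpose logic blocks.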
Embedded Processor Cores (Hard and Soft)
• Any portion of an electronic design can be realized in:
– Hardware (using logic gates, registers, etc.)
or
– Software (as instructions to be executed on a microprocessor)

• One of the main partitioning criteria is how fast you wish the functions to
perform their tasks:
– Picosecond and nanosecond logic [has to run insanely fast]: hardware.
– Microsecond logic [reasonably fast]: either hardware or software.
– Millisecond logic [it is a pain slowing the hardware down to implement such
slow functions (e.g. using huge counters to generate delays)]: software.

• In the past, discrete microprocessors on the circuit board were used to
execute the software for the required functions.
• Some FPGAs are now available that contain one or more embedded
microprocessors [called microprocessor cores]. This:
 Saves the cost of having two devices.
 Eliminates a large number of tracks, pads, and pins (which makes the board smaller).


Hard Microprocessor Cores


• A hard microprocessor core is implemented as a dedicated predefined block.
• Two approaches:
1. The first is to locate it in a strip to the side of the main FPGA fabric.
2. The second is to embed one or more cores directly into the main FPGA fabric.




Soft Microprocessor Cores (Soft Cores)


• It is possible to configure a group of programmable logic
blocks to act as a microprocessor.

• Soft cores are simpler and slower than their hard-core
counterparts. [A soft core typically runs at 30% to 50% of the speed of a hard core.]

• You need to implement a core only if you need it, and you can
instantiate as many cores as you require until you run out
of resources (programmable logic blocks).



Clock Trees
• All of the synchronous elements inside an FPGA (e.g. flip-flops) need to be
driven by a clock signal.
• Such a clock signal typically comes into the FPGA via a special input pin.
• It is routed through the device and connected to the appropriate registers
via a tree-like structure called a clock tree.
• This structure is used to ensure that all of the flip-flops see their clock
edges as close together in time as possible.
• The clock tree is implemented using special tracks and is separate from
the general-purpose interconnect.
• In reality, multiple clock pins are available (unused clock pins can be
employed as general-purpose I/O pins), and there are multiple clock
trees inside the device.

Clock Managers
• The input clock pin can be used to drive a special hardwired function
(block) called a clock manager, which generates a number of daughter clocks.
• The daughter clocks may be used to drive internal clock trees or external
output pins that can provide clocking services to other devices.
• Each family of FPGAs has its own type of clock manager.



Programming (Configuring) an FPGA
• Each FPGA vendor has its own unique terminology and its own
technology and protocols for doing things.

• Moreover, the detailed mechanisms for programming FPGAs
can vary on a family-by-family basis.

• The end result of all techniques is a configuration file.

• Configuration file (bit file): contains the information
(configuration bitstream) that will be uploaded into the FPGA
in order to program it to perform a specific function.


Programming an FPGA (cont)


• In SRAM-Based FPGAs:
– We can visualize all of the SRAM configuration cells as forming a single (long)
shift register.
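
Viewing the configuration cells as one long shift register suggests a simple mental model of programming: the bitstream is clocked in one bit at a time until every cell holds its value. The sketch below illustrates only that model. The function name `load_bitstream` is hypothetical, and real devices add headers, framing, and checks that are not shown.

```python
def load_bitstream(chain_length, bitstream):
    """Shift a bitstream serially into a chain of configuration cells."""
    cells = [0] * chain_length
    for bit in bitstream:
        # Each clock, every cell takes the value of its upstream neighbour
        # and the new bit enters at the head of the chain.
        cells = [bit] + cells[:-1]
    return cells
```

Because the first bit shifted in travels the furthest, a bitstream of exactly `chain_length` bits ends up stored in reverse order along the chain.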



JTAG Port
• JTAG: Joint Test Action Group.
• Like many other modern devices, today’s FPGAs are equipped with a
JTAG port to overcome test and programming challenges.
• JTAG was originally designed to implement the boundary scan
technique for testing circuit boards and ICs.
• Boundary scan is also widely used as a debugging method to
watch integrated circuit pin states, measure voltage, or analyze
sub-blocks inside an integrated circuit.

• An FPGA has a number of pins that are used as a JTAG port. One of
these pins is used to input JTAG data, and another is used to output
that data.

• Each of the remaining I/O pins has an associated JTAG register (a
flip-flop), and these registers are daisy-chained together.
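
The daisy-chained JTAG registers behave like a shift register running around the edge of the chip. The sketch below models shifting bits in at the JTAG data input while observing what falls out at the data output. The function name `boundary_scan_shift` is hypothetical, the identifiers `tdi_bits` and `tdo_bits` borrow the usual JTAG pin names TDI and TDO, which the slides do not use, and the JTAG state machine and control pins are deliberately left out.

```python
def boundary_scan_shift(chain, tdi_bits):
    """Shift bits through a daisy chain of boundary-scan registers.

    chain[0] is the register nearest the data input; chain[-1] feeds the
    data output. Returns (new_chain, tdo_bits).
    """
    chain = list(chain)
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(chain[-1])      # the last register drives the output pin
        chain = [bit] + chain[:-1]      # every register takes its neighbour's value
    return chain, tdo_bits
```

If the registers have captured the pin states `[1, 1, 0, 0]`, shifting in four zeros reads them out last register first while leaving the chain loaded with the new values. That is the basic mechanism behind both observing and driving pins during board test.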

