Main Project Phase LL Content

CHAPTER 1

INTRODUCTION

The speed of the multiplication operation is of great importance in digital
signal processing as well as in today's general-purpose processors. In the past,
multiplication was generally implemented as a sequence of addition, subtraction,
and shift operations. Multiplication can be considered a series of repeated
additions: the number to be added is the multiplicand, the number of times it
is added is the multiplier, and the result is the product. Each addition step
generates a partial product. In most computers, the operands usually contain the
same number of bits.

When the operands are interpreted as integers, the product is generally
twice the length of the operands in order to preserve the information content.
The repeated-addition method suggested by the arithmetic definition is so slow
that it is almost always replaced by an algorithm that makes use of positional
representation. A multiplier can be decomposed into two parts: the first
generates the partial products, and the second collects and adds them. The
basic multiplication principle is therefore twofold: evaluation of the partial
products and accumulation of the shifted partial products. The accumulation is
performed by successively adding the columns of the shifted partial-product
matrix. The multiplier is successively shifted, and each of its bits gates the
multiplicand. The delayed, gated instances of the multiplicand that carry the
same weight must all be in the same column of the shifted partial-product
matrix, where they are added to form the product bit for that column.

A circuit with a regular layout usually has shorter wires, and hence less
wiring delay, than a non-regular circuit. Therefore, if circuit delay is
estimated as the total gate delay, one should also keep in mind the circuit's
size and degree of regularity when comparing it to other circuits. “Delay”
usually refers to the “worst-case delay”. That is, if the delay of the output
depends on the inputs given, it is always the largest possible output delay
that sets the speed. Furthermore, if different output bits have different
worst-case delays, it is always the slowest bit that sets the delay for the
whole output. The slowest path between any input bit and any output bit is
called the “critical path”.

1.1. OBJECTIVE OF THE PROJECT

• In this report, circuit performance is always compared in terms of speed,
size and power. A good estimate of a circuit's size is the total number of
gates used.

• The actual chip size of a circuit also depends on how the gates are placed on
the chip, that is, on the circuit's layout. Since layout is not dealt with in
this report, the only thing that can be said is that regular circuits are
usually smaller than non-regular ones (for the same number of gates),
because regularity allows a more compact layout.

• The physical delay of a circuit originates from the small delays in single
gates and from the wiring between them. The delay of a wire depends on
how long it is, so wiring delay is difficult to model; it requires
knowledge of the circuit's layout on the chip.

CHAPTER 2

LITERATURE SURVEY

John L. Gustafson and Isaac T. Yonemoto. “Beating Floating Point
at its Own Game: Posit Arithmetic”. In: Supercomputing Frontiers and
Innovations 4.2 (Apr. 2017), pp. 71–86. This paper introduced the concept of
posit numbers in 2017. Posit numbers borrow many concepts from their
predecessor, universal numbers (unums), introduced in 2015. The paper also
discusses the application of posit numbers to deep neural networks and presents
studies on training deep neural networks with 8-bit posit numbers; the results
are comparable to those obtained with 16-bit floating-point numbers. The work
compares floating-point and posit numbers in depth, with many useful
visualizations that show the advantages of posit numbers over floating point.
For the same bit width, posit numbers are shown to have a larger dynamic range
and higher precision around zero. Posit numbers are also shown to give exact
answers for a larger range of numbers under addition.

David Goldberg. “What Every Computer Scientist Should Know about
Floating-Point Arithmetic”. In: ACM Comput. Surv. 23.1 (Mar. 1991), pp. 5–48.
This highly influential paper from 1991 set out to make the inner
workings of floating-point numbers widely known. It presents a
tutorial on the aspects of floating point that have a direct impact on designers
of computer systems. It begins with background on floating-point representation
and rounding error, continues with a discussion of the IEEE floating-point
standard, and concludes with examples of how computer system builders can
better support floating point.

Feibao Xiao et al. “Posit Arithmetic Hardware Implementations
with the Minimum Cost Divider and Square Root”. In: Electronics 9.10
(2020). This paper proposes several hardware implementations of posit
arithmetic units, including a posit adder/subtractor, multiplier, divider, and
square root unit. The implementation was done on a Xilinx Virtex-7 FPGA VC709
platform. To reduce circuit area, instead of using both a Leading Zero Detector
(LZD) and a Leading Ones Detector (LOD), the authors implement only an LZD and
argue that implementing both is suboptimal. The authors also use a modular
architecture consisting of an encoder, a calculator, and a decoder module. They
conclude the paper by remarking that the LZD design could be improved.

Rohit Chaurasiya et al. “Parameterized Posit Arithmetic Hardware
Generator”. In: 2018 IEEE 36th International Conference on Computer
Design (ICCD). 2018, pp. 334–341. The authors present the architecture of a
parameterized posit arithmetic unit generator that can generate adders and
multipliers of any bit width before synthesis. They use the generator to
synthesize 8-bit, 16-bit, and 32-bit adders and multipliers and compare them
with IEEE 754-2008 compliant adders and multipliers. In their comparison of
m-bit posit units with n-bit IEEE 754-2008 compliant units, the area and energy
of the posit adder and multiplier are comparable to those of their IEEE 754-2008
compliant counterparts when m = n.

Sugandha Tiwari et al. “PERI: A Configurable Posit Enabled RISC-V
Core”. In: ACM Trans. Archit. Code Optim. 18.3 (Apr. 2021). ISSN: 1544-3566.
This paper shows how the Single-Precision Floating Point (“F”) extension of
RISC-V can be leveraged to support posit arithmetic. The authors also present
the implementation details of a parameterized and feature-complete posit
Floating Point Unit (FPU). The posit FPU has been integrated with the RISC-V
compliant SHAKTI C-class core as an execution unit.

Yohann Uguen, Luc Forget, and Florent de Dinechin. “Evaluating the
Hardware Cost of the Posit Number System”. In: 2019 29th International
Conference on Field Programmable Logic and Applications (FPL). 2019, pp.
106–113. This work aims to enable application-level evaluations of the posit
system that include performance and resource consumption. To this purpose, it
introduces an open-source hardware implementation of the posit number system in
the form of a templatized C++ library. The library currently implements
addition, subtraction and multiplication for custom-size posits. In addition,
the posit standard mandates the presence of the “quire”, a large accumulator
able to perform exact sums of products, and the proposed library includes the
first open-source parameterized hardware quire. However, the work concludes
that 32-bit posit adders and multipliers are much larger and slower than the
corresponding floating-point operators.

CHAPTER 3

EXISTING SYSTEM

The existing posit multiplier uses the modified Booth algorithm, also known as
the bit-pair recoding or radix-4 Booth algorithm, which reduces the number of
partial products. Instead of shifting and adding for every column of the
multiplier and multiplying by 0 or 1 each time, it takes every second column
and multiplies by 0, ±1 or ±2. Both methods yield the same result. The radix-4
Booth encoder examines 3 bits at a time, with adjacent groups sharing one bit,
which is why it is also known as the overlapping method. After a zero is
appended to the right of the LSB, the bits are grouped into overlapping groups
of 3, as shown in Fig. 3.1.

Fig. 3.1. Pairing of 3 numbers

The operating procedure of the radix-4 Booth encoder is shown in the table,
which describes the result of multiplying by 0, ±1 or ±2 for each state of the
multiplier bits.

The radix-4 Booth encoding procedure is as follows: (1) Make sure n, the number
of multiplier bits, is even; if necessary, extend the multiplier using the sign
bit. (2) Append a zero to the right of the LSB of the multiplier. (3) Depending
on the value of each 3-bit group, form the partial product as -y, +y, -2y, +2y
or 0, where y is the multiplicand. Negative values are handled by taking the
two's complement. Each partial product is shifted two bit positions relative to
the previous one, and the multiplication proceeds by summing the shifted partial
products. The number of partial products is halved, which is the main
advantage. This reduction decreases the propagation delay of the circuit. The
main disadvantage is the difficulty of constructing the circuit hardware.

3.1. DEMERITS OF EXISTING SYSTEM

• It occupies a large area: in applications such as image processing, the
computation requires a large amount of space and area.

• It has a time cost: extra time is needed to produce the output, which leads
to delay-related issues at the output.

CHAPTER 4

PROPOSED SYSTEM

The proposed design differs from the existing one in the mantissa multiplier.
The mantissa multiplier uses modified Booth multiplication and consists of an
(nb - es)-bit mantissa multiplier. A multiplication does not always require the
maximum bit width; that is, there is no need to always use the full (nb - es)-bit
mantissa unit. The bits that are not used in the multiplier and multiplicand
are normally set to zero, but they turn into ones when a negative multiplier is
recoded. This results in unwanted signal toggling.

This unnecessary signal toggling must be avoided to reduce power consumption.
Although the proposed system uses the same radix-4 Booth multiplication, it
achieves power efficiency by splitting the multiplier into smaller ones that
are selected through a control signal. The design therefore also generates a
control signal that enables only the required smaller multiplier at run time.
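The selection idea can be sketched as follows. This is a behavioural illustration only: it assumes the effective mantissa widths of both operands are already known from decoding, and it assumes a 4-bit granularity, suggested by the 4-, 8-, 12- and 16-bit operand ports of the components declared in Section 7.3. The function names and the (i, j) control encoding are hypothetical.

```python
CHUNK = 4        # assumed slice width, in bits
MAX_BITS = 16    # assumed maximum mantissa multiplier width


def chunks_needed(width):
    """Number of CHUNK-bit slices needed for an operand of the given width."""
    if not 1 <= width <= MAX_BITS:
        raise ValueError("width out of range")
    return -(-width // CHUNK)      # ceiling division


def select_multiplier(width_a, width_b):
    """Control signal (i, j): enable the 4*(i+1) x 4*(j+1) bit multiplier."""
    return chunks_needed(width_a) - 1, chunks_needed(width_b) - 1


def gated_multiply(a, width_a, b, width_b):
    """Multiply using only the selected, smallest sufficient multiplier."""
    i, j = select_multiplier(width_a, width_b)
    mask_a = (1 << ((i + 1) * CHUNK)) - 1
    mask_b = (1 << ((j + 1) * CHUNK)) - 1
    # Bits outside the selected widths never reach the multiplier,
    # so they cannot toggle inside it.
    return (i, j), (a & mask_a) * (b & mask_b)


print(gated_multiply(0b101, 3, 0x1ABC, 13))
```

With a 3-bit and a 13-bit mantissa, the sketch enables the 4 x 16 multiplier and leaves the wider ones idle.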

CHAPTER 5

DESIGN OF THE SYSTEM

The physical delay of a circuit originates from the small delays in single
gates and from the wiring between them. The delay of a wire depends on how
long it is, so wiring delay is difficult to model; it requires
knowledge of the circuit's layout on the chip.

The gate delay, however, can easily be modeled by saying that the output is
delayed by a constant amount of time from the latest input. What can be said
about the wiring delay is that larger circuits have longer wires, and hence more
wiring delay. It follows that a circuit with a regular layout usually has
shorter wires, and hence less wiring delay, than a non-regular circuit.
Therefore, if circuit delay is estimated as the total gate delay, one should
also keep in mind the circuit's size and degree of regularity when comparing it
to other circuits. “Delay” usually refers to the “worst-case delay”.

That is, if the delay of the output depends on the inputs given, it is
always the largest possible output delay that sets the speed. Furthermore, if
different output bits have different worst-case delays, it is always the
slowest bit that sets the delay for the whole output. The slowest path between
any input bit and any output bit is called the “critical path”.
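The gate-delay model and the critical path can be illustrated with a small sketch. It assumes a constant unit delay per gate and ignores wiring delay; the full-adder netlist, the signal names and the function name are illustrative and not taken from the report.

```python
from functools import lru_cache


def worst_case_delays(gates, gate_delay=1):
    """Return the worst-case arrival time of every gate output.

    `gates` maps a gate output name to the names of its inputs. Names that
    never appear as keys are primary inputs and arrive at time 0.
    """
    @lru_cache(maxsize=None)
    def arrival(node):
        if node not in gates:
            return 0
        # The output is delayed a constant amount from the latest input.
        return gate_delay + max(arrival(src) for src in gates[node])

    return {node: arrival(node) for node in gates}


# Full adder: s = a xor b xor cin, cout = (a and b) or ((a xor b) and cin)
FULL_ADDER = {
    "x1": ["a", "b"],
    "s": ["x1", "cin"],
    "a1": ["a", "b"],
    "a2": ["x1", "cin"],
    "cout": ["a1", "a2"],
}

delays = worst_case_delays(FULL_ADDER)
print(delays)
print("critical path delay:", max(delays.values()))
```

Here the sum bit settles after two gate delays and the carry after three, so the carry output sets the delay of the whole circuit.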

5.1 BASICS OF MULTIPLIER

Multiplication is a mathematical operation that at its simplest is an
abbreviated process of adding an integer to itself a specified number of times.
A number (the multiplicand) is added to itself a number of times specified by
another number (the multiplier) to form a result (the product). In elementary
school, students learn to multiply by placing the multiplicand on top of the
multiplier. The multiplicand is then multiplied by each digit of the multiplier,
beginning with the rightmost, Least Significant Digit (LSD). Intermediate
results (partial products) are placed one atop the other, offset by one digit to
align digits of the same weight. The final product is determined by summing all
the partial products. Although most people think of multiplication only in base
10, this technique applies equally to any base, including binary. Figure 1 shows
the data flow for the basic multiplication technique just described. Each black
dot represents a single digit.

Here, we assume that the MSB represents the sign of the number. Multiplication
is a rather simple operation in digital electronics. It originates from the
classical algorithm for the product of two binary numbers, which uses addition
and shift-left operations to calculate the product. Based on this procedure, an
algorithm for any kind of multiplication can be deduced, as shown in Fig. 5.1.
The sign of the product can be determined at the initial stage, or after the
whole result has been obtained, when the MSB of the result gives the sign of the
product.

Fig. 5.1. Multiplication algorithm

5.2 BINARY MULTIPLICATION

In the binary number system the digits, called bits, are limited to the set {0,
1}. The result of multiplying any binary number by a single bit is either
0 or the original number, which makes forming the intermediate partial products
simple and efficient. Summing these partial products is the time-consuming task
for binary multipliers. One logical approach is to form the partial products one
at a time and sum them as they are generated. Often implemented in software on
processors that do not have a hardware multiplier, this technique works, but
is slow because at least one machine cycle is required to sum each additional
partial product. For applications where this approach does not provide enough
performance, multipliers can be implemented directly in hardware. The two main
categories of binary multiplication are signed and unsigned. Binary
multiplication is a series of bit shifts and bit additions in which the two
numbers, the multiplicand and the multiplier, are combined into the result.
Given the bit representations of the multiplicand x = x_{n-1} … x_1 x_0 and the
multiplier y = y_{n-1} … y_1 y_0, up to n shifted copies of the multiplicand are
added to form the product in unsigned multiplication.
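The one-partial-product-at-a-time approach for unsigned operands can be written down directly. This is a minimal sketch; the function name is an assumption made for illustration.

```python
def shift_add_multiply(x, y, n):
    """Multiply unsigned x by the unsigned n-bit multiplier y.

    Each set bit y_i of the multiplier contributes one shifted copy of the
    multiplicand, x << i, so at most n partial products are added.
    """
    product = 0
    for i in range(n):
        if (y >> i) & 1:          # partial product is either 0 or x << i
            product += x << i
    return product


print(shift_add_multiply(13, 11, 4))
```

The loop performs one addition per multiplier bit, which is why the software form of this method is slow compared with a hardware multiplier that sums the partial products in parallel.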

Multiplication Process:

The simplest multiplication operation is to directly calculate the product of
two numbers by hand. This procedure can be divided into three steps: partial
product (PP) generation, partial product reduction, and the final addition. To
make the process concrete, consider the product of two two's complement
numbers, for example 1101₂ (−3₁₀) and 0101₂ (5₁₀), computed by hand as
illustrated in Figure 3.1.

The first operand is called the multiplicand and the second the multiplier.
The intermediate products are called partial products, and the final result is
called the product. Consider now the multiplication process when this method is
mapped directly to hardware. As can be seen in the figures, the multiplication
operation in hardware consists of PP generation, PP reduction and final addition
steps. The two rows before the product are called the sum and carry bits. The
method takes one multiplier bit at a time, from right to left, multiplies the
multiplicand by that single bit, and shifts the intermediate product one
position to the left of the previous intermediate product. All the bits of the
partial products in each column are added to obtain two bits: sum and carry.
Finally, the sum and carry bits in each column are added. In general, the
multiplication of an n-bit multiplicand by an m-bit multiplier generates m
partial products and a product that is n + m bits long. The method shown in
Figure 3 is also called a non-Booth encoding scheme.
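The hand calculation of −3 × 5 can be reproduced with a short sketch. It assumes one common way of handling signs, which the text does not spell out: the multiplicand is sign-extended to n + m bits and the partial product for the multiplier's MSB is subtracted, because that bit has negative weight in two's complement. The function name is illustrative.

```python
def twos_complement_multiply(x, y, n, m):
    """Multiply an n-bit pattern x by an m-bit pattern y (two's complement).

    Returns the m partial products and the product as (n + m)-bit patterns.
    """
    width = n + m
    mask = (1 << width) - 1
    # Interpret the multiplicand pattern as a signed value (sign extension).
    x_signed = x - (1 << n) if (x >> (n - 1)) & 1 else x
    partials = []
    for i in range(m):
        pp = (((y >> i) & 1) * x_signed) << i
        if i == m - 1:
            pp = -pp              # MSB of the multiplier has weight -2**(m-1)
        partials.append(pp & mask)
    product = sum(partials) & mask
    return partials, product


partials, product = twos_complement_multiply(0b1101, 0b0101, 4, 4)
for pp in partials:
    print(format(pp, "08b"))
print(format(product, "08b"))     # 8-bit pattern of -15
```

The 4-bit by 4-bit example produces four partial products and an 8-bit product, matching the m partial products and n + m product bits stated above.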

5.3 HIGH-SPEED BOOTH ENCODED PARALLEL MULTIPLIER DESIGN

Fast multipliers are essential parts of digital signal processing systems.
The speed of the multiply operation is of great importance in digital signal
processing as well as in today's general-purpose processors, especially since
media processing took off. As described in Chapter 1, multiplication is
conceptually a series of repeated additions, and a multiplier can be decomposed
into a part that generates the partial products and a part that collects and
adds them.

Posits are a new data type designed as an alternative to IEEE floating-point
numbers. Unlike universal numbers (unums), posits do not require operands of
variable size: when a result cannot be represented exactly, it is rounded. The
posit system offers several benefits, including a large dynamic range, simple
hardware, better handling of exceptional cases, and higher accuracy. Posits
have no signed infinities or signed zero; the only exception values are a
single zero and NaR (Not a Real). As stated above, posit processing is also
reported to occupy less hardware than IEEE floating-point processing.

Additionally, posits have a non-uniform distribution of values that suits
certain applications, such as deep learning. 8-bit or 16-bit posits are
generally used in deep learning applications, while 32-bit posit arithmetic may
be used in scientific computing. The general format of the posit number system
is shown in Fig. 5.3.1. Posit(nb, es) denotes a posit format in which nb is the
total bit width and es is the bit width of the exponent. A posit consists of 4
parts:

sign - s

regime - rg

exponent - exp

mantissa - frac

The bit width of each component is variable. The regime has a variable length,
and the remaining bit positions are taken by the exponent and the mantissa.
This happens only when the regime does not occupy all the positions. A number
in the posit format is illustrated in Fig. 5.3.1.

Fig. 5.3.1. Posit arithmetic
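The four fields can be made concrete with a small decoder. This is a sketch that follows the posit definition in the Gustafson and Yonemoto paper cited in Chapter 2, where the value is (-1)^s × useed^k × 2^exp × (1 + frac) with useed = 2^(2^es) and k given by the regime run length. The function name and the defaults nb = 16, es = 1 are assumptions for illustration; the report does not state the es value of its design.

```python
def decode_posit(bits, nb=16, es=1):
    """Decode an nb-bit posit pattern into a Python float."""
    mask = (1 << nb) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nb - 1):
        return float("nan")                    # NaR (Not a Real)
    sign = bits >> (nb - 1)
    if sign:
        bits = (-bits) & mask                  # negative: take two's complement
    body = format(bits, f"0{nb}b")[1:]         # everything after the sign bit
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # regime run length
    k = run - 1 if first == "1" else -run
    rest = body[run + 1:]                      # skip the regime terminator bit
    exp_bits, frac_bits = rest[:es], rest[es:]
    # Missing exponent bits (cut off by a long regime) count as zeros.
    exp = int(exp_bits.ljust(es, "0"), 2) if es else 0
    frac = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
    value = 2.0 ** (k * (1 << es) + exp) * (1.0 + frac)
    return -value if sign else value


for pattern in (0x4000, 0x4800, 0x6000, 0x3000, 0xC000, 0x7FFF):
    print(hex(pattern), decode_posit(pattern))
```

Patterns with a long regime, such as 0x7FFF, leave no bits for the exponent or fraction, which is the situation in which a full-width mantissa multiplier is not needed.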

All fields have a variable bit width, as shown above, the exception being the
sign bit, which is always 1 bit wide. The sign bit and the regime are always
present in the general structure. The exponent and mantissa fields occur only if
there are leftover bit positions. As a result, the fraction part, including the
implicit bit, has a bit width ranging from 1 to nb - es. The mantissa therefore
does not use its full width in every operation, and using a maximum-width
multiplier all the time results in unwanted power consumption. We implement a
16-bit posit multiplier in which the mantissa (fraction) multiplier is split
into several smaller ones that are used only when required. This yields a
low-power posit multiplier. The architecture can also be used in multipliers
other than posit multipliers to improve the power efficiency of the
corresponding component or device.

CHAPTER 6

REQUIREMENT SPECIFICATION

6.1 Hardware Requirements:

• System : Pentium IV 2.4 GHz.

• Hard Disk : 40 GB.

• Floppy Drive : 1.44 MB.

• Monitor : 15" VGA Colour.

• Mouse : Logitech.

• RAM : 256 MB.

6.2 Software Requirements:

• Operating system : CentOS.

• Tool : ModelSim and Xilinx ISE.

6.3 Non-functional Requirement:

In systems engineering and requirements engineering, a non-functional
requirement is a requirement that specifies criteria that can be used to judge
the operation of a system, rather than specific behaviors. Such requirements are
contrasted with functional requirements, which define specific behavior or
functions. Non-functional requirements add tremendous value to business
analysis, but they are commonly misunderstood. It is important for business
stakeholders and clients to clearly explain the requirements and their
expectations in measurable terms. If the non-functional requirements are not
measurable, they should be revised or rewritten for clarity. For example, user
stories help bridge the gap between developers and the user community in Agile
methodology.

Usability:

Prioritize the important functions of the system based on usage patterns.
Frequently used functions should be tested for usability, as should complex and
critical functions. Be sure to create a requirement for this.

Reliability:

Reliability defines the trust in the system that is developed after using it for
a period of time. It defines the likelihood that the software will work without
failure for a given time period.

The number of bugs in the code, hardware failures, and problems can
reduce the reliability of the software.

Create a requirement that data created in the system will be retained for a
number of years without the data being changed by the system.

It’s a good idea to also include requirements that make it easier to monitor
system performance.

Performance:

What should system response times be, as measured from any point, under
what circumstances?

Are there specific peak times when the load on the system will be unusually
high?

Think of stress periods, for example, at the end of the month or in
conjunction with payroll disbursement.

CHAPTER 7

SYSTEM SPECIFICATION

7.1 MODELSIM:

ModelSim is a hardware simulation and debug environment primarily
targeted at smaller ASIC and FPGA designs. It combines simulation
performance and capacity with the code coverage and debugging capabilities
required to simulate multiple blocks and systems and attain ASIC gate-level
sign-off. Comprehensive support of Verilog, SystemVerilog for Design, VHDL, and
SystemC provides a solid foundation for single and multi-language design
verification environments. ModelSim's easy-to-use, unified debug and
simulation environment provides today's FPGA designers with both the advanced
capabilities they increasingly need and an environment that makes their
work productive.

ModelSim is a verification and simulation tool for VHDL, Verilog,
SystemVerilog, and mixed-language designs.

7.1.1 BASIC SIMULATION FLOW

Fig 7.1.1.1. Basic Simulation Flow

Creating the Working Library. In ModelSim, all designs are compiled into
a library. You typically start a new simulation in ModelSim by creating a
working library called "work," which is the default library name used by the
compiler as the default destination for compiled design units.

Compiling Your Design. After creating the working library, you compile
your design units into it. The ModelSim library format is compatible across all
supported platforms. You can simulate your design on any platform without
having to recompile your design.

Loading the Simulator with Your Design and Running the Simulation. With
the design compiled, you load the simulator with your design by invoking the
simulator on a top-level module (Verilog) or a configuration or entity/architecture
pair (VHDL). Assuming the design loads successfully, the simulation time is set
to zero, and you enter a run command to begin simulation.

Debugging Your Results. If you don't get the results you expect, you can
use ModelSim's robust debugging environment to track down the cause of the
problem.

7.1.2 PROJECT FLOW:

A project is a collection mechanism for an HDL design under specification
or test. Even though you don't have to use projects in ModelSim, they may ease
interaction with the tool and are useful for organizing files and specifying
simulation settings. The following diagram shows the basic steps for simulating a
design within a ModelSim project.

Fig. 7.1.2.1. Project Flow

The flow is similar to the basic simulation flow. However, there are two
important differences:

• You do not have to create a working library in the project flow; it is done
for you automatically.

• Projects are persistent. In other words, they will open every time you
invoke ModelSim unless you specifically close them.

7.1.3 MULTIPLE LIBRARY FLOW:

The diagram shows the basic steps for simulating with multiple
libraries. ModelSim uses libraries in two ways: as a local working library that
contains the compiled version of your design, and as a resource library. The
contents of your working library will change as you update your design and
recompile.

Fig. 7.1.3.1. Multiple library flow

A resource library is typically static and serves as a parts source for your
design. You can create your own resource libraries, or they may be supplied by
another design team or a third party (e.g., a silicon vendor). You specify which
resource libraries will be used when the design is compiled, and there are rules to
specify in which order they are searched. A common example of using both a
working library and a resource library is one where your gate-level design and test
bench are compiled into the working library, and the design references gate-level
models in a separate resource library.

You can also link to resource libraries from within a project. If you are
using a project, you would replace the first step above with these two steps: create
the project and add the test bench to the project.

Working steps for ModelSim:

1. Click the ModelSim icon.

2. File -> New -> project -> browse to the project location -> enter the project
name. (The project is created with the name you entered.)

3. Right-click the workspace window to add the project files to the project
(right-click -> add -> add existing file -> browse to and add the files).

4. Right-click the workspace -> compile -> compile all.

5. Select library (at the bottom of the workspace), open the work package and
double-click the file to be simulated.

6. Simulating that file creates the simulation window (sim).

7. Right-click the simulated entity name in the simulation window and go to
add -> add all to wave.

8. In the waveform, right-click an operand and force its value. For a clock
input, right-click the clock operand and set its value by choosing the clock
option.

9. Finally, run the simulation and view the response in the waveform.

7.2 XILINX ISE

Xilinx ISE (Integrated Software Environment) is a software tool produced
by Xilinx for synthesis and analysis of HDL designs, enabling the developer to
synthesize ("compile") their designs, perform timing analysis, examine RTL
diagrams, simulate a design's reaction to different stimuli, and configure the
target device with the programmer. Xilinx ISE provides the HDL and schematic
editors, logic synthesizer, fitter, and bit stream generator software. The XS
tools from XESS provide utilities for downloading the bit stream into the FPGA
on the XSA Board.

Xilinx constantly innovates to make sure the power challenges associated
with shrinking technologies can be overcome. Xilinx understands that FPGA
power consumption is one of the biggest concerns of FPGA users. Xilinx Power
Tools help perform power estimation and analysis for a given design. Power
estimation and analysis become even more important as FPGAs increase in logic
capacity and performance by migrating to smaller process geometries.

Total power in an FPGA is the sum of two components:

Static power - Static power results primarily from transistor leakage current
in the device. Leakage current is either from source-to-drain or through the gate
oxide, and exists even when the transistor is logically “OFF”.

Dynamic power - Dynamic power is associated with design activity and
switching events in the core or I/O of the device. Dynamic power is determined
by node capacitance, supply voltage, and switching frequency.
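The dependence of dynamic power on these three quantities is often approximated by the first-order CMOS model P = α · C · V² · f. The sketch below uses that model; the activity factor α (the fraction of clock cycles in which a node toggles) is not named in the text and is an assumption introduced here, as are the function name and the example values.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order dynamic power estimate in watts: alpha * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical node: 1 nF total switched capacitance, 1.0 V supply, 100 MHz
full = dynamic_power(0.5, 1e-9, 1.0, 100e6)
gated = dynamic_power(0.25, 1e-9, 1.0, 100e6)   # half the toggling
print(full, gated)
```

In this model, halving the switching activity halves the dynamic power, which is the effect the proposed multiplier aims for by avoiding unnecessary signal toggling.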

The accuracy of the Xilinx Power Tools depends on two primary components:

• Device data models and device characterization integrated into the tool.

• Inputs accurately entered by the user into the tools.

When using the Xilinx Power Tools, XPE is used for pre-design power
estimation and XPA is used for post-implementation design power optimization.
Since the Xilinx Power Tools cover different stages of the design flow, the tools
can be used for:

• Part selection

• Board design

• System reliability

• Power consumption estimation for a specific design

The Power Tools can be used for power optimization as well. You can use them
to find tradeoffs when designing for power. This can also be coupled with the
power optimization algorithms available in synthesis and implementation within
ISE.

Fig. 7.2.1. Block diagram

7.2.1 XILINX WORKING STEPS:

1. Click the Xilinx icon.

2. Click the File menu -> select New project -> browse to the project location
(which you have already created) -> enter the project name -> click
the next option.

3. In Property Name -> select the family -> select the device -> select the
package -> select the speed -> finally click the next option.

4. In the create a new project window, click the next option.

5. In the add existing Project window, click the next option.

6. Then click the final option. (This creates the project source.)

7. In the sources workspace, right-click and choose add copy of source. Select
the files you need -> click open -> click ok.

8. Select the required file -> right-click -> set as top module.

9. Go to the process window and run synthesis (right-click -> run).

10. For pin assignment, select user constraints -> assign package pins ->
enter the input and output pins. Finally save it. (To edit, select edit
constraints and make the required changes.)

11. Go to generate programming files -> run the configure devices
(iMPACT) option.

(Power on the FPGA kit.)

Then select the finish option and choose the bit file to program. (The bit file
is added to the device.) Finally, right-click the device, select program ->
click ok. (The bit file is downloaded to the FPGA kit through the port.)

7.3 CODING

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Top-level multiplier: two 16-bit operands in, one 32-bit result out.
entity MULTIPLIER is
port(
clk : in std_logic;
IN1 : in std_logic_vector (15 downto 0);
IN2 : in std_logic_vector (15 downto 0);
C : out std_logic_vector(31 downto 0));
end MULTIPLIER;

architecture BEH of MULTIPLIER is

-- Components PPG_i_j: X is 4*(i+1) bits wide and Y is 4*(j+1) bits wide.
component PPG_0_0
port(
clk : in std_logic;
X : in std_logic_vector (3 downto 0);
Y : in std_logic_vector (3 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_0_1
port(
clk : in std_logic;
X : in std_logic_vector (3 downto 0);
Y : in std_logic_vector (7 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_0_2
port(
clk : in std_logic;
X : in std_logic_vector (3 downto 0);
Y : in std_logic_vector (11 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_0_3
port(
clk : in std_logic;
X : in std_logic_vector (3 downto 0);
Y : in std_logic_vector (15 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_1_0
port(
clk : in std_logic;
X : in std_logic_vector (7 downto 0);
Y : in std_logic_vector (3 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_1_1
port(
clk : in std_logic;
X : in std_logic_vector (7 downto 0);
Y : in std_logic_vector (7 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_1_2
port(
clk : in std_logic;
X : in std_logic_vector (7 downto 0);
Y : in std_logic_vector (11 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_1_3
port(
clk : in std_logic;
X : in std_logic_vector (7 downto 0);
Y : in std_logic_vector (15 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_2_0
port(
clk : in std_logic;
X : in std_logic_vector (11 downto 0);
Y : in std_logic_vector (3 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_2_1
port(
clk : in std_logic;
X : in std_logic_vector (11 downto 0);
Y : in std_logic_vector (7 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_2_2
port(
clk : in std_logic;
X : in std_logic_vector (11 downto 0);
Y : in std_logic_vector (11 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_2_3
port(
clk : in std_logic;
X : in std_logic_vector (11 downto 0);
Y : in std_logic_vector (15 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_3_0
port(
clk : in std_logic;
X : in std_logic_vector (15 downto 0);
Y : in std_logic_vector (3 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_3_1
port(
clk : in std_logic;
X : in std_logic_vector (15 downto 0);
Y : in std_logic_vector (7 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_3_2
port(
clk : in std_logic;
X : in std_logic_vector (15 downto 0);
Y : in std_logic_vector (11 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

component PPG_3_3
port(
clk : in std_logic;
X : in std_logic_vector (15 downto 0);
Y : in std_logic_vector (15 downto 0);
OUT11 : out std_logic_vector(31 downto 0));
end component;

signal X,Y : std_logic_vector(15 downto 0);

signal PPG00 : std_logic_vector(31 downto 0);


signal PPG01 : std_logic_vector(31 downto 0);
signal PPG02 : std_logic_vector(31 downto 0);
signal PPG03 : std_logic_vector(31 downto 0);
signal PPG10 : std_logic_vector(31 downto 0);
signal PPG11 : std_logic_vector(31 downto 0);
signal PPG12 : std_logic_vector(31 downto 0);
signal PPG13 : std_logic_vector(31 downto 0);
signal PPG20 : std_logic_vector(31 downto 0);
signal PPG21 : std_logic_vector(31 downto 0);
signal PPG22 : std_logic_vector(31 downto 0);
signal PPG23 : std_logic_vector(31 downto 0);
signal PPG30 : std_logic_vector(31 downto 0);
signal PPG31 : std_logic_vector(31 downto 0);
signal PPG32 : std_logic_vector(31 downto 0);
signal PPG33 : std_logic_vector(31 downto 0);

signal CTLx,CTLy: std_logic_vector(1 downto 0);

begin

-- Classify each operand by its trailing all-zero nibbles. CTLx/CTLy record
-- how many upper nibbles of the operand are actually needed.
process(IN1,IN2)
begin

  -- Default assignments keep every bit of X and Y driven in all branches,
  -- so the partial slice assignments below do not infer latches.
  X <= (others => '0');
  Y <= (others => '0');

  if ((IN1(15 downto 12)/="0000") and (IN1(11 downto 0)="000000000000")) then
    CTLx <= "00";
    X(15 downto 12) <= IN1(15 downto 12);
  elsif ((IN1(15 downto 8)/="00000000") and (IN1(7 downto 0)="00000000")) then
    CTLx <= "01";
    X(15 downto 8) <= IN1(15 downto 8);
  elsif ((IN1(15 downto 4)/="000000000000") and (IN1(3 downto 0)="0000")) then
    CTLx <= "10";
    X(15 downto 4) <= IN1(15 downto 4);
  else
    CTLx <= "11";
    X(15 downto 0) <= IN1(15 downto 0);
  end if;

  if ((IN2(15 downto 12)/="0000") and (IN2(11 downto 0)="000000000000")) then
    CTLy <= "00";
    Y(15 downto 12) <= IN2(15 downto 12);
  elsif ((IN2(15 downto 8)/="00000000") and (IN2(7 downto 0)="00000000")) then
    CTLy <= "01";
    Y(15 downto 8) <= IN2(15 downto 8);
  elsif ((IN2(15 downto 4)/="000000000000") and (IN2(3 downto 0)="0000")) then
    CTLy <= "10";
    Y(15 downto 4) <= IN2(15 downto 4);
  else
    CTLy <= "11";
    Y(15 downto 0) <= IN2(15 downto 0);
  end if;

end process;

x00 : PPG_0_0 port map (clk,X(15 downto 12),Y(15 downto 12),PPG00);
x01 : PPG_0_1 port map (clk,X(15 downto 12),Y(15 downto 8) ,PPG01);
x02 : PPG_0_2 port map (clk,X(15 downto 12),Y(15 downto 4) ,PPG02);
x03 : PPG_0_3 port map (clk,X(15 downto 12),Y,PPG03);

x10 : PPG_1_0 port map (clk,X(15 downto 8) ,Y(15 downto 12),PPG10);
x11 : PPG_1_1 port map (clk,X(15 downto 8) ,Y(15 downto 8) ,PPG11);
x12 : PPG_1_2 port map (clk,X(15 downto 8) ,Y(15 downto 4) ,PPG12);
x13 : PPG_1_3 port map (clk,X(15 downto 8) ,Y,PPG13);

x20 : PPG_2_0 port map (clk,X(15 downto 4) ,Y(15 downto 12),PPG20);
x21 : PPG_2_1 port map (clk,X(15 downto 4) ,Y(15 downto 8) ,PPG21);
x22 : PPG_2_2 port map (clk,X(15 downto 4) ,Y(15 downto 4) ,PPG22);
x23 : PPG_2_3 port map (clk,X(15 downto 4) ,Y,PPG23);

x30 : PPG_3_0 port map (clk,X,Y(15 downto 12),PPG30);
x31 : PPG_3_1 port map (clk,X,Y(15 downto 8) ,PPG31);
x32 : PPG_3_2 port map (clk,X,Y(15 downto 4) ,PPG32);
x33 : PPG_3_3 port map (clk,X,Y,PPG33);

process(CTLx,CTLy,PPG00,PPG01,PPG02,PPG03,PPG10,PPG11,PPG12,PPG13,
        PPG20,PPG21,PPG22,PPG23,PPG30,PPG31,PPG32,PPG33)

begin

if ((CTLx="00") and (CTLy="00")) then

C <= PPG00;

elsif ((CTLx="00") and (CTLy="01")) then

C <= PPG01;

elsif ((CTLx="00") and (CTLy="10")) then

C <= PPG02;

elsif ((CTLx="00") and (CTLy="11")) then

C <= PPG03;

elsif ((CTLx="01") and (CTLy="00")) then

C <= PPG10;

elsif ((CTLx="01") and (CTLy="01")) then

C <= PPG11;
elsif ((CTLx="01") and (CTLy="10")) then

C <= PPG12;

elsif ((CTLx="01") and (CTLy="11")) then

C <= PPG13;

elsif ((CTLx="10") and (CTLy="00")) then

C <= PPG20;

elsif ((CTLx="10") and (CTLy="01")) then

C <= PPG21;

elsif ((CTLx="10") and (CTLy="10")) then

C <= PPG22;

elsif ((CTLx="10") and (CTLy="11")) then

C <= PPG23;

elsif ((CTLx="11") and (CTLy="00")) then

C <= PPG30;

elsif ((CTLx="11") and (CTLy="01")) then

C <= PPG31;

elsif ((CTLx="11") and (CTLy="10")) then

C <= PPG32;

elsif ((CTLx="11") and (CTLy="11")) then

C <= PPG33;

end if;

end process;

end BEH;
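
The selection logic of the listing above can be checked with a small behavioural model. The sketch below is written in Python rather than VHDL so that it runs on its own. The function names (classify, select_unit, model_product) are hypothetical, and because the internals of the PPG_i_j components are not shown in this report, model_product assumes that the selected unit multiplies the two upper operand windows and restores the dropped zero bits by shifting.

```python
WIDTH = 16  # operand width used by the MULTIPLIER entity


def classify(operand):
    """Return (CTL code, number of upper bits used), mirroring the if/elsif chain."""
    operand &= 0xFFFF
    for code, used in (("00", 4), ("01", 8), ("10", 12)):
        upper = operand >> (WIDTH - used)
        lower = operand & ((1 << (WIDTH - used)) - 1)
        if upper != 0 and lower == 0:
            return code, used
    # Fallback branch of the VHDL: all 16 bits are used (also taken for operand 0).
    return "11", WIDTH


def select_unit(in1, in2):
    """Name of the partial-product unit whose output is routed to C."""
    ctlx, _ = classify(in1)
    ctly, _ = classify(in2)
    return "PPG_{}_{}".format(int(ctlx, 2), int(ctly, 2))


def model_product(in1, in2):
    """ASSUMPTION: the selected unit multiplies the upper windows, then shifts back."""
    _, used_x = classify(in1)
    _, used_y = classify(in2)
    x_win = (in1 & 0xFFFF) >> (WIDTH - used_x)
    y_win = (in2 & 0xFFFF) >> (WIDTH - used_y)
    return (x_win * y_win) << ((WIDTH - used_x) + (WIDTH - used_y))


if __name__ == "__main__":
    print(select_unit(0x1200, 0x0030), hex(model_product(0x1200, 0x0030)))
```

Under that assumption the narrow window loses no information, because a branch is taken only when the discarded low-order bits are all zero.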

CHAPTER 8

FEASIBILITY ANALYSIS

8.1 FEASIBILITY STUDY:

In this phase the feasibility of the project is analyzed and a business
proposal is put forth with a very general plan for the project and some cost
estimates. During system analysis, the feasibility study of the proposed system
is carried out to ensure that the system is not a burden to the company.
Feasibility analysis requires some understanding of the major requirements of
the system.

Three key considerations involved in the feasibility analysis are:

• ECONOMICAL FEASIBILITY

• TECHNICAL FEASIBILITY

• SOCIAL FEASIBILITY

ECONOMICAL FEASIBILITY:

This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can invest in the
research and development of the system is limited, so the expenditures must be
justified. The developed system is well within the budget, which was achieved
because most of the technologies used are freely available. Only the customized
products had to be purchased.

TECHNICAL FEASIBILITY:

This study is carried out to check the technical feasibility, that is, the
technical requirements of the system. Any system developed must not place a high
demand on the available technical resources, since this would in turn place high
demands on the client. The developed system must have modest requirements, as
only minimal or no changes are required to implement it.

SOCIAL FEASIBILITY:

This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system
efficiently. The user must not feel threatened by the system, but must instead
accept it as a necessity. The level of acceptance by the users depends on the
methods employed to educate them about the system and to make them familiar with
it. Their level of confidence must be raised so that they are also able to offer
constructive criticism, which is welcome, as they are the final users of the
system.

CHAPTER 9

SYSTEM TESTING

The purpose of testing is to discover errors. Testing is the process of trying
to discover every conceivable fault or weakness in a work product. It provides a
way to check the functionality of components, sub-assemblies, assemblies and/or
a finished product. It is the process of exercising software with the intent of
ensuring that the software system meets its requirements and user expectations
and does not fail in an unacceptable manner. There are various types of tests,
and each test type addresses a specific testing requirement.

9.1 Unit Testing:

Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.

Test strategy and approach

Field testing will be performed manually and functional tests will be
written in detail.

Test objectives

• All field entries must work properly.

• Pages must be activated from the identified link.

• The entry screen, messages and responses must not be delayed.

Features to be tested

• Verify that the entries are of the correct format.

• No duplicate entries should be allowed.

• All links should take the user to the correct page.

9.2 Integration Testing

Software integration testing is the incremental integration testing of two or
more integrated software components on a single platform, intended to expose
failures caused by interface defects.

The task of the integration test is to check that components or software
applications (for example, components in a software system or, one step up,
software applications at the company level) interact without error.

Test Results: All the test cases mentioned above passed successfully. No
defects encountered.

9.3 Acceptance Testing

User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets
the functional requirements.

Test Results: All the test cases mentioned above passed successfully. No
defects encountered.

CHAPTER 10

APPLICATION

• The main application of the posit multiplier is the implementation of its
algorithm in image processing techniques.

• The posit number system has been used as an alternative to the IEEE
floating-point number system in many applications, especially the
recently popular deep learning.

• The posit logarithm is also implemented in approximate multipliers, which
tend to perform operations with a low mean error distance.

10.1 Image Multiplication

To multiply two images, use the immultiply function. immultiply does an
element-by-element multiplication (.*) of each corresponding pixel in a pair of
input images and returns the product of these multiplications in the corresponding
pixel in an output image.

Image multiplication by a constant, referred to as scaling, is a common
image processing operation. When used with a scaling factor greater than one,
scaling brightens an image; a factor less than one darkens an image. Scaling
generally produces a much more natural brightening/darkening effect than simply
adding an offset to the pixels, since it preserves the relative contrast of the image
better. For example, this code scales an image by a constant factor.

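I = imread('moon.tif');   % load the sample image
J = immultiply(I,1.2);    % scale every pixel by 1.2 (brightens the image)
imshow(I);                % display the original image
figure, imshow(J)         % display the scaled image in a new figure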

Fig : 10.1 Image multiplication
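
The scaling operation can also be illustrated without any toolbox. The following is a minimal Python sketch, not the immultiply implementation. The function names scale_pixels and add_offset are hypothetical, and rounding to the nearest integer and clamping to the 0..255 range are assumptions modelled on ordinary 8-bit image arithmetic.

```python
def scale_pixels(pixels, factor):
    """Multiply each 8-bit pixel by factor, round to nearest, clamp to 0..255."""
    return [[min(255, max(0, int(p * factor + 0.5))) for p in row]
            for row in pixels]


def add_offset(pixels, offset):
    """Add a constant to each 8-bit pixel and clamp to 0..255 (for comparison)."""
    return [[min(255, max(0, p + offset)) for p in row] for row in pixels]


if __name__ == "__main__":
    image = [[50, 100], [150, 250]]        # tiny 2x2 grayscale example
    print(scale_pixels(image, 1.2))        # brightened by scaling
    print(add_offset(image, 20))           # brightened by a fixed offset
```

In the scaled result the first two pixels keep their 1:2 ratio (60 and 120), whereas the offset version changes it (70 and 120), which is the contrast-preserving property described in the text.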

10.2 Image Processing

Image processing refers to the processing of a 2D picture by a computer.

Basic definitions:

An image defined in the “real world” is considered to be a function of two
real variables, for example, a(x,y) with a as the amplitude (e.g. brightness) of the
image at the real coordinate position (x,y).

Modern digital technology has made it possible to manipulate
multi-dimensional signals with systems that range from simple digital circuits to
advanced parallel computers. The goal of this manipulation can be divided into
three categories:

Image Processing (image in -> image out)

Image Analysis (image in -> measurements out)

Image Understanding (image in -> high-level description out)

Purpose of Image processing

The purpose of image processing is divided into 5 groups. They are:

1. Visualization - Observe the objects that are not visible.

2. Image sharpening and restoration - To create a better image.

3. Image retrieval - Search for the image of interest.

4. Measurement of pattern – Measures various objects in an image.

5. Image Recognition – Distinguish the objects in an image.

Fundamental Steps in Digital Image Processing

Fig : 10.2 Fundamental Steps

Important steps involved in digital image processing

Fig : 10.3 Important steps

CHAPTER 11

OUTPUT

Fig. 11.1 Signal Indication

11.2. Covered Area of Gate Counts

11.3. Power Generated

CHAPTER 12

FUTURE SCOPE

The proposed method is evaluated with 8-bit, 16-bit, and 32-bit posit
multipliers, and an average power reduction of 16% can be achieved. The proposed
method is suitable for use in any low-power posit arithmetic unit design.

In the future, further power reduction opportunities in the posit multiplier
architecture will be explored. In addition, the investigation of power-efficient
posit arithmetic unit design will be extended to the posit adder and the posit
multiply-accumulate unit.

CHAPTER 13

CONCLUSION

The idea proposed in the paper is a power-efficient posit multiplier
architecture. Motivated by the fact that the whole mantissa multiplier in a posit
multiplier is not always fully required, the proposed design divides the mantissa
multiplier into small portions. At run-time, only the required portions are enabled,
which avoids unnecessary signal toggling and reduces the power consumption.
Whether to enable a multiplier portion is controlled by the regime bit-width
generated during component extraction. The proposed method is evaluated with
8-bit, 16-bit, and 32-bit posit multipliers, and an average power reduction of 16%
can be achieved. The proposed method is suitable for use in any low-power posit
arithmetic unit design. In the future, further power reduction opportunities in the
posit multiplier architecture will be explored. In addition, the investigation of
power-efficient posit arithmetic unit design will be extended to the posit adder and
the posit multiply-accumulate unit.
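
The link between the regime bit-width and the number of enabled multiplier portions can be made concrete with a short calculation. The Python sketch below is an illustration under stated assumptions, not the evaluated hardware. The function names are hypothetical, the input is assumed to be a non-negative posit (sign bit 0), the exponent size es is passed in by the caller, the 4-bit portion size is only an example value, and the hidden leading one is counted as part of the mantissa.

```python
import math


def regime_width(bits, n):
    """Width of the regime field (run of equal bits plus terminator) of an n-bit posit."""
    body = bits & ((1 << (n - 1)) - 1)     # drop the sign bit
    first = (body >> (n - 2)) & 1          # leading regime bit
    run = 0
    for i in range(n - 2, -1, -1):
        if (body >> i) & 1 != first:
            break
        run += 1
    # The terminating bit is present unless the run fills the whole body.
    return min(run + 1, n - 1)


def fraction_bits(bits, n, es):
    """Fraction bits left after the sign, regime and es exponent bits."""
    return max(0, n - 1 - regime_width(bits, n) - es)


def enabled_portions(bits, n, es, portion_bits):
    """ASSUMPTION: mantissa (hidden bit + fraction) is split into equal portions."""
    mantissa = fraction_bits(bits, n, es) + 1
    return math.ceil(mantissa / portion_bits)


if __name__ == "__main__":
    for word in (0x4000, 0x7F00):
        print(hex(word), regime_width(word, 16), enabled_portions(word, 16, 1, 4))
```

A long regime leaves few fraction bits, so fewer portions of the mantissa multiplier need to be active for that operand.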

