FPGA Implementation of Educational RISC-V Processor Suitable for Embedded Applications


2023 International Conference on Electrical, Computer and Communication Engineering (ECCE)

Md. Hasanul Banna Saif, Nahin Ul Sadad, Md. Nazrul Islam Mondal
Dept. of Computer Science & Engineering
Rajshahi University of Engineering & Technology
Rajshahi-6204, Bangladesh
saifbanna79@gmail.com, nahinsd100@gmail.com, nimbd109@gmail.com

979-8-3503-4536-0/23/$31.00 ©2023 IEEE | DOI: 10.1109/ECCE57851.2023.10101508

Abstract: Learning computer architecture is a difficult task for learners, and studying an architecture only in theory is often not very effective. Learning an architecture such as RISC-V, which attracts strong industrial interest, has an additional benefit. The RISC-V instruction set architecture, which is free to use, was developed with flexibility and durability in mind. The adaptability of RISC-V enables a wide range of hardware implementations. Interest in Field Programmable Gate Arrays (FPGAs) has grown in recent times due to their feasibility, flexibility and efficiency. In this paper, we implemented an educational processor based on the RISC-V architecture on an FPGA, with support for interfacing intended for embedded applications. An assembler was also designed and developed; it converts assembly code into standard RISC-V machine code, which helps users operate the CPU easily.

Index Terms: FPGA, RISC-V, CPU Design, Assembler

I. LITERATURE REVIEW

Computer Architecture is one of the most significant courses in the Computer Engineering discipline. Our traditional computer architecture course provides only theoretical knowledge. Such a course typically uses pen and paper to cover theoretical material and, at best, demonstrates how computers are structured using architectural simulators. It would be helpful for learners if they could learn the subject through practical experiments.

A Field Programmable Gate Array (FPGA) can be an ideal choice due to its soft programmability. The combination of HDL and FPGA gives learners a suitable environment to develop and practice computer architecture in a variety of unique and imaginative ways [1]. The purpose of this paper is to teach learners how to design a CPU using an FPGA in a simpler way.

Lee et al. [1] target undergraduate students, teaching computer architecture through hands-on learning. They presented a five-stage pipelined 32-bit MIPS processor. One of their key goals in developing the project was to alleviate the workload on the students. Li et al. [2] proposed the design and implementation of an 8-bit pipelined processor. The design of the processor, functional simulations, implementation of the design, and other tasks were assigned to the students.

Yıldız et al. [3] introduced the Very Simple CPU (VSCPU), a straightforward and adaptable soft CPU that can be quickly constructed on FPGAs with a full toolchain that includes an assembler, instruction set simulator, and C compiler. It is a multi-cycle CPU with 16 instructions.

Qin et al. [4] outlined the design process for static pipelined CPUs and dynamic pipelined CPUs using field programmable gate arrays (FPGAs).

Šulík et al. [5] published the design of an 8-bit RISC microcontroller using Handel-C. Carpinelli [6] described a Java-based emulator that helps in understanding how an 8-bit processor fetches, decodes, and executes instructions. The simulator assembles assembly language instructions to run a cycle-accurate simulation of the processor.

McGrew et al. [7] aimed to implement a CPU on an FPGA based on the 32-bit RISC-V architecture. The design specification, analysis, and simulation are all key components in determining how the overall system design will be developed.

All the references mentioned above except the last one focused on architectures other than RISC-V. We used the RISC-V architecture for building our CPU because it is an open standard Instruction Set Architecture (ISA) and is free for everyone to implement and use, unlike proprietary architectures such as x86-64 and ARM. Although some references implemented an assembler, none of them supported interfacing.

The contribution of our paper is an educational 32-bit RISC-V processor on FPGA that is simple to understand and supports an assembler and interfacing with external I/O devices, which will help learners understand embedded applications of a CPU. The goal of this paper is threefold: (1) to put theoretical knowledge into practice through experiments, (2) to design an assembler, and (3) to interface the CPU. Our implementation is compared with the references in a later section.

II. CPU DESIGN

A. Proposed Model

There are three major tasks in our proposed model. The first is to build a 32-bit CPU, the next is to build an assembler, and the last is to interface the CPU with input-output registers. In the initial stage, a 32-bit RISC-V processor is constructed in Verilog HDL and implemented on a Xilinx Spartan 6 LX16 FPGA board. The Verilog RTL code is then fed into the FPGA design flow. We simulated our design on GTKWave to verify its functionality. The second stage involves getting our FPGA board ready to test the interfacing code with our CPU design. The interface assembly code is written and given

to the assembler. The executable code is then produced by our assembler. This executable code is loaded into the RAM module of the computer from ROM on the FPGA board. The proposed model is shown in Figure 1.

Fig. 1. Proposed Design Model [8] [9] [10]

B. RISC-V ISA Formats

As discussed before, we implemented the CPU based on the RISC-V RV32I ISA, as shown in Figure 2. We implemented four types of instructions: Register type, Immediate type, Store type and Branch type. Each instruction is 32 bits wide and is divided into several fields. The first 7 bits (bits 6 to 0) hold the opcode in all instructions. The opcode is different for different operations.

In R-type and I-type instructions, 5 bits represent rd, the destination register where the result is written. The rs1 and rs2 fields are also 5 bits long and select the registers to read. The func3 and func7 fields take different values for different ALU operations. Finally, the Imm field holds the immediate value, which is 12 bits long.

Fig. 2. RISC-V Instruction Formats [11]

C. CPU Components

We organized the components of our CPU using simple structures. The major components are described below.

1) Arithmetic & Logical Unit (ALU): The Arithmetic and Logic Unit (ALU) handles all basic calculations, as shown in Figure 3. The ALU mainly works as a multiplexer: the inputs are common to all operational circuits, and the outputs of those circuits are connected to the inputs of the multiplexer. The MUX then selects only one of those lines according to the direction of the control unit of the CPU.

Fig. 3. ALU Circuit

Our CPU supports a number of arithmetic and logical operations, as shown in Table I.

TABLE I
OPERATIONS SUPPORTED BY THE ALU

AluControl   Operation          Expression
0000         AND                a & b
0001         OR                 a | b
0010         ADD                a + b
0100         SUB                a - b
1000         Set on Less Than   1 if a < b, 0 otherwise
0011         Shift Left         a << b
0101         Shift Right        a >> b
0110         MUL                a * b
0111         XOR                a ^ b

2) Register File: The 32 general-purpose registers of the CPU are kept in a structure known as a register file, as shown in Figure 4. R-format instructions have three register operands, so two registers are selected for reading and one for writing. Two inputs are required to write a data word: one provides the data to be written and the other designates the register number to write. The write enable pin reg_wr_en controls write operations and must be set to 1 for a write to happen at the positive clock edge. The data input and the two data output buses are each 32 bits wide, and the register number inputs, which each specify one of 32 registers, are 5 bits wide.

3) Instruction and Data Memory: The main memory is divided into two parts: the data memory and the instruction memory. As the name suggests, the instruction memory holds the 32-bit instructions. When the CPU starts working, the instruction memory first loads all the instructions from the ROM. The data memory stores the program's data.

4) Program Counter: The program counter contains the address of the instruction being executed. It is increased by 1 after every clock pulse to point to the next instruction.

TABLE II
REGISTER-TYPE ASSEMBLY CODE FORMATS

Instruction Name    Assembly Example    Description
AND                 AND R2, R5, R7      R2 ← R5 & R7
OR                  OR R3, R5, R2       R3 ← R5 | R2
ADD                 ADD R4, R1, R5      R4 ← R1 + R5
SUB                 SUB R1, R3, R7      R1 ← R3 - R7
Set on Less Than    SLT R5, R4, R3      R5 ← R4 < R3 ? 1 : 0
Shift Left          SHL R1, R2, R4      R1 ← R2 << R4
Shift Right         SHR R2, R3, R7      R2 ← R3 >> R7
MUL                 MUL R3, R1, R5      R3 ← R1 * R5
XOR                 XOR R1, R2, R5      R1 ← R2 ^ R5
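To make the register-type formats of Table II concrete, the following Python sketch encodes one assembly line into a 32-bit R-type word. It is a minimal illustration under stated assumptions, not the authors' assembler: the function names are hypothetical, the func7/func3 values are the standard RISC-V ones (the paper does not list its own), and SHL and SHR are assumed to correspond to the standard SLL and SRL encodings.

```python
# Standard RISC-V funct7/funct3 values (assumed; the paper does not list its own).
R_TYPE = {
    "ADD": (0b0000000, 0b000),
    "SUB": (0b0100000, 0b000),
    "SHL": (0b0000000, 0b001),  # assumed to map to RISC-V SLL
    "SLT": (0b0000000, 0b010),
    "XOR": (0b0000000, 0b100),
    "SHR": (0b0000000, 0b101),  # assumed to map to RISC-V SRL
    "OR":  (0b0000000, 0b110),
    "AND": (0b0000000, 0b111),
    "MUL": (0b0000001, 0b000),  # from the RISC-V M extension
}
OPCODE_R = 0b0110011


def parse_register(token):
    """Turn 'R5' or 'r5' into the register number 5."""
    number = int(token.strip().lstrip("Rr"))
    if not 0 <= number < 32:
        raise ValueError(f"register out of range: {token}")
    return number


def assemble_r_type(line):
    """Encode one register-type line, e.g. 'ADD R4, R1, R5'."""
    mnemonic, operands = line.strip().split(None, 1)
    rd, rs1, rs2 = (parse_register(t) for t in operands.split(","))
    funct7, funct3 = R_TYPE[mnemonic.upper()]
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | OPCODE_R)


print(f"{assemble_r_type('ADD R4, R1, R5'):032b}")
```

Under these assumptions, 'ADD R4, R1, R5' encodes to 0x00508233. A table-driven design like this keeps each mnemonic's field values in one place, so adding an instruction only requires a new dictionary entry.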

Fig. 4. Register File Circuit

III. ASSEMBLER DESIGN

The assembler transforms assembly language into RISC-V machine code that the RISC-V processor can understand. The three major parts of an assembler are machine codes, assembler controls and assembler directives.

A. Flow of the Assembler

We used Python 3.0 to implement the assembler. The program first splits each instruction into tokens. From the first token it determines which type of operation has to be performed. It then detects the register numbers for reading and writing data and passes the parameters from the tokens to the corresponding decoding function. Finally, it generates the machine code according to the assembly instructions. The flowchart of the assembler is shown in Figure 5.

Fig. 5. Flow of the Assembler
[Flowchart boxes: Start; Input file; Split into tokens; Detect operation (AND/OR/SUB etc.); Check mode (R-mode/I-mode); Find the rd, rs1, rs2, func3, func7 values; Format those values with actual no. of bits; Output the instruction; Finish]

B. Assembly Code Formats

Our customized assembler has its own assembly language format. There are four types of instructions: register-type, immediate-type, load/store-type and branching-type. The register-type instructions and their corresponding assembly formats are shown in Table II.

IV. INTERFACING

To perform interfacing, two special registers called the input register and the output register are used. The input register takes the input value from other devices such as a sensor. This value is sent to the output register, which is accessed by an output device such as a 7-segment LED display. Figure 6 shows the basic interfacing between the I/O devices and the CPU.

Fig. 6. Interfacing with Input and Output Register

The input register must be checked continuously to see whether the input device has given it a value. If the input register receives a value from the input device, the rest of the interfacing instructions are executed; otherwise the program keeps checking for a value in a loop. The flowchart is shown in Figure 7.

Fig. 7. Flowchart of Interfacing
[Flowchart boxes: Start; Input file; Go to MAIN; Check if input found (Yes/No); Take the input from Input Register to R0; Send the value to the Output register; Reset R0 register]
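The polling loop of Section IV (Fig. 7) can be sketched in software. The Python model below is only an illustration: the function name poll_and_forward and its parameters are hypothetical, each list element stands for one read of the input register, and a value of 0 is assumed to mean "no input yet", which the paper does not specify.

```python
def poll_and_forward(input_samples, no_input=0):
    """Model of the polling loop: each sample is one read of the input register.

    Returns the values written to the output register, in order.
    """
    output_writes = []
    r0 = no_input
    for sample in input_samples:      # one iteration per check of the input register
        if sample == no_input:
            continue                  # nothing found: keep checking
        r0 = sample                   # take the input from the input register into R0
        output_writes.append(r0)      # send the value to the output register
        r0 = no_input                 # reset R0 before polling again
    return output_writes


print(poll_and_forward([0, 0, 5, 0, 7]))
```

In this model the two non-zero samples are forwarded to the output register and the zero samples only cause another pass through the loop.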

V. SIMULATION RESULTS & COMPARISON

We simulated our design using the GTKWave software to verify it.

A. ALU Simulation

We implemented 9 operations for our CPU. The ALUControl pin selects which operation is performed. A testbench simulation of a multiplication operation on the ALU circuit is shown in Figure 8.

Fig. 8. ALU Simulation

The simulation result shows that when the inputs are 4 and 7 and the ALUControl pin is 0110, which indicates the multiplication operation (see Table I), the result is 28. As the output is not 0, the zero pin is inactive (logical 0).
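The ALU behaviour checked in this simulation can be reproduced with a short software model of Table I. The Python sketch below is an illustration rather than the authors' Verilog: the names ALU_OPS and alu are hypothetical, and it assumes unsigned 32-bit operands, a logical right shift, and a shift amount taken from the lower 5 bits of b, none of which the paper states.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to the 32-bit data width

# AluControl codes copied from Table I.
ALU_OPS = {
    0b0000: lambda a, b: a & b,
    0b0001: lambda a, b: a | b,
    0b0010: lambda a, b: a + b,
    0b0100: lambda a, b: a - b,
    0b1000: lambda a, b: 1 if a < b else 0,   # unsigned comparison (assumed)
    0b0011: lambda a, b: a << (b & 0x1F),     # shift amount width is assumed
    0b0101: lambda a, b: a >> (b & 0x1F),     # logical shift (assumed)
    0b0110: lambda a, b: a * b,
    0b0111: lambda a, b: a ^ b,
}


def alu(alu_control, a, b):
    """Return (result, zero) for unsigned 32-bit operands a and b."""
    result = ALU_OPS[alu_control](a & MASK32, b & MASK32) & MASK32
    zero = 1 if result == 0 else 0
    return result, zero


print(alu(0b0110, 4, 7))
```

The dictionary plays the role of the multiplexer: every operation is available, and the control code selects which result is returned. For inputs 4 and 7 with control code 0110 the model returns a result of 28 with the zero flag at 0, matching the simulation described in the text.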
B. Register-File Simulation

The register file circuit depends on CLK. The read operation does not depend on the CLK state, but when writing data to the registers, CLK must be 1. The simulation is shown in Figure 9.

Fig. 9. Register File Simulation

The simulation shows that register r0 is being read, as indicated by reg_read_addr_1, and that its stored data is 1, as indicated by reg_read_data_1. Writing to the register set is enabled because reg_write_en is 1. As shown in the figure, the value 4, indicated by reg_write_data, is being written to register r1, as indicated by reg_write_dest.

C. CPU Simulation

We take an addition operation for this register-type simulation. The assembly code for this instruction is "add r2, r0, r1". The simulation is shown in Figure 10.

Fig. 10. CPU Simulation (Register-type Instruction)

Here, the opcode, func7 and func3 values show that this instruction is a register-type add operation. The values from the registers indicated by reg_read_addr_1 (r0) and reg_read_addr_2 (r1) are added and stored in the register indicated by reg_write_dest (r2). reg_read_addr_1 indicates r0, which holds the value 1, and reg_read_addr_2 indicates r1, which also holds the value 1. These two values are added to produce the result 2, which is stored in r2.

D. Interfacing Simulation

The waveform of a simulation of the interfacing between the CPU and the input and output registers is shown in Figure 11.

Fig. 11. Interfacing Simulation

The first instruction is a branching instruction which sets the program counter to jump into the main function. A loop continuously checks whether any input is found in the input register, as indicated by reg_read_addr_1. As soon as the input is found, the program jumps to the next section and accepts the input, which is 5, into register R0, as indicated by reg_write_dest. It then sends that value to the output register.
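The register-type result of Section V-C can be cross-checked in software by decoding the instruction word and applying it to a register array. The Python sketch below is an illustration only: decode_r_type is a hypothetical helper, the word 0x00100133 is the standard RISC-V encoding of "add r2, r0, r1" and is assumed to match what the paper's assembler emits, and r0 is treated as an ordinary writable register because the simulation shows it holding the value 1.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "func3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "func7": (word >> 25) & 0x7F,
    }


# Standard RISC-V encoding of "add r2, r0, r1" (assumed to match the paper's assembler).
ADD_R2_R0_R1 = 0x00100133

regs = [0] * 32
regs[0], regs[1] = 1, 1   # initial values taken from the simulation description

fields = decode_r_type(ADD_R2_R0_R1)
is_add = (fields["opcode"] == 0b0110011
          and fields["func3"] == 0 and fields["func7"] == 0)
if is_add:
    # Read both source registers, add, and write back to the destination.
    regs[fields["rd"]] = (regs[fields["rs1"]] + regs[fields["rs2"]]) & 0xFFFFFFFF

print(fields["rs1"], fields["rs2"], fields["rd"], regs[2])
```

The decoded source registers are 0 and 1, the destination is 2, and the stored result is 2, which agrees with the values reported for reg_read_addr_1, reg_read_addr_2 and reg_write_dest.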

E. Comparison with Other Papers

The main goal of our paper was to develop an educational CPU based on the RISC-V architecture which also supports an assembler and interfacing for embedded applications. Table III shows a simple comparison between the mentioned papers and our paper.

TABLE III
COMPARISON WITH OTHER PAPERS

Ref. Paper            CPU Assembler   RISC-V Based   Interfacing
[1], [2], [4], [5]    No              No             No
[3], [6]              Yes             No             No
[7]                   Yes             Yes            No
Our CPU               Yes             Yes            Yes

VI. CONCLUSION & FUTURE WORKS

We implemented a 32-bit CPU based on the RISC-V architecture, together with an assembler. Learners can gain an understanding of hardware implementation using a Hardware Description Language (HDL), the RISC-V architecture, assembler design and assembly language programming. We also implemented interfacing of the CPU, which will help learners understand how an embedded system works.

In future, a compiler can be added as an extension to this work, which will let users operate the CPU using high-level languages. Moreover, an operating system can be introduced for a better learning experience of the computer system.

REFERENCES
[1] J. H. Lee, S. E. Lee, H. C. Yu and T. Suh, "Pipelined CPU Design With FPGA in Teaching Computer Architecture," in IEEE Transactions on Education, vol. 55, no. 3, pp. 341-348, Aug. 2012.
[2] Y. Li and W. Chu, "Aizup-a pipelined processor design and implementation on Xilinx FPGA chip," in FPGAs for Custom Computing Machines, 1996. Proceedings. IEEE Symposium on. IEEE, 1996, pp. 98-106.
[3] A. Yıldız, H. F. Ugurdag, B. Aktemur, D. İskender and S. Gören, "CPU design simplified," 2018 3rd International Conference on Computer Science and Engineering (UBMK), 2018, pp. 630-632.
[4] G. Qin, Y. Hu, L. Huang and Y. Guo, "Design and Performance Analysis on Static and Dynamic Pipelined CPU in Course Experiment of Computer Architecture," 2018 13th International Conference on Computer Science & Education (ICCSE), 2018, pp. 1-6.
[5] D. Šulík, M. Vasilko, and P. Fuchs, "Design of a RISC microcontroller core in 48 hours," Journal of Electrical Engineering, vol. 52, no. 5-6, pp. 171-176, 2001.
[6] J. D. Carpinelli, "The very simple CPU simulator," in Frontiers in Education, 2002. FIE 2002. 32nd Annual, vol. 1. IEEE, 2002, pp. T2F-T2F.
[7] T. McGrew, E. Schonauer and P. Jamieson, "Framework and Tools for Undergraduates Designing RISC-V Processors on an FPGA in Computer Architecture Education," 2019 International Conference on Computational Science and Computational Intelligence (CSCI), 2019, pp. 778-781.
[8] "Temperature Sensor Stock Illustrations - 5,951 Temperature Sensor Stock Illustrations, Vectors & Clipart - Dreamstime," Dreamstime. https://www.dreamstime.com/illustration/temperature-sensor.html (accessed Oct. 11, 2022).
[9] Wikipedia Contributors, "Xilinx ISE," Wikipedia, Sep. 30, 2019. https://en.wikipedia.org/wiki/Xilinx_ISE
[10] "Spartan-6 FPGA family," Xilinx. [Online]. Available: https://www.xilinx.com/products/silicon-devices/fpga/spartan-html. [Accessed: 08-Oct-2022].
[11] "About RISC-V - RISC-V international," RISC, 23-Aug-2021. [Online]. Available: https://riscv.org/about/. [Accessed: 08-Oct-2022].
[12] "Chapter 8: ASSEMBLERS," www.jklp.org. http://www.jklp.org/profession/books/mix/c08.html

Authorized licensed use limited to: AMRITA VISHWA VIDYAPEETHAM AMRITA SCHOOL OF ENGINEERING. Downloaded on January 24, 2024 at 05:23:15 UTC from IEEE Xplore. Restrictions apply.
