
9/17/2019

OpenFPGA: An Opensource Framework Enabling
Rapid Prototyping of Customizable FPGAs
Xifan Tang, Edouard Giacomin, Aurélien Alacchi,
Baudouin Chauvière and Pierre-Emmanuel Gaillardon
Laboratory of NanoIntegrated Systems (LNIS) – University of Utah, U.S.A
09/11/2019 – Barcelona, Spain

OpenFPGA: Context

• FPGAs' ever-increasing role in modern computing systems

• Prototyping FPGAs is traditionally a cumbersome process
– Considerable manual layout (Groups of hardware engineers)
– Ad-hoc design tool development (Groups of software engineers)
– Year-long development cycles
Full-Custom Flow

• OpenFPGA: The First open-source FPGA IP generator
– Enable a rapid prototyping flow for FPGA IPs (the fabric)
– Customizable FPGA architecture and instant bitstream support
Automated Semi-Custom approach

• Benefit to industry (eFPGA or specific application field)

• Benefit to academia (2x smaller and 3x faster than state-of-the-art)

University of Utah | Aurélien Alacchi | 1 University of Utah | Aurélien Alacchi | 2
© INS-UoU 2018 All rights reserved © INS-UoU 2018 All rights reserved

OpenFPGA: How is it different?

• VTR (Verilog To Routing):
– Architecture exploration tool

• SymbiFlow:
– Open source flow to generate bitstream from Verilog
– Target commercial FPGA

• OpenFPGA:
– Customizable architecture
– Bitstream related to the specified architecture

Each tool has its specific application

OpenFPGA: Flow

Automate FPGA development using a semi-custom design approach
[Figure: OpenFPGA flow diagram. Inputs: Verilog Benchmark; Extended XML-based FPGA Architecture Description; XML-based Standard/Customized Cell Library. Tools: Synthesis (Yosys); FPGA-X2P (VPR, FPGA-Bitstream, FPGA-Verilog). Verification Flow: Bitstream & Testbench, HDL Simulator, Formal Tool, Functionality Verification, Formal Verification. Prototyping Flow: Fabric Verilog, Back-end tool, GDSII Layout of FPGA Fabric. Sign-off Flow: SDC, STA, Performance Sign-off.]

XML Hardware Description Extension

Compared hierarchy
Low level customization
Basic element composition
Multimode support

Enriched Circuit Level Description

• VPR                   • OpenFPGA
<architecture>          <architecture>
<models>                <models>
<layout>                <layout>
<device>                <openfpga_settings>
<switchlist>            <parameters>
<segmentlist>           <module_circuit_models>
<complexblocklist>      <device>
<pb_type>               <cblocks>
<mode>                  <switchlist>
<interconnect>          <segmentlist>
<power>                 <complexblocklist>
<clocks>                <pb_type>
                        <mode>
                        <interconnect>
                        <power>
                        <clocks>
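The difference between the two hierarchies can be checked mechanically. The following is a minimal Python sketch that uses only the element names listed on the slide; the constant and function names are illustrative and do not belong to VPR or OpenFPGA.

```python
# Element names as listed on the slide, in the order they appear.
VPR_TAGS = [
    "architecture", "models", "layout", "device", "switchlist",
    "segmentlist", "complexblocklist", "pb_type", "mode",
    "interconnect", "power", "clocks",
]
OPENFPGA_TAGS = [
    "architecture", "models", "layout", "openfpga_settings", "parameters",
    "module_circuit_models", "device", "cblocks", "switchlist",
    "segmentlist", "complexblocklist", "pb_type", "mode",
    "interconnect", "power", "clocks",
]


def added_tags(base, extended):
    """Return the tags found in `extended` but not in `base`, in order."""
    known = set(base)
    return [tag for tag in extended if tag not in known]


if __name__ == "__main__":
    # Tags that the OpenFPGA description adds on top of the VPR one.
    print(added_tags(VPR_TAGS, OPENFPGA_TAGS))
    # Tags that VPR has and OpenFPGA lacks (none are listed on the slide).
    print(added_tags(OPENFPGA_TAGS, VPR_TAGS))
```

On the slide's lists, the extension consists of `openfpga_settings`, `parameters`, `module_circuit_models` and `cblocks`, and every VPR element is still present.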

Circuit Level Customization

XML definition for a MUX to be auto-generated in Verilog
<circuit_model type="mux" name="mux_1lvl">
  <design_technology type="cmos" structure="one-level"/>
  <input_buffer exist="on" circuit_model_name="inv1x"/>
  <output_buffer exist="off"/>
</circuit_model>

XML definition for a FF to be mapped to standard or customized cell
<circuit_model type="ff" name="dff" verilog_netlist="ff.v">
  <port name="D" size="1"/>
  <port name="CLK" size="1"/>
  <port name="Q" size="1"/>
  …
</circuit_model>

Composition of Basic Elements

• Multiplexers can have one-level, multi-level or tree-like structure
[Figure: multiplexer schematic built from the cells inv1x (input buffer), tgate (pass-gate logic) and tapdrive4 (output buffer)]
<circuit_model type="mux" name="mux_1level" prefix="mux_1level">
  <design_technology type="cmos" structure="one-level"/>
  <input_buffer exist="on" circuit_model_name="inv1x"/>
  <output_buffer exist="on" circuit_model_name="tapdrive4"/>
  <pass_gate_logic circuit_model_name="tgate"/>
  <port type="input" prefix="in" size="4"/>
  <port type="output" prefix="out" size="1"/>
  <port type="sram" prefix="sram" size="4"/>
</circuit_model>
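To make the circuit-model description concrete, here is a minimal Python sketch that parses a MUX definition like the one on the slide and relates its structure to the number of configuration (SRAM) bits. Several points are assumptions rather than OpenFPGA behavior: the model name uses underscores and the port elements are written as `<port type=...>` so that the text is well-formed XML, the structure name `"tree-like"` is taken from the slide's wording and may not be the actual attribute value, and the bit-count formulas are the usual textbook estimates, not OpenFPGA's generator.

```python
import math
import xml.etree.ElementTree as ET

# MUX definition adapted from the slide (see the assumptions stated above).
MUX_XML = """
<circuit_model type="mux" name="mux_1level" prefix="mux_1level">
  <design_technology type="cmos" structure="one-level"/>
  <input_buffer exist="on" circuit_model_name="inv1x"/>
  <output_buffer exist="on" circuit_model_name="tapdrive4"/>
  <pass_gate_logic circuit_model_name="tgate"/>
  <port type="input" prefix="in" size="4"/>
  <port type="output" prefix="out" size="1"/>
  <port type="sram" prefix="sram" size="4"/>
</circuit_model>
"""


def sram_bits(structure, num_inputs):
    """Textbook estimate of the configuration bits of an N-input MUX."""
    if structure == "one-level":
        # One pass gate per input, selected with a one-hot code.
        return num_inputs
    if structure == "tree-like":
        # One shared select bit per level of 2:1 multiplexers.
        return math.ceil(math.log2(num_inputs))
    raise ValueError(f"structure not covered by this sketch: {structure}")


def summarize(xml_text):
    """Extract structure, port sizes and referenced cells from one model."""
    model = ET.fromstring(xml_text)
    ports = {p.get("type"): int(p.get("size")) for p in model.findall("port")}
    cells = sorted(
        child.get("circuit_model_name")
        for child in model
        if child.get("circuit_model_name") is not None
    )
    return {
        "name": model.get("name"),
        "structure": model.find("design_technology").get("structure"),
        "ports": ports,
        "cells": cells,
    }


if __name__ == "__main__":
    info = summarize(MUX_XML)
    print(info)
    print(sram_bits(info["structure"], info["ports"]["input"]))
```

For the slide's 4-input one-level multiplexer the estimate gives 4 bits, which agrees with the declared `sram` port size of 4; a tree-like multiplexer with the same number of inputs would need 2 bits under the same estimate.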

Multimode support overview

Example of a 6-mode BLE with adders and fracturable LUTs
[Figure: (a) BLE physical implementation, built from 4-LUTs, 2-LUTs, MUXes, adders (+) and FFs; inputs in0 to in7, Cin, REGin and CLK; outputs out0, out1, Cout and REGout]

• Separated XML description for different modes of a CLB

• Verilog and Bitstream will be automatically generated based on the
physical implementation

Multimode definition in XML

• Physical mode
<mode name="fle_phy" disabled_in_packing="true">
  <pb_type name="frac_logic" num_pb="1">
    ports definition
    <pb_type name="frac_lut_6" blif_model=".frac_lut6" mode_bits="11" num_pb="1"
             circuit_model_name="my_lut6">
      ports definition

• Operating mode
<mode name="n1_lut6">
  <pb_type name="ble6" num_pb="1">
    ports definition
    <pb_type name="lut6" blif_model=".names" num_pb="1" class="lut" mode_bits="00"
             physical_pb_type_name="frac_lut_6" >
      ports definition
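The link between an operating mode and the physical mode is carried by the `physical_pb_type_name` and `mode_bits` attributes. The minimal Python sketch below shows how that mapping could be read back from the XML. The wrapping `<pb_type name="fle">` element and all closing tags are assumptions added only to make the slide excerpt well-formed, the port definitions are omitted, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Slide excerpt, closed and wrapped so that it parses (see assumptions above).
FLE_XML = """
<pb_type name="fle">
  <mode name="fle_phy" disabled_in_packing="true">
    <pb_type name="frac_logic" num_pb="1">
      <pb_type name="frac_lut_6" blif_model=".frac_lut6" mode_bits="11"
               num_pb="1" circuit_model_name="my_lut6"/>
    </pb_type>
  </mode>
  <mode name="n1_lut6">
    <pb_type name="ble6" num_pb="1">
      <pb_type name="lut6" blif_model=".names" num_pb="1" class="lut"
               mode_bits="00" physical_pb_type_name="frac_lut_6"/>
    </pb_type>
  </mode>
</pb_type>
"""


def physical_mapping(xml_text):
    """Map each operating pb_type to (physical pb_type name, mode bits)."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for pb in root.iter("pb_type"):
        target = pb.get("physical_pb_type_name")
        if target is not None:
            mapping[pb.get("name")] = (target, pb.get("mode_bits"))
    return mapping


def packable_modes(xml_text):
    """List the modes that are not marked disabled_in_packing."""
    root = ET.fromstring(xml_text)
    return [
        mode.get("name")
        for mode in root.iter("mode")
        if mode.get("disabled_in_packing") != "true"
    ]


if __name__ == "__main__":
    print(physical_mapping(FLE_XML))  # {'lut6': ('frac_lut_6', '00')}
    print(packable_modes(FLE_XML))    # ['n1_lut6']
```

In this reading, only the operating mode is visible to packing, while the physical mode describes the hardware that every operating mode is mapped onto.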

Backend
Experimental methodology

Backend flow & Sign-off
Verification
Cell Optimization

Experimental Methodology

• Technology: Commercial 40 nm
• Homogeneous FPGA Architecture:
– Fs = 3
– Fc,in = 0.055, Fc,out = 0.1
– 1 Tile = 1 CLB + 1 SB + 2 CB
– 1 CLB = 10 FLE
– 1 FLE = fract (LUT6 + LUT4) + 2 × 1-bit adder + 2 × FF
– Channel width = 300: 83% length-4 + 13% length-16
• Chip utilization rate = 80%
• Baseline:
– Current state of the art [1]
– StratixIV + QuartusII
[1] B. Grady et al., Synthesizable Heterogeneous FPGA Fabrics, IEEE International Conference on FPT, 2018, pp. 1-8.
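The tile composition above translates directly into resource counts. Here is a minimal Python sketch of that arithmetic; it assumes a 20×20 array (the size quoted in the summary slide) in which every grid location is a CLB tile, which ignores any I/O or border tiles the real fabric may have. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fabric:
    """Homogeneous fabric: 1 tile = 1 CLB + 1 SB + 2 CB, 1 CLB = 10 FLE."""
    width: int = 20           # assumed array width, in tiles
    height: int = 20          # assumed array height, in tiles
    fle_per_clb: int = 10
    ff_per_fle: int = 2
    adders_per_fle: int = 2   # 1-bit adders
    sb_per_tile: int = 1
    cb_per_tile: int = 2

    @property
    def tiles(self):
        return self.width * self.height

    @property
    def fles(self):
        return self.tiles * self.fle_per_clb

    def totals(self):
        """Return the resource counts for the whole array."""
        return {
            "tiles": self.tiles,
            "CLB": self.tiles,
            "SB": self.tiles * self.sb_per_tile,
            "CB": self.tiles * self.cb_per_tile,
            "FLE": self.fles,
            "FF": self.fles * self.ff_per_fle,
            "1-bit adder": self.fles * self.adders_per_fle,
        }


if __name__ == "__main__":
    for name, count in Fabric().totals().items():
        print(f"{name}: {count}")
```

Under these assumptions the array holds 400 tiles, 4,000 FLEs, 8,000 flip-flops and 8,000 1-bit adders.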

Backend Flow & Performance Sign-off

OpenFPGA: Verification Methods

• Functional Verification
  • Programmed FPGA
  • Pristine FPGA
• Formal Verification
  • 100% fault coverage on implementation
[Figure: verification flow diagram. Labels: Verilog benchmark, XML architecture, Yosys, FPGA_x2p, FPGA Fabric Verilog netlist, Bitstream, Programmed FPGA Verilog netlist, HDL Simulator, Formal Simulator]

Cell Optimization

• Cell library has a strong impact on the FPGA metrics: 90% of the FPGA area is made of CCFF and multiplexers
• Tgate-based MUX2 is 2.5x smaller than the MUX2 from the standard-cell (SC) library
• CCFF is 1.8x smaller than the one from the SC library
[Figure: MUX2 and Configuration-Chain FF (CCFF) cells]

Few custom cells in a semi-custom flow can make the difference

Analysis

Area
Timing

Area Analysis

• Post-layout evaluation using a commercial 40nm node and a Stratix IV-like architecture
• Flow runs in 24h
• Using an optimized cell library (multiplexers, CCFF):
– 1.8x area reduction
[Figure: bar chart of area, 20x20 FPGA, vertical axis 0% to 100%. Bars: Previous work [1] (Std. Cells), 30625 µm2; OpenFPGA + 2:1 MUX, 27160 µm2 (-11%); OpenFPGA (1/2-lvl MUXes) + Opt. Cells, 17648 µm2 (-42%; +60% relative to Stratix IV); Stratix IV, 11050 µm2]
[1] B. Grady et al., Synthesizable Heterogeneous FPGA Fabrics, IEEE International Conference on FPT, 2018, pp. 1-8.

Timing Analysis

• Post-layout evaluation using a commercial 40nm node and a Stratix IV-like architecture
• Flow runs in 24h
• Using an optimized cell library (multiplexers, CCFF):
– 3x delay reduction

Delay (ns), 20x20 FPGA; percentages are relative to Stratix IV

Path Type       Previous work [1]   OpenFPGA (TT)   OpenFPGA (SS)   Stratix IV
5-LUT           0.46 (+70%)         0.14 (-48%)     0.26 (-3%)      0.27 (100%)
6-LUT           0.5 (+78%)          0.15 (-46%)     0.27 (-3%)      0.28 (100%)
1-bit Adder     0.7 (-9%)           0.54 (-30%)     1.00 (+30%)     0.77 (100%)
20-bit Adder    1.63 (+32%)         1.10 (-6%)      2.12 (+72%)     1.23 (100%)
Local Routing   0.27 (+58%)         0.12 (-30%)     0.23 (+35%)     0.17 (100%)
L-4 track       2.53 (+328%)        0.40 (-32%)     0.75 (+27%)     0.59 (100%)
L-16 track      4.02 (+294%)        0.78 (-23%)     1.55 (+52%)     1.02 (100%)

[1] B. Grady et al., Synthesizable Heterogeneous FPGA Fabrics, IEEE International Conference on FPT, 2018, pp. 1-8.

Where to progress?

• MCNC big20 suite benchmarks
– Average of 35% critical path reduction compared to previous work
– Average of 45% (30 + 15) critical path overhead compared to Stratix-IV
– Gap between academic and commercial EDA tools
[Figure: Critical paths comparison, vertical axis 0% to 100%. Benchmarks: ex5p, alu4, frisc, apex4, elliptic, pdc, ex1010, misex3, seq, spla. Series: Previous work [1]; OpenFPGA + 1/2-lvl MUXes + Opt. Cells; Stratix-IV]

[1] B. Grady et al., Synthesizable Heterogeneous FPGA Fabrics, IEEE International Conference on FPT, 2018, pp. 1-8.

Conclusion

OpenFPGA: Summary

• Fully functional XML to Prototype flow (FPGA-X2P) supporting homogeneous multi-mode FPGA fabrics
– XML-to-Verilog generator
– Verilog-to-Bitstream generator
– Verilog testbenches for functionality/formal validation

• Automatic backend flow for homogeneous FPGA
– 20×20 FPGA layout using a commercial 40nm node in 24h
– 2× area and 3× performance improvement over prior art

• OpenFPGA alpha release with tutorials
– Github repository: https://github.com/LNIS-Projects/OpenFPGA
– Online documentation: https://openfpga.readthedocs.io/en/master/

Acknowledgment

This material is based on research sponsored by Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA) under agreement number FA8650-18-2-7855. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agency (DARPA) or the U.S. Government.

Pierre-Emmanuel Gaillardon and Xifan Tang have financial interests in the company ReRouting LLC, which manufactures RRAM-based systems and provides engineering service.

Thank you for your attention


Questions?

Laboratory for NanoIntegrated Systems


Department of Electrical and Computer Engineering
MEB building – University of Utah – Salt Lake City – UT – USA