Circuit Net

SCIENCE CHINA
Information Sciences
. NEWS & VIEWS .
CircuitNet: An Open-Source Dataset for Machine

Learning Applications in Electronic Design
Automation (EDA)
arXiv:2208.01040v4 [cs.LG] 1 Sep 2022
Zhuomin Chai1, 2 , Yuxiang Zhao1 , Yibo Lin1* , Wei Liu2 , Runsheng Wang1 & Ru Huang1
1
School of Integrated Circuits, Peking Univeristy, Beijing 100871, China;
2
School of Physics and Technology, Wuhan 430072, China
Citation Zhuomin Chai, Yuxiang Zhao, Yibo Lin, et al. CircuitNet: An Open-Source Dataset for Machine
Learning Applications in Electronic Design Automation (EDA). Sci China Inf Sci
The electronic design automation (EDA) commu- Despite the active research on ML for CAD,
nity has been actively exploring machine learn- there remain some challenges in this field. There
ing (ML) for very large-scale integrated computer- is almost no public dataset dedicated to ML for
aided design (VLSI CAD). Many studies explored CAD applications because of license restrictions
learning-based techniques for cross-stage predic- and domain-specific expertise for data generation.
tion tasks in the design flow to achieve faster de- Meanwhile, the existing datasets obtained from
sign convergence. Although building ML mod- CAD contests are often incomplete and not de-
els usually requires a large amount of data, most signed for ML applications [2]. The lack of pub-
studies can only generate small internal datasets lic datasets raises challenges such as difficulty
for validation because of the lack of large public in benchmarking and reproducing previous work,
datasets. In this essay, we present the first open- limited research scope from limited data access,
source dataset called CircuitNet for ML tasks in and a high bar for new researchers, which slows
VLSI CAD. down further advancements in this field. To this
Introduction. VLSI circuit design can be divided end, we present the first open-source dataset,
into front-end design and back-end design. The CircuitNet, which provides holistic support for
front-end design implements the functionality of cross-stage prediction tasks in back-end design
the circuit, and then, the back-end design trans- with diverse samples.
forms the circuit into manufacturable geometries, Dataset overview. The statistics of the dataset
i.e., layouts. In advanced technology nodes, the are summarized in Figure 1. We followed two steps
back-end design is time-consuming because of iter- to generate the dataset: data collection and fea-
ative information feed-forward and feed-backward ture extraction.
between the design stages during optimization. To Data collection consisted of two stages: logic
accelerate this process, cross-stage prediction was synthesis and physical design. In logic synthesis,
introduced to replace the original long feedback the RISC-V designs were mapped from register
loops between design stages with local loops within transfer level (RTL) designs to gate-level netlists
the design stages. As a promising method for fast in the 28 nm technology node with Synopsys De-
and accurate cross-stage prediction, ML has been sign Compiler. Then, the physical design trans-
explored for various early-stage prediction tasks formed the netlists into layouts with Cadence In-
in the design flow, including routability and IR novus. We improved the diversity of the dataset
drop [1]. by introducing different settings in logic synthe-
* Corresponding author (email: yibolin@pku.edu.cn)
Zhuomin Chai, et al. Sci China Inf Sci 2
Extraction Stage Features Prediction Tasks

Gate-level
Synthesis
Netlist
Floorplan Macro Region Congestion

Netlist Statistics Synthesis Variations Prediction
Design Cell Density
Cell Area Frequency RUDY&
#Cells #Nets #Macros Congestion
(µm2 ) (MHz) Placement RUDY
RISCY-a 44836 80287 65739 Routability
Pin Configuration Features
RISCY-FPU-a 61677 106429 75985 3/4/5
zero-riscy-a 35017 67472 58631 Global All
Congestion
50/200/500 Routing
RISCY-b 30207 58452 69779
Detailed DRC Violations
RISCY-FPU-b 47130 84676 80030 13/14/15 DRC Violations
Routing Prediction
zero-riscy-b 20350 45599 62648
Physical Design Variations Instance Power
Utilizations #Macro #Power Mesh IR Drop
Filler Insertion Clock Tree Signal Arrival Features
(%) Placement Setting Synthesis Timing Window
After Placement IR Drop
70/75/80/85/90 3 8 IR Drop Prediction
/After Routing
(a) (b)
Figure 1 (a) Statistics of designs and variations introduced during data collection. (b) Available features, their extraction
stages, and prediction tasks in the experiments.
sis and physical design, as shown in Figure 1(a). pixel-level accuracy. The corresponding results for
These settings contributed to variations in utiliza- an FCN based method [3] were 0.040 and 0.80, re-
tion, routing resources, macro locations, etc., respectively. Second, for DRC violation prediction,
flecting diverse situations in the back-end design we considered the area under the curve (AUC) of
flow. Each design has 2160 settings, and all the the receiver operating characteristic (ROC) curve
designs have 12,960 runs of the back-end design and that of the precision-recall (PR) curve as the
flow. Eventually, we obtained 10,242 layouts after metrics for imbalanced learning. The correspond-
excluding the failed runs. ing results for an FCN based method [4] were 0.95
In feature extraction, features were extracted at and 0.63, respectively. Finally, for IR drop pre-
various design stages to support different cross- diction, we evaluated the AUC of the ROC curve
stage prediction tasks, as shown in Figure 1(b). and that of the PR curve. The corresponding re-
We included both graph-like features (i.e., gate- sults for a U-Net based method [5] were 0.94 and
level netlists) and image-like features (i.e., two- 0.83. Overall, our results are relatively consistent
dimensional feature maps extracted from the phys- with the original publications and demonstrate the
ical layouts, as the design information can be nat- effectiveness of CircuitNet.
urally represented by image-like data by dividing a Usage. We separated the features shown in Fig-
layout into tiles and regarding each tile as a pixel). ure 1(b) and stored them in different directories to
These features are widely adopted in the state-of- enable custom applications. We provided scripts
the-art (SOTA) routability and IR drop prediction for preprocessing and combining different features
models [3–5]. for training and testing used in the above experi-
Dataset evaluation. To evaluate the effective- ments as references.
ness of CircuitNet, we further conducted experi- Access methods. The user guide and the down-
ments on three prediction tasks: congestion, de- load link for CircuitNet can be accessed from
sign rule check (DRC) violation, and IR drop. https://circuitnet.github.io.
Each experiment adopted a method from recent
References
studies [3–5] and evaluated its result on Circuit-
1 G. Huang, et al. “Machine Learning for Electronic
Net with the same evaluation metrics as in the Design Automation: A Survey,” ACM TODAES, vol.
original studies. These methods utilized image-like 26, no. 5, 1–46, 2021.
features to train a generative model, such as fully 2 I. S. Bustany, et al. “ISPD 2015 Benchmarks with
convolutional networks (FCNs) and U-Net, that Fence Regions and Routing Blockages for Detailed-
Routing-Driven Placement,” in ISPD 2015. 157–164.
formulating the prediction task into an image-to- 3 S. Liu, et al. “Global Placement with Deep Learning-
image translation task. A detailed manual about Enabled Explicit Routability Optimization,” in DATE
the setup of these experiments is available on our 2021. 1821–1824.
webpage. Herein, we briefly introduce our results. 4 Z. Xie, et al. “RouteNet: Routability prediction
for mixed-size designs using convolutional neural net-
First, for congestion prediction, we used the work,” in ICCAD 2018. 1–8.
normalized root-mean-square-error and structural 5 V. A. Chhabria, et al. “MAVIREC: ML-Aided
similarity index measure as metrics to evaluate Vectored IR-Drop Estimation and Classification,” in
DATE 2021. 1825–1828.

Circuit Net

Uploaded by

Copyright:

Available Formats

You might also like

Circuit Net

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Circuit Net

Uploaded by

Copyright:

Available Formats

SCIENCE CHINA

CircuitNet: An Open-Source Dataset for Machine

Extraction Stage Features Prediction Tasks

Floorplan Macro Region Congestion

You might also like