Abstract—Relentless semiconductor scaling and ever increasing device integration have resulted in the exponentially growing size of the back-end design, which makes back-end simulation very time- and resource-consuming. With the success in the computer vision community, deep learning seems a promising alternative to assist the back-end simulation. However, unlike computer vision tasks, most back-end simulation problems are mathematically and physically well-defined, e.g., power delivery network sign off and post-layout circuit simulation. It then brings broad interest in the community on where and how to deploy deep learning in the back-end simulation flows. This paper discusses a few challenges that the deployment of deep learning models in back-end simulation has to confront and the corresponding opportunities for future research.

I. INTRODUCTION

Relentless semiconductor scaling and ever increasing SoC integration have imposed various challenges on the back-end design, from problem size to complexity [1]. In order to understand how the design behaves when implemented in physical layout, designers have to perform a back-end simulation from the extracted view. With the smaller geometry and increased number of transistors, back-end simulation has observed fast-growing circuit size and an exponential increase in interconnect parasitics, which makes back-end simulation extremely time- and resource-consuming [2].

As an essential step to validate system performance or ensure chip sign off, back-end simulations can be roughly categorized into linear and nonlinear simulations. For example, power delivery network (PDN) sign off is, in the end, to solve a huge sparse linear system, i.e., Ax = b, where A is the system matrix with a significantly large portion of zeros, b is the excitation vector, and x is the node voltage vector to be solved [3]. In modern VLSI designs, such a linear system can easily have billions of nodes with various sparsity patterns [4], incurring significant computational and memory overheads. Moreover, due to the varying test vectors, i.e., excitations, the power delivery linear system needs to be repeatedly simulated tens to even hundreds of times to ensure power integrity, which makes power delivery sign off very time consuming [4], [5]. Post-layout circuit simulation is a common nonlinear simulation example for mixed-signal and analog circuit verification [6]. Due to the growing parasitics impact, it has become a huge risk to simply simulate the schematic design without the layout parasitics. However, as the SPICE simulation needs to go down to the transistor level to replicate the exact model of circuit behavior, the nonlinear circuit simulation, along with the large parasitics network, may take days to complete and has become a bottleneck to the verification [7]. To address the inefficiency issues in back-end simulators, researchers have proposed many physical-law or heuristics based strategies, from modeling, ordering, and factorization to the solver itself, so as to handle the huge size of back-end problems [7]–[9]. However, as the device non-linearity continues to grow and the number of parasitics exponentially increases, the gains from many prior heuristics quickly fall behind the demands of the back-end simulations [10].

On the other hand, the success of artificial intelligence (AI), especially deep learning, in computer vision tasks has brought broad interest in applying AI to back-end problems. A recent example is Google's Placer [11] using deep reinforcement learning, followed by heated discussions in both academia and industry. It should be noted that, unlike many computer vision problems, most back-end simulation problems already have theoretically sound mathematical and physical law-based formulations [6]. This then brings a natural question on the scope and effectiveness that deep learning techniques may bring to the back-end simulations. Unlike many deep learning based works on back-end physical design, e.g., placement and routing, which are found to be a promising alternative, there are actually limited deep learning based back-end simulation works that can achieve both efficiency and generality [12]–[17]. For example, a few prior works treated the power delivery noise map as an image and used a convolutional neural network (CNN) to estimate the IR drop [5], [12], [18]. While the execution of the CNN is faster than the linear system solver, the training cost and generality are concerning and usually limited to a particular scenario. Thus, although deep learning is effective in addressing the large scale and non-linearity in back-end simulations, the existing challenges, from training and model to generality, still prevent its wide deployment in the back-end simulation:

• Training: The deep learning model can be dedicated to a particular design and scenario. The underlying training cost from data collection to setup is often non-negligible. Designers have to evaluate the implicit cost before the deployment.
• Model: The back-end circuit or layout can be huge and redundant. Domain knowledge is highly desired to select
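At its core, the PDN sign-off formulation described above is a repeated sparse linear solve. The sketch below is illustrative only: the grid topology, conductance values, and excitations are made up, and real extracted matrices are orders of magnitude larger. It factorizes A once with SciPy's sparse LU and reuses the factorization across multiple test vectors:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy n x n resistive power grid: a 2-D Laplacian for the mesh conductances
# plus a small conductance from every node to the supply pads, which keeps
# the system matrix A non-singular.
n = 32
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
I = sp.identity(n)
A = (sp.kron(I, T) + sp.kron(T, I) + 0.1 * sp.identity(n * n)).tocsc()

# Factor A once, then reuse the factorization across many test vectors,
# mirroring the tens to hundreds of repeated PDN solves noted above.
solve = spla.factorized(A)
for seed in range(3):                                    # hypothetical excitations
    b = np.random.default_rng(seed).random(n * n) * 1e-3  # current loads per node
    x = solve(b)                                         # node voltages for this vector
    assert np.allclose(A @ x, b)                         # residual check
```

Reusing the factorization is what keeps the repeated solves tractable; re-factoring A for every excitation would dominate the runtime.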
to compute the response to a time-varying input [5], [27]–[29]. Thus, the network models have to rely on a regional local model, dividing the global grids into smaller clips and computing the dynamic noise of a clip within a particular timing window [5], [27]. In addition to the analysis speed-up, there are also a few papers that apply machine learning to help power grid design and optimization [24], [30]–[34]. However, many of the prior works have to incorporate design-dependent features, such as location and timing information of cells, into power maps and hence make the solution dedicated to a particular design.
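The clip-based decomposition just described can be sketched in a few lines; the map size, clip size, and noise traces below are made-up placeholders rather than values from any cited work:

```python
import numpy as np

def to_clips(ir_map, clip):
    """Split an (H, W) noise map into non-overlapping clip x clip tiles."""
    h, w = ir_map.shape
    assert h % clip == 0 and w % clip == 0
    return ir_map.reshape(h // clip, clip, w // clip, clip).swapaxes(1, 2)

# Hypothetical dynamic IR-drop traces: T time steps of a 64x64 global grid
rng = np.random.default_rng(0)
traces = rng.random((100, 64, 64))

# Worst-case noise within one timing window, then the per-clip regional views
t0, t1 = 20, 40                       # the timing window of interest
worst = traces[t0:t1].max(axis=0)     # (64, 64) worst noise in the window
clips = to_clips(worst, 16)           # (4, 4, 16, 16) regional clips

assert np.allclose(clips[0, 1], worst[0:16, 16:32])
```

Each 16x16 clip can then be fed to the regional model independently, which is what bounds the per-inference input size.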
Fig. 2. Transient analysis flow for post-layout transistor-level circuit simulation.

B. Model Architecture

Feature extraction is a crucial part of neural network model architecture. A well-designed model only contains the necessary information, so as to reduce unnecessary computational overhead through feature extraction and selection. Due to the huge dimension of the netlists in back-end simulation, the input to the deep learning model can be huge, e.g., millions to billions of entries, which incurs significant memory and time consumption. It is then essential to control the input dimension and only select the necessary features. Many prior works rely on manually selected features to reduce the input size, which are always design-dependent and demand domain knowledge [13], [14], [17], [26]. Recently, there are also a few works that combine domain knowledge with encoding schemes to more adaptively select the desired features. For example, to reduce the input dimension while maintaining the essential information, PDN sign off can decompose the spatial and temporal information as a pre-processing scheme [5], [12], [16]. The underlying PDN layout can be spatially divided into clips, the power map of which can then be further encoded to reduce the redundant information. Zhou, et al., explore various feature encoding schemes to reduce the dimension of both the input switching and the circuit netlist [16]. As shown in Fig. 6, the input switching can be modelled using 1D or 2D encoding to capture the register switching trace, while the netlist can be modelled using graph based encoding.

Fig. 6. Encoding schemes for input switching and netlist used in Primal [16] (the figure is from [16]).

Fig. 5 presents the trade-off between the inference accuracy on PDN noise prediction and the model complexity through different feature extractions. It is observed that, even with a 36% model complexity reduction, we can still maintain very accurate prediction with only a 4% change. Thus, proper feature extraction is critical to the efficiency of the deep learning model, but always remains a challenging problem for different back-end simulations.

C. Model Generality

The last challenge is the generality of the trained deep learning model. A robust deep learning model can be transferred between different designs or even technologies, which is crucial to the deployment of deep learning in the back-end simulation. However, it is common in prior works that the learned model is dedicated to a particular set of inputs, design and technology. If we categorize the learned features from the training set into specific and general features [49], where the specific features are design-dependent and the general features are universal, it is important to ensure that the learned model incorporates more general features than specific features. Researchers have spent various efforts improving the learning of general features in back-end simulation. For example, GRANNITE [50] adopted a graph neural network (GNN) architecture to learn the general features from both the RTL simulation trace and the input netlist, which is abstracted as a graph to mitigate the design dependency. Chhabria, et al., proposed a UNet-like structure to learn the universal relationship between the selected features and IR drops, instead of directly predicting the IR drop [18].
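As a toy illustration of such graph-based abstraction (not GRANNITE's actual architecture; the netlist, feature values, and aggregation rule here are invented for the example), a netlist can be encoded as an adjacency matrix, and one mean-aggregation message-passing step mixes each gate's feature with those of its fan-in gates:

```python
import numpy as np

# Hypothetical netlist: gate -> fan-out gates (names made up for illustration)
netlist = {
    "and1": ["or1", "xor1"],
    "or1":  ["ff1"],
    "xor1": ["ff1"],
    "ff1":  [],
}
nodes = list(netlist)
idx = {g: i for i, g in enumerate(nodes)}

# Adjacency matrix of the gate graph: A[i, j] = 1 iff gate i drives gate j
A = np.zeros((len(nodes), len(nodes)))
for src, fanout in netlist.items():
    for dst in fanout:
        A[idx[src], idx[dst]] = 1.0

# Per-gate input feature, e.g., a toggle rate from an RTL trace (made up here)
X = np.array([[0.20], [0.15], [0.25], [0.10]])

# One message-passing round: mix each gate's own feature with the mean
# feature of its fan-in gates (in-degree clamped to avoid division by zero)
deg_in = np.maximum(A.sum(axis=0, keepdims=True).T, 1.0)
H = 0.5 * X + 0.5 * (A.T @ X) / deg_in
```

Because the encoding depends only on connectivity, not on placement or timing of a specific design, the learned features have a better chance of transferring across netlists.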
For sparse matrix computation, e.g., sparse LU factorization, sparse matrix-vector multiplication (SpMV), etc., that is used in both PDN sign off and post-layout circuit simulation [9], [38], [51], references [52]–[55] suggested machine learning methods to select the proper storage format and implementation kernels. While the model is designed for the underlying mathematical formulation without design details, it is not a trivial task to collect sufficient data and train such a model applicable to all the circuits.

Since the inherent characteristics of industrial-scale circuits are very complicated, the above techniques are more effective when the design is with only slight changes. We here use the same IR drop prediction task and the network architecture in Fig. 3 to demonstrate the problem of model generality. Table I summarizes the results, where D1 and D2 refer to two different power grid designs supplying power to the same circuit netlist; S1, S2 and S3 refer to three different scenarios or test vectors to the circuit. The same test vector to the same circuit may yield the same excitation. However, due to the different power grid designs, the reported noise violations differ from design to design and scenario to scenario. The original model is trained on D1/S1 with 96.46% accuracy. While the model can maintain good accuracy across different scenarios (S1, S2, S3), the accuracy significantly drops when transferred to the new power grid design (with the same circuit netlist).

TABLE I
COMPARISON ON MODEL GENERALITY FOR DIFFERENT DESIGNS AND SCENARIOS

                      D1                        D2
              S1      S2      S3       S1      S2      S3
Toggle Rate   20%     15%     25%      20%     15%     25%
Accuracy      96.46%  92.74%  93.29%   68.06%  66.85%  64.28%

IV. DISCUSSIONS AND CONCLUSIONS

In this paper, we discussed a few existing challenges of deploying machine learning or deep learning models in back-end simulation. While there is growing interest in deep learning assisted back-end simulation, there are still many practical issues that need to be addressed, as discussed in the last section. Even with all the above issues, the popularity of deep learning still opens up many opportunities in the back-end simulation area. For example, federated learning can be a promising alternative to unify the data from different design houses to resolve the issue of data insufficiency and homogeneity while maintaining local data privacy [56]. While many models demand very particular domain knowledge or heuristics to design the proper model architecture, GNNs can be a natural option to properly model the circuit and input stimulus, and then extract the intrinsic features of complex circuits [57]. For the model generality, transfer learning has been proposed to increase the generality, in which the features in shallow layers can be adapted to other tasks [58].

However, despite all the new techniques, it is noted that the back-end simulation is well formulated in mathematics. Very few machine learning works attempt to theoretically assist the underlying mathematical formulation and matrix solve, whose overhead can be non-trivial in comparison to the other speed-up techniques, such as parallel execution. On the other hand, it seems more promising to apply deep learning to the design or metric modeling so as to reduce the back-end problem complexity, which demands domain knowledge to achieve an efficient model. Moreover, since back-end simulation plays a very critical role in chip sign off, the model fidelity is important to provide designers with high confidence in deploying the technique.

ACKNOWLEDGMENT

This work was supported in part by the Zhejiang Provincial Key R&D Program (Grant No. 2020C01052), NSFC (Grant Nos. 62034007 and 61974133), and the Science Foundation of China University of Petroleum, Beijing (No. 2462020YXZZ024).

REFERENCES

[1] C. K. Sarkar, Technology Computer Aided Design: Simulation for VLSI MOSFET. CRC Press, 2013.
[2] B. Khailany, H. Ren, S. Dai, S. Godil, B. Keller, R. Kirby, A. Klinefelter, R. Venkatesan, Y. Zhang, B. Catanzaro, and W. J. Dally, "Accelerating chip design with machine learning," IEEE Micro, vol. 40, no. 6, pp. 23–32, 2020.
[3] C. Zhuo, J. Hu, M. Zhao, and K. Chen, "Power grid analysis and optimization using algebraic multigrid," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 4, pp. 738–751, 2008.
[4] C. Zhuo, G. Wilke, R. Chakraborty, A. A. Aydiner, S. Chakravarty, and W.-K. Shih, "Silicon-validated power delivery modeling and analysis on a 32-nm DDR I/O interface," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 9, pp. 1760–1771, 2015.
[5] Z. Xie, H. Ren, B. Khailany, Y. Sheng, S. Santosh, J. Hu, and Y. Chen, "PowerNet: Transferable dynamic IR drop estimation via maximum convolutional neural network," in 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 13–18, IEEE, 2020.
[6] I. N. Hajj, "Circuit theory in circuit simulation," IEEE Circuits and Systems Magazine, vol. 16, no. 2, pp. 6–10, 2016.
[7] Q. Chen, "A robust exponential integrator method for generic nonlinear circuit simulation," in Proceedings of the 57th Annual Design Automation Conference, pp. 1–6, IEEE, 2020.
[8] Z. Jin, T. Feng, Y. Duan, X. Wu, M. Cheng, Z. Zhou, and W. Liu, "PALBBD: A parallel arclength method using bordered block diagonal form for DC analysis," in Proceedings of the 2021 Great Lakes Symposium on VLSI, pp. 327–332, 2021.
[9] J. Zhao, Y. Wen, Y. Luo, Z. Jin, W. Liu, and Z. Zhou, "SFLU: Synchronization-free sparse LU factorization for fast circuit simulation on GPUs," in Proceedings of the 58th Annual Design Automation Conference, pp. 37–42, IEEE, 2021.
[10] Q. Zhang, S. Su, J. Liu, and M. S.-W. Chen, "CEPA: CNN-based early performance assertion scheme for analog and mixed-signal circuit simulation," in Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), pp. 1–9, IEEE, 2020.
[11] A. Mirhoseini, A. Goldie, M. Yazgan, J. W. Jiang, E. Songhori, S. Wang, Y.-J. Lee, E. Johnson, O. Pathak, A. Nazi, et al., "A graph placement methodology for fast chip design," Nature, vol. 594, no. 7862, pp. 207–212, 2021.
[12] V. A. Chhabria, Y. Zhang, H. Ren, B. Keller, B. Khailany, and S. S. Sapatnekar, "MAVIREC: ML-aided vectored IR-drop estimation and classification," in 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1825–1828, IEEE, 2021.
[13] C.-H. Pao, A.-Y. Su, and Y.-M. Lee, "XGBIR: An XGBoost-based IR drop predictor for power delivery network," in 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1307–1310, IEEE, 2020.
[14] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, 2016.
[15] W. W. Xing, X. Jin, Y. Liu, D. Niu, W. Zhao, and Z. Jin, "BOA-PTA: A Bayesian optimization accelerated error-free SPICE solver," arXiv preprint arXiv:2108.00257, 2021.
[16] Y. Zhou, H. Ren, Y. Zhang, B. Keller, B. Khailany, and Z. Zhang, "PRIMAL: Power inference using machine learning," in Proceedings of the 56th Annual Design Automation Conference, pp. 1–6, 2019.
[17] C.-T. Ho and A. B. Kahng, "IncPIRD: Fast learning-based prediction of incremental IR drop," in 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1–8, IEEE, 2019.
[18] V. A. Chhabria, V. Ahuja, A. Prabhu, N. Patil, P. Jain, and S. S. Sapatnekar, "Thermal and IR drop analysis using convolutional encoder-decoder networks," in Proceedings of the 26th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 690–696, 2021.
[19] W. Yu, Z. Xu, B. Li, and C. Zhuo, "Floating random walk-based capacitance simulation considering general floating metals," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 8, pp. 1711–1715, 2018.
[20] M. Zhao, R. V. Panda, S. S. Sapatnekar, and D. Blaauw, "Hierarchical analysis of power distribution networks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 2, pp. 159–168, 2002.
[21] Z. Zhu, B. Yao, and C.-K. Cheng, "Power network analysis using an adaptive algebraic multigrid approach," in Proceedings of the 40th Annual Design Automation Conference, pp. 105–108, 2003.
[22] H. Qian, S. R. Nassif, and S. S. Sapatnekar, "Power grid analysis using random walks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 8, pp. 1204–1224, 2005.
[23] C. Zhang and P. Zhou, "Improved hierarchical IR drop analysis in homogeneous circuits," in 2020 IEEE 15th International Conference on Solid-State & Integrated Circuit Technology (ICSICT), pp. 1–3, IEEE, 2020.
[24] C. Zhuo, K. Unda, Y. Shi, and W.-K. Shih, "From layout to system: Early stage power delivery and architecture co-exploration," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 7, pp. 1291–1304, 2019.
[25] S. N. Mozaffari, B. Bhaskaran, K. Narayanun, A. Abdollahian, V. Pagalone, S. Sarangi, and J. E. Colburn, "An efficient supervised learning method to predict power supply noise during at-speed test," in 2019 IEEE International Test Conference (ITC), pp. 1–10, IEEE, 2019.
[26] S.-Y. Lin, Y.-C. Fang, Y.-C. Li, Y.-C. Liu, T.-S. Yang, S.-C. Lin, C.-M. Li, and E. J.-W. Fang, "IR drop prediction of ECO-revised circuits using machine learning," in 2018 IEEE 36th VLSI Test Symposium (VTS), pp. 1–6, IEEE, 2018.
[27] Y. Kwon, G. Jung, D. Hyun, and Y. Shin, "Dynamic IR drop prediction using image-to-image translation neural network," in 2021 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5, IEEE, 2021.
[28] Y.-C. Fang, H.-Y. Lin, M.-Y. Su, C.-M. Li, and E. J.-W. Fang, "Machine-learning-based dynamic IR drop prediction for ECO," in Proceedings of the International Conference on Computer-Aided Design (ICCAD), pp. 1–7, 2018.
[29] Y. Li, C. Zhuo, and P. Zhou, "A cross-layer framework for temporal power and supply noise prediction," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 10, pp. 1914–1927, 2019.
[30] H. Zhou, W. Jin, and S. X.-D. Tan, "GridNet: Fast data-driven EM-induced IR drop prediction and localized fixing for on-chip power grid networks," in 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD), pp. 1–9, IEEE, 2020.
[31] V. A. Chhabria, A. B. Kahng, M. Kim, U. Mallappa, S. S. Sapatnekar, and B. Xu, "Template-based PDN synthesis in floorplan and placement using classifier and CNN techniques," in 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 44–49, IEEE, 2020.
[32] S. Dey, S. Nandi, and G. Trivedi, "PowerPlanningDL: Reliability-aware framework for on-chip power grid design using deep learning," in 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1520–1525, IEEE, 2020.
[33] X.-X. Huang, H.-C. Chen, S.-W. Wang, I. H.-R. Jiang, Y.-C. Chou, and C.-H. Tsai, "Dynamic IR-drop ECO optimization by cell movement with current waveform staggering and machine learning guidance," in Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), pp. 1–9, 2020.
[34] H.-Y. Lin, Y.-C. Fang, S.-T. Liu, J.-X. Chen, C.-M. Li, and E. J.-W. Fang, "Automatic IR-drop ECO using machine learning," in 2020 IEEE International Test Conference in Asia (ITC-Asia), pp. 7–12, IEEE, 2020.
[35] C. Zhao, Z. Zhou, and D. Wu, "Empyrean ALPS-GT: GPU-accelerated analog circuit simulation," in IEEE/ACM International Conference On Computer Aided Design (ICCAD), pp. 167:1–167:3, IEEE, 2020.
[36] J. Li, M. Yue, Y. Zhao, and G. Lin, "Machine-learning-based online transient analysis via iterative computation of generator dynamics," in 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2020.
[37] T. Nakura, SPICE Simulation, pp. 19–47. Singapore: Springer, 2016.
[38] S. Peng and S. X. Tan, "GLU3.0: Fast GPU-based parallel sparse LU factorization for circuit simulation," IEEE Design & Test, vol. 37, no. 3, pp. 78–90, 2020.
[39] P. Benner, M. Hinze, and E. J. W. Ter Maten, Model Reduction for Circuit Simulation, vol. 74. Springer, 2011.
[40] F. M. Johannes, "Partitioning of VLSI circuits and systems," in Proceedings of the 33rd Annual Design Automation Conference, pp. 83–87, 1996.
[41] Ü. V. Çatalyürek and C. Aykanat, "PaToH (Partitioning Tool for Hypergraphs)," in Encyclopedia of Parallel Computing, pp. 1479–1487, Springer, 2011.
[42] G. Karypis, "hMETIS 1.5: A hypergraph partitioning package," http://www.cs.umn.edu/~metis, 1998.
[43] F. Bizzarri, A. Brambilla, and G. Storti-Gajani, "FastSPICE circuit partitioning to compute DC operating points preserving SPICE-like simulators accuracy," Simulation Modelling Practice and Theory, vol. 81, pp. 51–63, 2018.
[44] N. Zhu, "Partitioning in post-layout circuit simulation," January 2019.
[45] F. N. Najm, Circuit Simulation. John Wiley & Sons, 2010.
[46] T. Najibi, "Continuation methods as applied to circuit simulation," IEEE Circuits and Devices Magazine, vol. 5, no. 5, pp. 48–49, 1989.
[47] Z. Jin, M. Liu, and X. Wu, "An adaptive dynamic-element PTA method for solving nonlinear DC operating point of transistor circuits," in 2018 IEEE 61st International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 37–40, 2018.
[48] A. Ushida, Y. Yamagami, Y. Nishio, I. Kinouchi, and Y. Inoue, "An efficient algorithm for finding multiple DC solutions based on the SPICE-oriented Newton homotopy method," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 3, pp. 337–348, 2002.
[49] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," Advances in Neural Information Processing Systems, vol. 27, pp. 3320–3328, 2014.
[50] Y. Zhang, H. Ren, and B. Khailany, "GRANNITE: Graph neural network inference for transferable power estimation," in Proceedings of the 57th Annual Design Automation Conference, IEEE, 2020.
[51] W. Lee and R. Achar, "GPU-accelerated adaptive PCBSO mode-based hybrid RLA for sparse LU factorization in circuit simulation," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 40, no. 11, pp. 2320–2330, 2021.
[52] E. Dufrechou, P. Ezzatti, and E. S. Quintana-Ortí, "Selecting optimal SpMV realizations for GPUs via machine learning," International Journal of High Performance Computing Applications, vol. 35, no. 3, 2021.
[53] H. Cui, S. Hirasawa, H. Kobayashi, and H. Takizawa, "A machine learning-based approach for selecting SpMV kernels and matrix storage formats," IEICE Transactions on Information and Systems, vol. 101-D, no. 9, pp. 2307–2314, 2018.
[54] I. Nisa, C. Siegel, A. Sukumaran-Rajam, A. Vishnu, and P. Sadayappan, "Effective machine learning based format selection and performance modeling for SpMV on GPUs," in IEEE International Parallel and Distributed Processing Symposium Workshops, pp. 1056–1065, 2018.
[55] R. Furuhata, M. Zhao, M. Agung, R. Egawa, and H. Takizawa, "Improving the accuracy in SpMV implementation selection with machine learning," in Eighth International Symposium on Computing and Networking Workshops (CANDAR), pp. 172–177, IEEE, 2020.
[56] X. Lin, J. Pan, J. Xu, Y. Chen, and C. Zhuo, "Lithography hotspot detection via heterogeneous federated learning with local adaptation," arXiv preprint arXiv:2107.04367, 2021.
[57] Y. Ma, H. Ren, B. Khailany, H. Sikka, L. Luo, K. Natarajan, and B. Yu, "High performance graph convolutional networks with applications in testability analysis," in Proceedings of the 56th Annual Design Automation Conference, pp. 1–6, 2019.
[58] C. Yu and W. Zhou, "Decision making in synthesis cross technologies using LSTMs and transfer learning," in Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD (MLCAD '20), pp. 55–60, Association for Computing Machinery, 2020.