
ASSIGNMENT NUMBER – 2 (SOC)

ANKIT RAJ

ROLL NO 94

Q.1) Explain the fundamental trends of SOC design.


“Digital convergence” is creating demand for functionally complex ICs in six-to-nine-month design cycles at mass-market
costs. On the one hand, increased capital investments in the late 1990s have seen worldwide IC manufacturing
capabilities growing by over 50 percent per year. On the other hand, as the electronics industry shifts from ASICs to
SoCs, design productivity has struggled to grow at 20 percent per year, and the gap is widening. The industry slowdown,
one of the worst recessions in the history of the semiconductor industry, is only serving to accelerate the inflexion point
as ASICs for struggling PC, server and industrial broadband markets lose out to SoCs or “Systems-on-a-few Chips.”
These chips power next-generation consumer-oriented hand-held computing, multimedia and Internet-enabled
communications products based on digital convergence.

As a result, it has never been more important to efficiently manufacture high-volume products while meeting tight
time-to-market deadlines and retaining the flexibility to morph products quickly to meet the changing needs of
increasingly fickle consumers. Further compounding the situation, most existing design methodologies are struggling
to cope with today's challenges and are in need of an overhaul.

Today's SoC-based designs present a design team with five facets of complexity that range from functional complexity,
architectural and verification challenges to design team communication and deep submicron (DSM) implementation.

Facet one is functional content complexity and the sheer number of functional blocks contained in a typical system.
This breadth of content leads to the wide acceptance that designing these systems from scratch is far beyond the
design productivity or capabilities of even the largest design teams. Some form of intellectual property (IP) reuse has
become an inevitable part of SoC design.

Much of the functionality in a SoC-based system is implemented as software, and the dual needs for increased
productivity and reuse exist for software as well as for hardware. Along with increased IP reuse, this leads to a need to
design at higher levels of abstraction and to move from written specifications to “executable” specifications, often in
the C or C++ languages.
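To make the idea of an “executable” specification concrete, here is a minimal sketch (in Python rather than C++, purely for brevity; the filter block and its interface are invented for illustration). Instead of a written document stating the intended behavior, the behavior is captured as code that can be simulated and checked:

```python
# Minimal sketch of an "executable specification" block (hypothetical example).
# The functional intent ("the filter averages the last N samples") is runnable,
# so it can be simulated, debugged and verified before any implementation work.

class MovingAverageSpec:
    """Executable specification of an N-tap moving-average filter."""

    def __init__(self, taps):
        self.taps = taps
        self.history = []

    def process(self, sample):
        # Pure functional behavior: no timing, no bus protocol, no RTL detail.
        self.history.append(sample)
        if len(self.history) > self.taps:
            self.history.pop(0)
        return sum(self.history) / len(self.history)

# A simple testbench drives the model and observes the intended behavior.
spec = MovingAverageSpec(taps=4)
outputs = [spec.process(x) for x in [4, 8, 12, 16, 20]]
print(outputs)  # steady-state output settles to the 4-sample average
```

Later in the flow, the same block could be refined toward hardware or designated as software, while this model remains the golden functional reference.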

The second facet is the architectural challenge. When functional complexity of this scale is implemented in a short
timeframe, a detailed architecture must be set at the start of the project and rigorously adhered to throughout
implementation.

Gone are the days when various blocks of functionality could be distributed among design team members who had
relatively complete autonomy over the architecture of their block. Of course, there have always been some unifying
architectural guidelines: processor speed, bus width for the necessary throughput and the likely amount of memory, for
example. This information is typically based on the guru knowledge and experience of a small number of senior system
architects.
These methods are under pressure because of increased design complexity and from the necessity to get the
architecture right the first time. It is increasingly unacceptable, in terms of both time and money, to change the
architecture mid-way, and equally unacceptable to “over-engineer” from a unit cost perspective.

These first two facets combine to form the System Level Design (SLD) portion of the SoC design process.

The verification challenge is the third facet and stems from the first: IP reuse makes it possible to assemble broad
functional content, but it is increasingly difficult to integrate and verify it all. Verification effort is believed to scale with
somewhere between the square and the cube of the complexity of the block being verified.
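To see what that scaling rule implies, a back-of-the-envelope illustration (the exponents come from the rule quoted above; the numbers are illustrative, not measured data):

```python
# Illustrative only: if verification effort grows between n^2 and n^3 in block
# complexity n, then doubling complexity multiplies the effort by 4x to 8x.

def effort_growth(scale_factor, exponent):
    """Relative growth in verification effort when complexity scales."""
    return scale_factor ** exponent

# Doubling block complexity under the quadratic and cubic assumptions:
doubling = [(e, effort_growth(2, e)) for e in (2, 3)]
print(doubling)  # [(2, 4), (3, 8)]
```

This super-linear growth is why integrating many pre-verified IPs does not, by itself, make system-level verification tractable.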

This problem is exacerbated by two factors. IP reuse does not solve the problem since IP must still be verified within
the system context. And, hardware-software integration is left unacceptably late in the design cycle and becomes
prohibitively difficult since there is often no good prototype hardware available until the SoC is fabricated. This is
leading design teams to seek out many different kinds of verification improvements.

Where integration begins

One is the trend to perform system integration on a “virtual” prototype, which uses a simulation model to bring
hardware and software together at the earliest stage possible. The challenge is to define just where software
development stops and system integration begins. In practice, this is a large gray area that encompasses much of the
firmware development, particularly the development of the Hardware Abstraction Layer (HAL), the hardware-
dependent software layer that “abstracts” the application from its hardware “target.”
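A toy sketch of that HAL boundary (in Python for brevity; the UART register address and all names here are invented, not taken from any real SoC). The application calls an abstract operation, and only the HAL touches hardware addresses, so the same application code can run against a virtual prototype or real silicon:

```python
# Hypothetical sketch of a Hardware Abstraction Layer (HAL): the application
# calls abstract operations; only the HAL knows the register-level details.

UART_TX_REG = 0x4000_0000   # invented register address, for illustration only

class FakeBus:
    """Stand-in for the hardware target. In a virtual prototype this role is
    played by a simulation model of the SoC's bus and peripherals."""
    def __init__(self):
        self.writes = []
    def write32(self, addr, value):
        self.writes.append((addr, value))

class UartHal:
    """Hardware-dependent layer that 'abstracts' the UART from the app."""
    def __init__(self, bus):
        self.bus = bus
    def send_byte(self, b):
        self.bus.write32(UART_TX_REG, b)

bus = FakeBus()
hal = UartHal(bus)
for b in b"ok":
    hal.send_byte(b)       # application code never sees UART_TX_REG
print(bus.writes)          # two writes to the (invented) TX register
```

Retargeting to real hardware would mean swapping `FakeBus` for a driver that performs actual memory-mapped writes, leaving the application untouched.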

This leads to the fourth facet, and perhaps the biggest obstacle: the complexity of design team interactions and
communications necessary to successfully undertake a SoC-based design. To achieve improvement in the first three
facets, there needs to be interaction between system architects, algorithm designers, software developers, hardware
designers, system integrators and verification specialists. The reality is that this is a rarity: these organizational
functions are rarely integrated. More often, there is significant organizational, and sometimes physical, separation
between these functions, and real cooperation is the exception, not the norm.

The fifth facet is the set of issues that come with implementing a complex chip design in a DSM process technology.
These include problems of timing closure, placement and routing, including avoidance of increasingly problematic
physical effects such as crosstalk. While important, these issues are not the focus of the proposed design methodology.
Instead, they are covered by electronic design automation (EDA) companies whose products are focused on the RTL to
GDSII flow, which is fed by the SLD section of the SoC Design Process.

Engineers at CoWare are advocating a design methodology that is flexible and expressive, and represents the diverse
nature of system-level design for electronic products. It would be used by system architects, algorithm designers,
software developers, hardware designers, system integrators and verification specialists alike and should be viewed as
a single design methodology with multiple entry and exit points. It would offer them the ability to include IP or other
design elements from various sources. The main flow variants should be top-down design or platform-based design,
which involves a large degree of bottom-up design, at least in the platform creation stage.

In practice, real design flows are invariably a combination: neither purely top-down nor bottom-up. The combination
depends largely on the amount of legacy design available.

The design methodology is most easily understood by considering a top-down approach. Steps in the methodology
involve specification, architecture and implementation, with different levels of verification proceeding in parallel.

The functional system is captured as a series of blocks that, once connected together, can be simulated, allowing the
functional behavior of the system to be verified, debugged and analyzed. Once this is complete, partitioning can take
place by designating which blocks in the executable specification will run as software on the target processor, which
will be implemented as hardware, and which form the external system model or testbench. Such a design flow will
automatically generate all the necessary glue logic and software drivers to implement the architecture implied by the
chosen hardware-software partition.
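The partitioning step described above can be pictured with a toy model (block names and the three-way tagging are invented for illustration; a real flow would attach far more information to each block and derive the glue logic from it):

```python
# Illustrative sketch of hardware/software partitioning of an executable
# specification: each functional block is tagged with its implementation
# target, and the testbench blocks remain outside the chip.

blocks = {
    "mpeg_decode":  "hardware",   # performance-critical -> dedicated logic
    "ui_control":   "software",   # flexible logic -> target processor
    "rate_control": "software",
    "video_source": "testbench",  # external system model
}

def partition(blocks):
    """Group blocks by their designated implementation target."""
    part = {"hardware": [], "software": [], "testbench": []}
    for name, target in sorted(blocks.items()):
        part[target].append(name)
    return part

print(partition(blocks))
```

Exploring a different partition is then just a matter of re-tagging a block and re-simulating, which is exactly the kind of architectural experiment the methodology is meant to make cheap.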

Both the software and hardware blocks can be further refined for implementation, and can be co-simulated at every
stage of refinement. A full range of debug and analysis tools appropriate for system architects, hardware designers
and software developers should be available throughout. Analysis tools should also be provided for exploring
architectural issues such as the hardware-software partition, processor choice, bus architecture and so forth.

Following refinement, software can be exported as a complete program, compiled directly onto the target processor
and verified in the system. Hardware could be exported as synthesizable RTL in the hardware description languages
VHDL or Verilog.

A design methodology that considers both hardware and software should be flexible, with many different paths and
entry/exit points possible, allowing a unified and consistent design approach between systems, hardware and software
developers. It should include two main flow variants, either top-down design or platform-based design.

The move toward implementing a flexible and expressive design methodology is a direction electronics companies
should consider to beat today's SoC design challenges.

Q.2) What are the drawbacks faced with traditional processors for SOC?

Most electronic devices today are architected through a System-on-Chip (SoC) design paradigm: the idea is to develop
a system through the integration of pre-designed hardware and software blocks, often collectively referred to as design
Intellectual Properties (IPs for short). In the current industrial setting, IPs are typically developed independently, either
in-house or by third-party vendors. An SoC integration team collects and assimilates these IPs based on the system
requirements for the target device. To enable smooth integration of the IPs into the target system, they are designed to
communicate with each other through well-defined interfaces; e.g., ARM provides the AMBA bus interface, which
includes an on-chip interconnect specification for the connection and management of various functional blocks. In the
context of SoC designs, verification involves two somewhat independent verification flows: one for ensuring correct
operation of the IPs (and their adherence to the interface protocols) and another for the assembled system. Given
the complexity of modern computing devices, both require careful upfront planning and span almost the entirety of
the design life-cycle. In this section, we give an overview of the various components of verification in current industrial
practice, as shown in Figure 1. Admittedly, the notion of “industrial practice” is somewhat of a misnomer, since it varies
from company to company based on business targets, product needs, and even legacy practices. Nevertheless, the
following description captures the basic essence of the SoC design verification flow and is relatively general.

Verification Planning

This activity starts at about the same time as product planning and continues through the system development phase.
Product planning requires definition of the various IPs necessary, their decomposition into hardware and software
components, the connection and communication interfaces, and various power, performance, security, and energy
targets. Correspondingly, verification planning includes the creation of appropriate test plans and test cards, the
definition of specialized design blocks called Verification IPs (VIPs for short), instrumentation in the design for
post-silicon debug, and the definition of various monitors, checkers, exercisers, etc.

Architecture Verification and Prototype Definitions

One of the first stages in the definition of an SoC design is the system architecture, which defines various
functional parameters of the design, communication protocols among IPs, power and performance management
schemes, etc. The parameters and design features explored at this stage include cache size, pipeline depth, protocol
definitions for power management and security, etc. The exploration is performed through a variety of “architectural
models”, which simulate typical workloads and target use cases of the device, and identify parameter values that
satisfy the device targets (e.g., power, performance, security) identified in the planning stage.

There are two important verification activities during this architectural exploration stage. The first is the functional
verification of the various communication protocols. This activity allows detection of high-level protocol errors at an
early stage, when the design models are abstract and consequently simple, and the design is relatively less mature;
such errors, if they escape into the product implementation, can become extremely expensive, since a fix at that stage
might require major redesign of multiple IPs. Given the high abstraction of the design models at this stage, it is feasible
to perform formal analysis to achieve this [35]; in current practice, formal methods are augmented with high-level
simulation to provide the desired coverage.

The second crucial role for verification is to initiate development of hardware prototyping models for subsequent needs
in software and firmware verification. To understand this need, note that low-level software and firmware programs
need to be validated for correctness when operating on the target (and evolving) hardware design developed during
the implementation phase (see below). Clearly, one cannot wait for the hardware implementation to be stabilized
before initiating software/firmware verification. Consequently, high-level software models of the hardware, also
referred to as virtual prototype models, are developed to enable accelerated software/firmware verification. These
models are typically at the same abstraction level as the architecture models (and sometimes derived from the latter),
but they are different and serve a different purpose. Unlike architectural models, prototype models are written to
provide a hardware abstraction that nevertheless exercises various software corner cases. One key requirement from
the above is that the prototype model must include the definition (and abstract functionality) of all the software-visible
interface registers of the various IPs. Development of prototype models is initiated concurrently with architectural
models, and it continues into the RTL development time-frame. The models are usually coordinated with various
“drops” or releases, each containing functionality at various degrees of maturity; these drops are coordinated and
synchronized carefully within the time-frame of software validation targets.

Pre-silicon Verification
This is the major resource-intensive verification activity, taking place during (and after) hardware development and
implementation. Note that this is a continuous process, with increasing levels of maturity and complexity as the design
matures. Most industrial SoC designs include a combination of legacy and new IPs, some created in-house and some
collected from third-party IP providers. An IP verification team (whether in-house or third-party) performs the
verification of the IP being delivered. This is done in a standalone environment, i.e., the objective is to ensure that the
IP on its own functions as expected. Subsequently, the SoC team integrates the IPs into an (evolving) SoC model and
performs system-level verification; the target of system-level verification is to ensure that the IPs function correctly
together as an integrated system.

IPs are delivered to the SoC integration team either as hard IP, i.e., formatted as a physical design layout, or as soft IP,
in the form of an RTL or design netlist. The amount of verification performed by the IP team depends on the form in
which the IP is delivered (e.g., a hard IP carries significantly higher verification requirements than a soft IP).
Traditionally, IP verification has entailed exercising (and ensuring the correctness of) the IP design in a standalone
environment. This permits a company to have a robust portfolio of generic IP designs that can be quickly integrated
into various SoC design products. With this view, an IP verification team develops such a standalone verification
infrastructure for the target IP. For simulation, this infrastructure includes testbench and environment definitions that
capture the target use cases of the IP design; for formal verification, it may include environmental assumptions, target
assertions, etc. More recently, there has been a strong push to avoid “over-validation”, i.e., to validate an SoC design
for only its target use cases (see below). This has an impact on IP validation, e.g., one has to define the use cases for
the IP corresponding to the SoC use cases.

Once such (verified) IPs are delivered to the SoC integration verification team, they can then target system-level
scenarios. Note that each use case requires communication among multiple IPs; this is why it is so important in
planning to carefully define IP drops to enable cohesive system-level SoC verification. Most SoC integration verification
includes system-level simulation and the definition of various use cases. However, note that many use cases require
co-execution of hardware and software modules. These are obviously difficult to exercise in simulation, since running
software on RTL modules is slow and often infeasible; such use cases are generally deferred until the design is mature
enough for emulation and FPGA prototyping (see below).

Emulation and FPGA Prototyping

Technically, verification using emulation and FPGA prototyping is simply
a part of pre-silicon verification, since they are performed before the system goes into fabrication. However, in practice,
they form an important bridge between pre-silicon and post-silicon verification. Here one maps the RTL model of the
hardware onto a reconfigurable architecture such as an FPGA, or onto specialized accelerators and emulators [5], [8],
[10]; these models run hundreds to thousands of times faster than an RTL simulator; consequently, one can execute
hardware/software use case scenarios, such as an operating system boot, in a few hours.

This speed is obtained at the cost of controllability and observability. In a simulator, one can observe any internal signal
of the design at any time. In contrast, in FPGA prototyping (which is the fastest of the pre-silicon platforms),
observability is restricted to a few thousand internal signals. Furthermore, one must decide on the signals to be
observed before generating the FPGA bit-stream. Reconfiguring the observability requires re-compilation of the
bit-stream, which might take several hours. Consequently, these platforms are used only when the design is quite
mature, e.g., when the functionality is relatively stable and debug observability is fixed enough to warrant few
recompilations. Recent innovations in FPGA technology [2], [3] address some of the observability limitations in FPGA
solutions. Nevertheless, observability and recompilation cost remain a challenge.

Post-silicon Verification

Post-silicon validation is the activity where one uses
an actual silicon artifact instead of an RTL model. To enable post-silicon validation, early silicon is typically brought into
a debug lab, where various tests are run to validate functionality, timing, power, performance, electrical
characteristics, physical stress effects, etc. It is the last validation gate that must be passed before mass production
can be initiated. Post-silicon validation is a highly complex activity, with its own significant planning, exploration, and
execution methodologies; a fuller discussion of post-silicon validation, and the specific challenges therein, is out of
scope for this paper, and the reader can refer to a previous paper [29]. From a functional perspective, the fact that a
test can run at the target clock speed enables execution of long use cases (e.g., booting an operating system within
seconds, or exercising various power management and security features). On the other hand, it is considerably more
complex to control or observe the execution of silicon than that of an RTL simulation model (or even FPGA or emulation
models). Furthermore, changing observability in silicon is obviously infeasible.

VERIFICATION CHALLENGES: TRADITIONAL AND EMERGING

In spite of their maturity, verification tools today do not scale up to the needs of modern SoC verification problems. In
this section, we discuss some of the key challenges. While some of the challenges are driven by complexity (e.g., tool
scalability, particularly for formal methods), others are driven by the needs of rapidly changing technology trends.

Shrinking Verification Time

The exponential growth in devices engendered
by the IoT regime has resulted in a shrinkage of the system development life-cycle, leaving little time for customized
verification efforts. However, each device has a different use case requirement, with associated functionality,
performance, energy, and security constraints. We are consequently faced with the conundrum of needing to create
standardized, reusable verification flows and methodologies that can be easily adapted to a diversity of electronic
devices, each with its unique, tailor-made constraints.

Two orthogonal approaches have so far been taken to address this problem. The first is to improve tool scalability with
the goal of eventually turning verification into a turn-key solution; achieving this goal, however, remains elusive (see
below). The other approach entails making the systems themselves highly configurable, so that the same design may
be “patched” to perform various use cases, either through software or firmware updates or through hardware
reconfiguration. Unfortunately, developing such configurable designs also has a downside. Aside from the fact that it
is impossible to determine all the different use cases of a hardware system in advance (and hence to identify whether
enough configurability has been built in), this approach also significantly blows up the number of states of the system
and consequently makes their verification more challenging.
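A toy calculation of that state blow-up (all counts here are invented for illustration): because independent configuration options multiply, even a modest number of knobs yields a verification space far beyond exhaustive coverage.

```python
# Illustrative only: independent configuration options multiply the number
# of distinct system modes a verification plan would have to consider.
from math import prod

config_options = {
    "power_domain_on_off": 2 ** 10,  # 10 independently gated power domains
    "bus_configurations":  3,
    "security_modes":      4,
}

total_modes = prod(config_options.values())
print(total_modes)  # 1024 * 3 * 4 = 12288 modes from just three knobs
```

Adding one more independent option multiplies, rather than adds to, the total, which is why use-case-driven verification (below) is favored over exhaustive mode coverage.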
Limited Tool Scalability

Scalability remains a crucial problem in the effective application of verification technology. The problem is felt
particularly acutely in formal verification; in spite of significant recent advances in automated formal technologies
such as Satisfiability (SAT) checking and Satisfiability Modulo Theories (SMT) [12], the chasm between the scale and
complexity of modern SoC designs and those which can be handled by formal technology has continued to grow. The
increasing requirements for configurability, and the consequent increase in design complexity, have only served to
exacerbate the situation. To address this problem, there has been a growing trend in formal methods to target specific
applications (e.g., security, deadlock) rather than a complete proof of functional correctness. We will discuss some of
these applications in Section IV-C. The cost of simulation-based verification is also getting increasingly prohibitive as
design sizes continue to increase. For instance, random simulation at the SoC level can cover only a tiny portion of the
design space. On the other hand, directed tests designed for specific coverage goals can be prohibitive in terms of the
human effort required.

Specification Capture

A key challenge in the applicability of verification
today is the lack of specifications. Traditionally, specifications have largely relied on requirements documents, which
under-specified or omitted design behavior for some scenarios, or left some cases vague and ambiguous. Such
omissions and ambiguity, while sometimes intentional, were often due to the ambiguity inherent in natural languages.
Unfortunately, the problem becomes significantly more contentious in the context of modern SoC designs than for
traditional microprocessors. Recall that, at least in the realm of microprocessors, there is a natural abstraction of the
hardware defined by the instruction-set architecture (ISA). Although the semantics of an ISA are complex (and typically
described in ambiguous English manuals spanning thousands of pages), the very fact of their standardization and
stability across product generations enables concretization and a general understanding of their intended behavior.
For example, most microprocessor development companies have a detailed simulator for the microprocessor ISA,
which can serve as an executable golden reference. On the other hand, it is much harder to characterize the intended
behavior of an SoC design. Indeed, SoC design requirements span multiple documents (often contradictory) that
consider the intended behavior from a variety of directions; e.g., there are system-level requirements documents,
integration documents, high-level architecture documents, microarchitecture documents, as well as cross-cutting
documents for system-level power management, security, post-silicon validation, etc. [30]. Merely reconciling the
descriptions from the different documents is a highly complex activity, let alone defining the properties and assertions
necessary for verification.

Use Case Identification

Given the aggressive time-to-market requirements, there has been a general move
in verification today away from comprehensive coverage of the whole system (or a system component) and towards
more narrowly defined coverage of intended usage scenarios. For example, for a device intended primarily for
low-power and low-performance applications (e.g., a small wearable device), the intended usage would include
scenarios where different components transition frequently into various sleep modes, but would not include sustained
execution at high clock speeds; conversely, a high-performance device such as a gaming system would prioritize
execution at high clock speeds. In general, the exploration and planning phases of the device life-cycle define a set of
use cases which constitute the target usages of the device and must be exercised during verification.

Unfortunately, this approach, while attempting to reduce verification effort by eliminating “over-validation”, might
induce significant complexity in the process. In particular, the usage scenarios are typically defined at the level of the
device and involve complex interactions of hardware, firmware, and software; it is nontrivial to determine from such
high-level verification targets how to define verification goals for individual IPs, or even hardware blocks, for the entire
SoC. Furthermore, the SoC design itself and the individual IPs have orthogonal verification needs, together with their
own methodologies, flows, and timelines. For example, a USB controller IP is developed (and verified) to be usable
across the slew of USB devices; a smartphone making use of this IP, on the other hand, must be verified for the usage
scenarios relevant to the smartphone. Finally, exercising the device-level use cases requires hardware, firmware, and
software at a reasonable maturity, which is available only late in the system life-cycle (e.g., either at post-silicon or at
least during emulation or FPGA prototyping). Bugs found this late may be expensive to fix and may involve considerable
design churn.

Power Management Challenges

Low power requirements for integrated circuits and power efficiency
have been a main focus for today's complex SoC designs. Power gating and clock gating have been the most effective
and widely used approaches for power reduction. Power gating relies on shutting off blocks or transistors that are not
in use; clock gating shuts off the clock to blocks or registers that are not required to be active. Industrial standards
have been developed to describe the power intent of low-power designs and to support the simulation of power
aspects at the RTL level. However, these features significantly complicate verification activities.

One reason is the obvious multiplication of complexity. It is not uncommon for a low-power design to feature tens of
power domains and thus hundreds of power modes. It is prohibitive to verify (whether through simulation or through
formal methods) that the design is functional under all possible power modes. In practice, verification focuses on SoC
use case scenarios, which are driven by hypervisor/OS control and application-level power management. This requires
hardware/software co-verification of the power management features.

A second, perhaps more subtle, challenge involves the interaction of power management with post-silicon verification.
The behavior within a power-gated IP cannot be observed during silicon execution; this implies that it is very difficult
to validate design behaviors as various IPs go in and out of different sleep states. Unfortunately, these are exactly the
states that account for subtle corner-case bugs, making validation challenging. To make matters worse, power-gated
IPs may make it difficult to observe the behavior of other IPs that are not in sleep states. Consider an IP A with an
observable signal s. In order for s to be observable, its value must be routed to an observation point such as an output
pin or system memory. If this route includes another IP B, then we may not be able to observe the value of s whenever
B is power-gated, even if A is active at that time.

Security and Functional Safety

Security and privacy have become critical requirements for electronic devices in the modern era. Unfortunately,
these are often poorly specified, and even poorly understood. One reason is that, in the new IoT era, devices that were
never originally intended to be connected are getting connected, e.g., refrigerators, light bulbs, or even automobiles.
Consequently, security threats and mitigations remain unclear, and one typically resorts to experts performing
“hackathons”, i.e., directed, targeted hacking of the device, to identify security threats.

In addition to security, functional safety, i.e., the assurance that the device does not harm anything in its environment
due to system failure, is a critical requirement for electronic devices used in applications such as aerospace and
automotive. Safety mechanisms must be implemented for such devices to ensure that the device remains functional
under the circumstances of unexpected errors. For example, lockstep systems are fault-tolerant systems, commonly
used in automotive devices, that run safety-critical operations in parallel. They allow error detection and error
correction: the outputs from lockstep operations can be compared to determine whether there has been a fault if there
are at least two systems (dual modular redundancy), and the error can be automatically corrected if there are at least
three systems (triple modular redundancy), via majority vote. Safety-critical devices must be compliant with
IEC 61508 [13]; ISO 26262 [23] is specifically designed for automotive electronics.

Hardware/Software Co-verification

In the days of microprocessors
and application software, it was easy to separate concerns between hardware and software verification activities.
However, today, with an increasing trend of defining critical functionality in software, it is difficult to make the
distinction. Indeed, it may not be possible in many cases to define a coherent specification (or intended behavior) of
the hardware without the associated firmware or software running. This has several consequences for verification. In
particular, hardware and software are traditionally developed independently; the tight coupling of the two makes it
incumbent that we define software milestones to closely correspond to (and be consistent with) various RTL drops.
Furthermore, validating software requires an underlying hardware model that is stable, mature, and fast. An RTL model
and simulation environment does not have any of these characteristics. On the other hand, waiting for the maturity of
emulation or silicon may be too late for identifying critical errors. Analog and Mixed Signal Components: Most IoT
devices include various sensors and actuators in addition to their digital processing core. Hence the environment in
which these devices operate is inherently analog. As a result, an increasingly large portion of the die area is occupied
by analog/mixed-signal (AMS) circuits [27]. Due to the complex nature of analog behavior, design and verification methodologies for analog circuits are far less mature than those for digital circuits. Verification of analog and mixed-signal ICs remains complex, expensive, and often a "one-off" task. Complicating the problem is the requirement of combining both digital and analog methodologies to ensure thorough verification of all aspects of mixed-signal SoCs.
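The dual and triple modular redundancy behavior of the lockstep systems described earlier in this answer can be sketched in a few lines. This is an illustrative software model only; the function names are hypothetical and not taken from any safety library or hardware implementation.

```python
def dmr_check(out_a, out_b):
    """Dual modular redundancy: two lockstep copies can detect a
    fault (a mismatch) but cannot tell which copy is wrong."""
    return out_a == out_b  # True means no fault detected

def tmr_vote(out_a, out_b, out_c):
    """Triple modular redundancy: three copies allow automatic
    correction by majority vote, masking a single faulty copy."""
    if out_a == out_b or out_a == out_c:
        return out_a
    if out_b == out_c:
        return out_b
    raise RuntimeError("unrecoverable fault: all three outputs disagree")

# A single corrupted copy is detected by DMR and out-voted by TMR:
assert dmr_check(42, 42) is True
assert dmr_check(42, 41) is False   # fault detected, but not correctable
assert tmr_vote(42, 41, 42) == 42   # fault corrected by majority vote
```

Note that DMR can only flag the mismatch; it takes the third copy in TMR to decide which result to trust.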

Q.3) Explain traditional SOC methodology

Deep submicron effects complicate design closure for very large designs. A top-down hierarchical design methodology combined with physical prototyping increases design productivity and restores schedule predictability. This answer discusses a top-down hierarchical flow and explains how physical prototyping is used to predict the performance and physical characteristics of the final physical implementation.

TOP-DOWN SOC DESIGN METHODOLOGY

System-on-Chip (SoC) designs have become one of the main drivers of the semiconductor technology in recent years.
Multi-million gate designs with multiple third party intellectual property (IP) cores are commonplace. SoC designers
employ IP reuse to improve design productivity. Previous designs done in-house or third party designs can be used as
IP in the current design. While employing IP cuts development costs and time, integration complexity increases.
This is one of the main reasons why SoC designs are implemented with hierarchical top-down design flows. These flows
help to manage the different and conflicting requirements of increasing design size, deep submicron (DSM) effects, and the necessity for shorter and predictable implementation times. Hierarchical methodologies allow multiple teams
to work on different parts of the design concurrently and independently. This "divide and conquer" approach reduces
the complexity of the design problem for each design team and reduces the time to market. For the SoC designs, which
are built from independent function blocks, these capabilities are key advantages as the final implementation of
complex chips can be a lengthy process and parallelization can save valuable time. Hierarchical design styles also allow for much faster and easier late ECOs. Functional changes may be localized to a single block, leaving the remainder of the design unaffected. This localization results in faster, easier ECOs.
Another reason for hierarchy is to overcome the capacity limitations of design tools. Hierarchical design flows are
scalable to handle designs containing upwards of 100 million gates.
In addition to the complexities that result from large design size, deep submicron effects add to integration complexity and cause late-stage surprises and long iteration loops during the design cycle.
In deep sub-micron technologies, wires, power, routability and manufacturability have to be considered early in the
design cycle. Physical prototyping provides early feedback in terms of design closure and helps validate the correctness
of design decisions. Physical prototyping should accurately predict the characteristics of the final physical
implementation. This can be accomplished by performing cell placement and global routing at an appropriate level of
granularity needed to ensure that the prototype correlates to the final implementation within a specified tolerance.
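The correlation requirement stated above can be expressed as a simple relative-tolerance check. The function name and the 10% default tolerance below are illustrative assumptions, not part of any standard sign-off criterion.

```python
def correlates(prototype_value, final_value, tolerance=0.10):
    """Check that a prototype metric (delay, area, wirelength, ...)
    matches the final implementation within a relative tolerance."""
    return abs(prototype_value - final_value) <= tolerance * abs(final_value)

# A prototype wirelength estimate within 10% of the final routed length
# correlates; a 20% miss does not:
assert correlates(9.4, 10.0)
assert not correlates(12.0, 10.0)
```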

Traditional top-down SoC design relies on the assumption that the budgeting performed at the chip level need not be revised after the blocks are implemented. However, unless very conservative budgets are used, it is impossible to predict upfront whether the final block implementations will meet all constraints. It is also difficult to adjust the budgeting if we cannot capture the physical properties (e.g., driver strength, parasitics, current drain) that are observed at the block and chip boundaries. A top-down hierarchical design methodology should therefore be combined with physical prototyping to enhance design productivity and restore schedule predictability. The following sections discuss a top-down hierarchical block-based flow and explain how physical prototyping is used to predict the performance and physical characteristics of the final physical implementation.
HIERARCHICAL SOC DESIGN FLOW

The components of a predictable top-down hierarchical flow are design planning, physical prototyping, and
implementation. At the design planning stage, chip topography, area, number of chip level partitions and timing
budgets are determined. During physical prototyping, the design planning results are validated for each block and for
the top-level. If necessary, corrective action is taken by going back to design planning and progressively refining the
design. Once physical prototyping results are satisfactory, implementation can commence concurrently for each block
and for the top-level, with the assurance that design-planning decisions are correct and implementation will be
completed without any late surprises. Top-down planning and bottom-up prototyping is the most predictable way to
achieve closure on large SoC designs.
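The plan/prototype/refine loop described above might be sketched as follows. All names, the toy data, and the refinement policy are hypothetical; a real flow would prototype blocks with physical synthesis tools rather than a callback.

```python
def refine_until_closure(budgets, prototype_block, refine, max_iters=10):
    """Top-down budgeting, bottom-up prototyping loop: prototype every
    block against its current budget, then take corrective action in
    design planning (refine) until all blocks validate."""
    for _ in range(max_iters):
        failing = [blk for blk, budget in budgets.items()
                   if prototype_block(blk) > budget]
        if not failing:
            return budgets  # closure: implementation can start concurrently
        budgets = refine(budgets, failing)
    raise RuntimeError("no design closure within the iteration limit")

# Toy example: the prototype shows blockA needs 6 ns against a 5 ns
# budget; "refining" here simply reallocates 1 ns of slack to each
# failing block.
achieved = {"blockA": 6.0, "blockB": 2.0}
final = refine_until_closure(
    {"blockA": 5.0, "blockB": 3.0},
    prototype_block=lambda blk: achieved[blk],
    refine=lambda b, failing: {**b, **{f: b[f] + 1.0 for f in failing}},
)
assert final == {"blockA": 6.0, "blockB": 3.0}
```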

Design planning constitutes an important portion of the top-down hierarchical design flow. The SoC designer evaluates
tradeoffs with respect to timing, area, and power during design planning. At this stage, various IP cores from different
vendors are integrated into the design along with custom logic. The IP may be provided as RTL code, gate level netlists,
or fully implemented hard macros. Decisions regarding choices of different implementations of the same IP, chip and
block aspect ratio, budgeting of top-level constraints, standard cell utilization, and other design aspects are made
during design planning.
Design planning functions include partitioning of the design, block placement and shaping, hard macro placement, pin
assignment and optimization, top level route planning, top level repeater insertion, block budget generation, and
power routing. All of these functions are closely linked to the underlying physics of DSM technology. For example, top-level repeater insertion cannot be done properly without considering signal integrity, and pins cannot be assigned without considering antenna rules.

Design planning can start upon availability of the initial top-level netlist, even if the modules have no internal definition
or structure. At this stage missing modules are represented as black boxes. The areas of black boxes are user defined
and quick timing models are generated for setup/hold arcs and clock-to-output delays. Area estimates for modules
that have already been synthesized will be determined by the gate count and user defined utilization.
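The area estimate described above, gate count scaled by a user-defined utilization, amounts to a one-line formula. The per-gate area and the utilization value below are illustrative assumptions.

```python
def estimated_block_area(gate_count, area_per_gate_um2, utilization):
    """Block area estimate for an already-synthesized module: total
    standard-cell area scaled up by the user-defined utilization
    (utilization < 1 leaves room for routing, clock buffers, etc.)."""
    return gate_count * area_per_gate_um2 / utilization

# 1M gates at an assumed 1.5 um^2 per gate, placed at 70% utilization:
area = estimated_block_area(1_000_000, 1.5, 0.70)
assert round(area) == 2_142_857  # ~2.14 mm^2 expressed in um^2
```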

Once the design is read in, and block sizes are determined, an initial floorplan is created by automatically placing all
blocks, shaping the soft blocks, and packing the blocks together based on global routing information. Using the block
placement results, adjacent blocks may be clustered together, or very large blocks may be divided into smaller blocks.
Modifications of the physical hierarchy at this stage may be made to take full advantage of the physical implementation
tools, and to minimize the number of top-level blocks.

The block placer must also be able to automatically perform operations such as determining the best aspect ratios for soft blocks and choosing the best among different equivalent implementations of hard blocks. Combining the block placer with a memory or macro generator leads to optimized SoC blocks, as the design planner finds a global optimum between the different possible implementations and the chip plan. After initial block placement, top-down pin
assignment is performed; top-level connectivity and timing drive the placement of the pins on the blocks. For RTL or
black box modules, pin assignment will help to create block-level constraints. Once the physical locations of pins are
known, top-level net lengths can be estimated.
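Once pin locations are known, a common way to estimate a net's length is the half-perimeter of the bounding box of its pins (HPWL). The source does not name a specific estimator, so this is a representative sketch, not the method of any particular tool.

```python
def hpwl(pin_locations):
    """Half-perimeter wirelength: a standard lower-bound estimate of a
    net's routed length once physical pin locations are known."""
    xs = [x for x, _ in pin_locations]
    ys = [y for _, y in pin_locations]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

# A top-level net connecting pins on three blocks:
assert hpwl([(0, 0), (30, 10), (20, 40)]) == 70  # 30 in x + 40 in y
```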

For each block, an internal design plan is created. Macro placement is driven by both top-down pin assignments that
were done in the previous step and internal metrics such as connectivity, timing and area. Once the internal planning
for all blocks has been completed, power route planning is done. Most recent technologies require a mesh structure.
The power routing grid and block placement grid should be carefully set to prevent connectivity problems that may
arise due to misalignment of a block with respect to power grid.
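The grid-alignment concern above can be made concrete with a small snapping check. The pitch value and function names are illustrative; real tools snap in both axes and also respect routing-track offsets.

```python
def aligned_to_power_grid(block_x, power_pitch, tol=1e-9):
    """A block origin placed off the power-mesh pitch can leave its
    power pins unconnected; alignment to the pitch guarantees the
    block's power stripes meet the top-level mesh."""
    remainder = block_x % power_pitch
    return remainder < tol or power_pitch - remainder < tol

def snap_to_grid(block_x, power_pitch):
    """Snap a block coordinate to the nearest power-grid line."""
    return round(block_x / power_pitch) * power_pitch

# A block at x = 103.7 um misses an assumed 10 um mesh pitch; snapping
# restores connectivity:
assert not aligned_to_power_grid(103.7, 10.0)
assert snap_to_grid(103.7, 10.0) == 100.0
assert aligned_to_power_grid(snap_to_grid(103.7, 10.0), 10.0)
```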
After power routing, pin assignments are refined using global routing results. The global router can identify narrow or
wide channels and move blocks around to open up congested channels and constrict sparse ones. This enables
optimum pin placement for routability during the implementation stage.

Another complexity facing SoC designers during design planning is top-level route planning. Nets between critical
blocks must be as short as possible and should often be routed over other blocks. These over-the-block nets should be
pushed down into the blocks automatically. This requires that a number of operations take place. Pins must be assigned
to the block to accommodate this new feedthrough net. Both the top-level and internal block-level netlists must be
altered to add connectivity to the feedthrough net. Top-level timing budgets must be adjusted and internal block-level
budgets must be generated to account for global timing closure and signal integrity. The use of routing over blocks
may even include reserving special routing channels and empty placement areas for repeaters. Altering blocks in this
way conflicts with the goal of having separated, or even reusable, SoC blocks, so the extent to which such techniques are used depends on the overall project goals. If turnaround time (TAT) or reuse is the primary goal, such techniques should be used very carefully. If the smallest die size or the best design performance is the primary goal, then the use of feedthroughs may be essential to achieving it.
During timing budgeting, delay of top-level nets should be calculated with the assumption that buffers will be added
to long or high fan-out nets as needed. Block budgets will be used as constraints to drive synthesis, prototyping, and
implementation of the blocks.
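One simple way to derive block budgets consistent with this assumption is to subtract the estimated delay of the (buffered) top-level nets from the cycle time and split the remainder across the blocks on the path. The proportional-split policy and the numbers below are illustrative, not the only possible budgeting scheme.

```python
def split_budget(cycle_time_ns, buffered_net_delay_ns, block_weights):
    """Apportion a chip-level cycle time across the blocks on a path.
    Top-level net delay is estimated assuming long/high-fanout nets are
    buffered; the remainder is split in proportion to each block's
    estimated share of the path logic (weights are assumptions)."""
    remaining = cycle_time_ns - buffered_net_delay_ns
    total = sum(block_weights.values())
    return {blk: remaining * w / total for blk, w in block_weights.items()}

# 10 ns cycle, 2 ns reserved for buffered top-level routing, logic
# split 3:1 between the two blocks on the path:
budgets = split_budget(10.0, 2.0, {"blockA": 3, "blockB": 1})
assert budgets == {"blockA": 6.0, "blockB": 2.0}
```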

In practice, planning may begin before all of the blocks are fully implemented, so rough estimates are initially used
instead. As the blocks progressively gain definition, it is necessary to relay the new block information back up to the
chip-level, where it is incrementally updated and the appropriate adjustments are made. This may trigger changes at
the chip level that must be pushed back down to the block level. This leads to a top-down budgeting, bottom-up
prototyping flow, which is more predictable and better suited to handle variances between block-level constraints and
actual implementation.
Although it may appear that there is a conflict between early design planning using black-box models or RTL and netlist-based design planning, this is not the case; these activities actually complement each other. Early top-down design
planning is an important step to drive RTL synthesis and to generate a gate-level netlist that is used to further refine
the design plan.

A characteristic of the continuous planning and optimization process is the use of different types of models that are
optimized for the different operations in the process. This is illustrated in the figure above. Simple block models are
used for design planning and budgeting. The physical prototypes of the blocks are built based upon the budgets from
the design plan. The physical prototypes provide valuable physical information about the final implementation of the
blocks. They will be described in the next section. The physical prototypes are then used to replace the black boxes and
RTL modules at the top level, so that we can refine the chip-level constraints. When the final budgeting is resolved, we
return to the blocks and resume their implementation, and then we finish with the top-level chip assembly.

Also, different types of models can be mixed at the top level since it is likely that all prototypes will not be completed
at exactly the same time. This enables early verification and adjustment of the chip-level constraints using a
combination of black boxes or RTL for some blocks, accurate prototypes for others, and even completed physical
layouts for some of the blocks.

Physical prototyping is an important stage of the hierarchical design flow as it provides more details about the block
implementation to the SoC designer. It bridges the gap between logical and physical design by adding physical reality
to the abstract view of the design planning process. During physical prototyping, logic optimization and global
placement are concurrently applied. At this stage, design-planning results are validated for each block and for the top-
level, and all conflicts are resolved. The prototypes uncover the problems; the corrective action is taken in the design
planning stage. Incomplete timing constraints can be discovered and addressed with the availability of accurate
physical information.
Physical prototyping is inseparably connected with physical synthesis, a process that addresses many DSM issues by combining elements of logic synthesis and physical implementation into a single stage. Physical synthesis, as
most people use it today, starts with a gate-level netlist and performs logic optimization, placement and global routing,
to produce a placed design that meets timing requirements. Physical synthesis may employ numerous techniques to
optimize the logical structure of the chip including: gate sizing, buffering, pin swapping, gate cloning, useful skew, re-
synthesis and technology re-mapping, redundancy-based optimization, and area and power recovery.
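As a toy illustration of one of these techniques, buffering of long nets, the sketch below uses a deliberately simplified quadratic (Elmore-style) wire-delay model. The coefficients are made up; real physical synthesis tools use far more detailed parasitic and cell-delay models.

```python
def net_delay_ps(length_mm, n_buffers, rc_per_mm2=100.0, buffer_delay_ps=20.0):
    """Toy delay model: wire delay grows quadratically with segment
    length, so splitting a long net into equal buffered segments trades
    a little buffer delay for a large wire-delay reduction."""
    segments = n_buffers + 1
    seg_len = length_mm / segments
    return segments * rc_per_mm2 * seg_len ** 2 + n_buffers * buffer_delay_ps

def buffers_needed(length_mm, target_ps, max_buffers=32):
    """Smallest buffer count that meets the delay target, if any."""
    for n in range(max_buffers + 1):
        if net_delay_ps(length_mm, n) <= target_ps:
            return n
    return None  # target unreachable by buffering alone

# Under these assumed coefficients, a 4 mm net is 1600 ps unbuffered,
# and two buffers bring it to about 573 ps:
assert net_delay_ps(4.0, 0) == 1600.0
assert buffers_needed(4.0, 600.0) == 2
```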

This is a significant improvement over pure logic synthesis because the logic optimization is performed and evaluated
based on cell placement that is indicative of the final placement.

It is significant to note that it no longer makes sense for RTL-to-gate synthesis tools to perform sophisticated gate-level
optimization. Without accurate physical information, logic synthesis tools cannot make good decisions about cell sizing
or buffering. Physical synthesis is much better suited for these tasks. Today, the role of RTL-to-gate logic synthesis has been reduced to producing a structural gate-level netlist as quickly as possible and passing it along to physical synthesis without attempting to optimize the sizing or buffering aspects. This has consequences for IP cores, which are
delivered as soft macros from the IP vendor to the user or implementer. The IP provider delivers either the final hard
macro or an RTL/netlist and implementation constraints to allow the optimization of the IP during the implementation
of the SoC chip.
All the information generated during the physical prototyping of blocks plays a key role in feeding back more accurate
information to the design planning stage for refinement of top-level design parameters.
The physical prototype consists of a coarse placement and an optimized netlist. Power routing, clock tree buffers, and high-fanout net buffering must all be included in the physical prototype. Without any of these items, the physical prototype will not correlate to the implementation and will not give useful results.

To create the physical prototype, a hierarchical tree of cell-clusters is built from the original netlist before the
placement starts. While building the tree, functional hierarchy and connectivity are considered. Then, the block area is
divided into placement bins, and the cell-clusters are assigned to bins among hard macros. The congestion is modeled
using wires crossing bin boundaries. During the early stages, the bins are very coarse and it is not useful to measure
timing since most of the wire capacitance is due to intra-bin nets and can only be statistically estimated. As placement
progresses, the block area is further divided into smaller bins, and placement is refined, to improve both congestion
and wirelength. The bins continue to get progressively smaller in size until, at some point, the global wires can be accurately estimated and intra-bin wire uncertainty is negligible. Physical synthesis can now start and the netlist is
transformed to meet timing constraints. The placement is not yet finalized, hence, the impact of netlist optimization
operations such as long net buffering, sizing, fan-out optimization, technology re-mapping, etc., can be easily
absorbed. The picture below shows this design process.
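The progressive bin refinement described above resembles recursive bisection placement. The sketch below splits the cell list evenly and ignores connectivity, which is exactly where a real placer would instead minimize cut nets and congestion; it is a structural illustration only.

```python
def bisect_place(cells, region, min_cells=2):
    """Recursive bisection placement sketch: cells (or cell clusters)
    are assigned to progressively smaller bins by cutting the region's
    longer side; cells in a final bin sit at the bin center."""
    x0, y0, x1, y1 = region
    if len(cells) <= min_cells:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        return {c: (cx, cy) for c in cells}
    mid = len(cells) // 2
    if (x1 - x0) >= (y1 - y0):        # cut the longer side
        left = (x0, y0, (x0 + x1) / 2, y1)
        right = ((x0 + x1) / 2, y0, x1, y1)
    else:
        left = (x0, y0, x1, (y0 + y1) / 2)
        right = (x0, (y0 + y1) / 2, x1, y1)
    placement = bisect_place(cells[:mid], left, min_cells)
    placement.update(bisect_place(cells[mid:], right, min_cells))
    return placement

# Eight clusters in a 100x100 block end up spread across four bins:
placement = bisect_place(list("abcdefgh"), (0, 0, 100, 100))
assert len(placement) == 8
assert placement["a"] != placement["h"]  # placed in different bins
```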
Similarly, clock tree synthesis can be done at the physical prototyping stage assuming the leaf instances are placed at
the center of the bins. Congestion and utilization estimates are more accurate with the inclusion of clock tree buffers.
Physical prototypes are used to validate timing budgets, area budgets, IR drop, congestion, and pin locations. The
feedback from physical prototyping back to design planning contains accurate timing abstractions (for refining
budgeting at top-level), power models (for top-level IR-Drop analysis), and congestion hot spots, which need to be
addressed by relocating pins or hard macro placement.

The top-level physical prototype will provide feedback on top-level timing closure, routing congestion, and required
channel area for buffering both clock and signal nets.

As the design becomes more and more defined, the loops between the design planning stage and prototyping will
converge. Once all blocks and the top-level are defined, the SoC designer is ready for implementation.

Sign-off is the delineation between the design refinement process described above and the final implementation. It has
changed over time to accommodate the new requirements associated with DSM process technologies. In the past, a
netlist hand-off was sufficient and provided a reliable interface between logical and physical design. As we have seen
in the previous section, a netlist generated by RTL synthesis is no longer the final netlist. Instead, a prototype containing an optimized netlist and a coarse or even final placement is used to sign off the design prior to final implementation.

Implementation completes the process by transforming the prototype into a final physical layout. Implementation
operations include detailed logic optimization, placement, and routing. Throughout the process, the design is being
continuously monitored for timing, power, clock skew and delay, IR drop, and signal integrity. Once the blocks are
finished, top-level assembly is done. Since the block-level implementations were driven by top-down constraints, top-
level surprises are eliminated.
As mentioned above, the starting point for final implementation can be a prototype with a coarse placement; in this case the final implementation proceeds using the same technology as was used to generate the physical prototype, with progressively smaller and smaller bins. At each bin level, congestion, wirelength, and timing optimizations are
incrementally run. If the starting point for implementation is a final placement, then the implementation stage
proceeds with the routing and adjusts the placement as needed.
Accurate abstractions of completed blocks are needed to perform top-level assembly and sign-off the design for
tapeout. Timing models should include interface parasitics, account for signal integrity, and should be able to consider
timing exceptions on nets that cross block boundaries. Physical models should correctly represent embedded wide
wires, via cuts near the boundaries of blocks, antenna models, and electromigration effects.
Top-level clock tree synthesis plays an important role in reducing hold violations. At the top-level, clock trees are
synthesized such that skew to each block input is adjusted to account for the insertion delay inside the block. The top-
level setup and hold violations can be identified and fixed with block timing abstracts generated using propagated
clocks. The skew to each register connected to a block-level clock pin will be included in the timing abstract if a
propagated clock is used during abstract generation. At the top-level, setup and hold violations between clocks can be
identified and addressed.
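The insertion-delay compensation described above can be sketched as follows: blocks with a deeper internal clock tree receive the top-level clock earlier, so that registers in all blocks see the same effective arrival time. The block names and delay values are illustrative.

```python
def top_level_clock_targets(block_insertion_delays_ps):
    """Balance top-level clock arrival per block: a block's top-level
    clock delay target is set so that (top-level delay + internal
    insertion delay) is equal for every block, minimizing inter-block
    skew and the resulting setup/hold violations."""
    latest = max(block_insertion_delays_ps.values())
    return {blk: latest - d for blk, d in block_insertion_delays_ps.items()}

# The block with an 800 ps internal tree gets its clock 300 ps earlier
# (0 ps of added top-level delay) than the block with a 500 ps tree:
targets = top_level_clock_targets({"cpu": 800, "dsp": 500})
assert targets == {"cpu": 0, "dsp": 300}
```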
