
ASSIGNMENT NUMBER – 1 ( SOC)

ANKIT RAJ

ROLL NO 94

BVDU COLLEGE OF ENGINEERING PUNE


System on Chip ETC84CA ENTC Sem 8

Q.1) What are the challenges SOC design teams are facing?

To design a System-on-Chip (SoC) in 90nm technology, designers must simultaneously juggle the dual challenges of
controlling both the macro level, with its "big" complexity issues, and the micro level, with its "small" physical issues,
whilst keeping to the overall constraint of time-to-market (TTM) in order to get a return on investment.

The complexity at 90nm is daunting. A 10x10mm die will be able to contain huge SoC functionality. An example of
a typical 100mm2 design could be:

Logic:    10M gates (20 x 0.5Mbit blocks, inc. CoreWare)
Memory:   39Mbits (200 instances: 1-, 2- and 4-port sync)
Package:  Flip-chip 960 (with matched pairs)
Clocks:   25 (%/MHz: 50/100, 25/200, 25/400)
CoreWare: 4 off MIPS5Kf + peripherals,
          ARM7 + peripherals,
          2 x 40Gbit SFI5/SONET I/F,
          10Gbit Ethernet/XAUI I/F

Table 1: Example possible 90nm SoC complexity

2 Micro Level Issues


Physical issues that designers face increase dramatically below 0.18um. At 90nm the SoC designer is faced with an
array of issues, key amongst these being power drop, instantaneous voltage drop (IVD), clock, crosstalk, and
reliability issues (electromigration, yield enhancement). All such design integrity issues must be solved for timing,
area and power simultaneously to get a working overall solution.

2.1 Changing Nature of Delay


Over recent technologies the nature of delay has changed from being within the cells to being within the
interconnect technology.
Fig 1: Relative cell to interconnect delay

Switching from aluminium/SiO2 to copper/low-K has helped reduce this effect, but at 90nm interconnect delay will
dominate, at approximately 75% of the overall delay [Ref 1]. Thinner, more tightly packed interconnects are the root
cause of many of the micro level issues discussed below.
2.2 Power Drop

Fig 2: A Flip-Chip power mesh

Ensuring an adequate power mesh is one of the biggest issues. At 90nm only 1V is available in the core, so for less
than a 5% voltage drop only a 50mV drop is allowed across the mesh. The mesh construction is highly dependent
upon the number of metal layers, sub-module and memory placement, and package type. LSI Logic uses in-house
tools to automatically generate a correct-by-construction power mesh.

Instance-based power estimation techniques are used to analyse IR drop to ensure requirements are met. With
so little headroom for voltage variation, implementation of the power mesh will be key.
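As a rough illustration of how tight this budget is, the 5% rule can be checked with a simple Ohm's-law sketch. The resistance and current figures below are hypothetical, chosen only to show the arithmetic, and are not LSI Logic data:

```python
def ir_drop_ok(vdd=1.0, budget_frac=0.05, mesh_res_ohm=0.05, current_a=0.8):
    """Back-of-envelope IR-drop check: V = I * R across a mesh path.

    All parameter values are illustrative, not real process data.
    """
    budget_v = vdd * budget_frac       # 5% of 1 V -> 50 mV budget
    drop_v = current_a * mesh_res_ohm  # Ohm's law estimate of the drop
    return drop_v, budget_v, drop_v <= budget_v

drop, budget, ok = ir_drop_ok()  # 40 mV drop against a 50 mV budget
```

A real power mesh analysis is instance-based and solves the full mesh network, but the same budget comparison sits at its core.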
2.3 Instantaneous Voltage Drop
Peak dynamic power usage, already important at 130nm, will be critical at 90nm. Instantaneous voltage drop
(IVD) issues will require close analysis and the insertion of local on-chip capacitors to avoid issues resulting from
excessive noise on the power mesh. Areas of high power usage within the die, especially memory, PLLs and clock
drivers, will have to be handled very carefully in this respect. LSI Logic uses in-house tools to pre-place on-chip
capacitors close to these blocks to avoid IVD failures. In addition, on-chip capacitors are also added post-placement
in order to reduce the effects of IVD on the die. The amount of capacitance added depends on the switching activity
(frequency) of the die and the types of cells used.

Fig 3: Concept of the IVD avoidance


Another method of reducing IVD now in use is to replace standard flip-flops with slower-switching versions during
the physical optimisation step of timing closure, where the paths have sufficient slack time to stand this. Special
flip-flops are designed into the library specifically for this purpose.

2.4 Clock
At 90nm clock delay and skew will be very difficult to control. The best flows will be based around automated useful
skew techniques and will control delay through branches of the clock by adjusting delay via post-clock insertion
delay cell swapping. LSI Logic uses "lsimrs", its physical optimisation tool, to insert clock trees with useful clock
skew. Clock crosstalk avoidance (via signal wire isolation) is built into such tools in order that the clocks are not
aggressors or victims to nearby signal nets.
A side benefit of useful clock skew will be to somewhat reduce IVD issues on the die by spreading the clock edges
along different clock branches.
Fig 4: Graphical description of Crosstalk
2.5 Crosstalk
Crosstalk, already a common signal integrity issue at 180/130nm yet often ignored in many SoC flows today, will
become critical at 90nm. Crosstalk is caused when an aggressor net running parallel to a victim net causes
false switching (noise) or altered timing (delay) on the victim net. Careful analysis, particularly of the delta timing
caused by the delay effects, takes roughly two weeks for a 3M gate design. This directly affects layout turn-around
time.
An alternative flow that LSI Logic uses is to add crosstalk avoidance placement/optimisation tools that add margin to
the wire delays calculated in the layout tools and the SDF timing file (lsidelay tool), in order to avoid having to run
crosstalk analysis tools at all. This does not work for all designs, since those pushing timing cannot stand
the extra margin. In these cases the extra margins have to be overridden and the extra crosstalk analysis tools are
run instead.
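The margin-based flow described above amounts to padding extracted wire delays by a pessimistic coupling factor. A minimal sketch of that idea follows; the net names and margin values are hypothetical, and a real flow would derive the margins from coupling capacitance, not assign them by hand:

```python
def add_xtalk_margin(wire_delays_ps, coupling_frac):
    """Scale each wire delay by a pessimistic per-net coupling margin,
    mimicking a margin-based (analysis-free) crosstalk flow."""
    return {net: d * (1.0 + coupling_frac[net])
            for net, d in wire_delays_ps.items()}

delays = {"n1": 120.0, "n2": 340.0}       # extracted delays, ps
margins = {"n1": 0.10, "n2": 0.25}        # larger margin for long parallel routes
padded = add_xtalk_margin(delays, margins)
```

Designs that cannot absorb the padded delays are exactly those that must fall back to full crosstalk analysis, as the text notes.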

Fig 5: Crosstalk Avoidance Flow

Automated avoidance during routing will eliminate such issues when these tools truly come on-line but such tools
are not available today.
2.6 Reliability Issues
Many of the reliability issues seen at 130nm are already addressed via tool automation and methodology changes.
These include:
• Metal antenna effects – where a charge can build up on long nets during manufacturing and blow up
the transistor connected to them. Avoided by inserting diodes or adding metal jogs to the routing to force a layer
change. The latter can cause many extra vias in the layout, which has its own reliability issues if not carefully
controlled.
• Metal Slotting effects – this is where wide wires cause "metal dishing" effects due to processing limitations.
Avoided by splitting wide wires.
• Simultaneously Switching Outputs (SSO) – where noise is injected into the power rails from many output
changes at the same time and causes false signal values. Avoided by adding power/ground pads and by I/O isolation.
• Soft Errors – Alpha particles, both naturally occurring and from lead in packaging, can cause state inversion of
a flip-flop or memory element. With shrinking technology the charge induced becomes more significant. Avoided
by hardened flip-flops, error correction built into the memories and by fault tolerant system architectures.
• Memory yield – With memory taking an ever-larger proportion of the die, roughly 60% in the example above,
overall good die per wafer will be lower than with pure logic. Avoided by adding redundant rows/columns and using
Built-In Self Repair (BISR) with the larger embedded memories.
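The benefit of redundant rows/columns can be illustrated with a textbook Poisson yield model: a memory block survives if its defect count does not exceed what the redundancy can repair. The area and defect density below are made-up numbers for illustration, not process data:

```python
import math

def memory_yield(area_cm2, defects_per_cm2, repairable=0):
    """Poisson yield model: the block is good if it has at most
    `repairable` defects (repaired via redundant rows/columns + BISR)."""
    lam = area_cm2 * defects_per_cm2   # expected defects in the block
    return sum(lam**i * math.exp(-lam) / math.factorial(i)
               for i in range(repairable + 1))

y0 = memory_yield(0.6, 1.0)                # no redundancy
y2 = memory_yield(0.6, 1.0, repairable=2)  # two repairable defects
# redundancy raises yield: y2 > y0
```

This is why BISR pays off most on the largest embedded memories, where the expected defect count per block is highest.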
2.6.1 Electromigration
Electromigration (EM) is a key reliability effect that will worsen at 90nm. EM is driven by decreasing metal widths
and increasing current density. When overstressed, metal ions tend to migrate over time, eventually causing the
connection to break. LSI Logic runs "lsisignalem" after placement to set routing rules that ensure metal and via
structures are robust enough to avoid the EM issues that can occur on signal nets. Post-route checking is also
performed to ensure that the avoidance was successful.

Fig 6: Electromigration avoidance

2.7 Timing Files


One of the "small" issues that is not under control in all flows today is accurate delay calculation. Metal
variation at 90nm will cause a vast difference in both resistance along a wire and capacitance between wires.
The overall max/min delay numbers are a complex function of rise time along the nodes varying with R and C,
where the worst-case R and C do not necessarily give the worst-case delay numbers. LSI Logic uses "lsidelay" to
generate accurate golden timing information from the RC data, which may be run on multi-processor machines for
speed. Generating real best- and worst-case numbers from extracted R/C data is a non-trivial task where over-simplified
algorithms will start to fall apart at 90nm. The tool can also handle varying PVT (Process/Voltage/
Temperature) and other factors that affect the overall timing.
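A hint of why worst-case R and C alone do not pin down worst-case delay comes from a first-order Elmore delay sketch: the delay depends on how resistance and capacitance are distributed along the net, not just on their totals. This is a gross simplification of what a golden delay calculator does, with invented numbers:

```python
def elmore_delay(rs, cs):
    """Elmore delay of an RC ladder: for each node, the resistance
    upstream of it times its capacitance, summed over all nodes."""
    delay, r_upstream = 0.0, 0.0
    for r, c in zip(rs, cs):
        r_upstream += r
        delay += r_upstream * c
    return delay

# Same total R (20 ohm) and total C (2 fF-scale units), different
# distribution along the wire -> different delay.
d1 = elmore_delay([10.0, 10.0], [1e-3, 1e-3])
d2 = elmore_delay([5.0, 15.0], [1e-3, 1e-3])
```

Since delay depends on the R/C distribution, over-simplified "scale everything to worst case" algorithms break down exactly as the text warns.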

2.8 Metal Stack


Another more physical issue, not under control in all processes today, is that of the manufacturability and reliability
of the copper/Low-K metal stack [Ref 2]. At 0.18um LSI Logic qualified Low-K with an aluminium metal stack. The
low-K dielectrics gave huge benefits in terms of reducing the effects of coupling capacitance of the interconnect.
At 0.13um LSI Logic used both Low-K and a copper metal stack. Switching from aluminium to copper has been a
steep learning curve for the industry but having got this under control moving to the 90nm technology node will
be relatively straightforward since the same basic materials will be used in the metal stack.
3 Macro Level Issues
When dealing with the "big" complexity issues, SoC design teams are being forced to face new challenges:
defining and fixing system architectures based around truly market-available IP, and then integrating in-house
designed blocks as needed to complete the functionality. Controlling the "big" boils down to picking the right IP to
suit the architecture (and vice versa), developing a solid software and early hardware verification strategy,
performing early RTL analysis on developed code, early physical planning, and a complete test strategy, all coupled
closely with tough project management and business skills.
3.1 Physical RTL optimisation
Physical RTL optimisation analysis is now being recognized by the industry as an important tool for SoC designers,
with a variety of EDA tools becoming available. Such tools comprehend the physical implementation of the RTL and
give early feedback as to poor RTL constructs that will cause problems in layout.

Fig 7: Early RTL analysis gives project control

Good RTL architecture and coding can save many man months in project timescales. The RTL analysis tools within
LSI Logic's FlexStreamTM design flow perform fast synthesis, partitioning, placement, and timing analysis of an RTL
block and provide detailed information about that block.
Such a tool highlights issues in the RTL that are likely to cause problems for the layout tools later in the flow. LSI
Logic rules built into the tool specifically highlight RTL constructs that have caused problems in the past. Designers
armed with this knowledge can then modify the architecture and coding of the RTL to avoid such issues.

One example of a typical issue is RTL that infers a huge muxing function, common in communication-switch SoCs,
which will be difficult to lay out. One alternative would be to split the muxing function in a different way. A second
example is a controller block that is shared between two sub-modules and is in the critical timing path for
both. One solution is to duplicate the controller function locally in each module.

The best RTL Analysis tools therefore provide an idea of the physical issues that have been inferred in RTL code
even before floorplanning is started. They provide very fast feedback on how to optimise the architecture and
coding which is linked directly back to the source RTL code, in a way that early floorplanning/placement tools simply
cannot.
3.2 Floorplanning
Early physical planning of big SoC designs is a prerequisite. An early floorplan showing the location of the high-speed
I/O, blocks and memories quickly gives an idea of the feasibility of the physical design and goes one stage
further than the RTL analysis tools. For example, the SFI5 physical layer interface in the design example is complex
(16 differential pairs making 40Gbit/s, i.e. 16 x 2.5Gbit/s) and requires careful placement on the die, the package and
the board. Such system-level skill sets are non-trivial and highly sought after in order to drive products quickly to
market with low risk. Floorplanning a 10M-gate design requires detailed routing of global signal and clock nets at
this early stage in order to control time of flight and define timing and area budgets for each block. Modern tool
flows, such as the FlexStream Design System, allow hierarchical design approaches for each of these sub-blocks, but
it is controlling timing closure early at this top level that is the key to a fast turn-around time and, eventually, a
successful product.

Fig 8: Typical Floorplan at 0.13um


4 Cross-Border Issues
There is a further category of issues that crosses the macro and micro levels, including test, overall chip
power/temperature and database size, that will challenge engineers at 90nm. Among the test issues: traditional
"full scan" stuck-at fault coverage test strategies are starting to take too long in production testing and are
increasingly shown to have too many test escapes, while IDDq testing is becoming less viable due to increasing transistor
leakage. Silicon vendors, EDA companies and research institutes are actively working on such issues and we are
likely to see fast-evolving test strategies in the near future, including scan compression, logic BIST and transition
fault coverage.

Overall chip power will become an increasing focus for SoC in 90nm because die temperature has a direct effect
on failure rate and therefore the reliability of the SoC. Approaches used in battery-operated devices for years, such
as slow clocking and sleep modes as well as the more usual gated clocking, grey code addressing and memory
splitting will be widely used. EDA tools will have to truly consider the third axis of power (as well as time and area)
within the design flow.
4.1 Database Sizes
The last cross-border challenge to be highlighted is that of file and database size. An example of controlling
database size, and therefore turn-around-time, is that of the typical timing signoff flow today: SPEF files (RC data)
are extracted at chip level, then SDF files are generated using the silicon vendor's golden timing engine and an STA
tool will analyze this database. Final flat timing runs like this already take several days, each intermediate file taking
several Gbytes of data, and running only on a machine with a 64bit operating system. Short term, key tools such as
LSI Logic's delay calculator "lsidelay" that generates the SDF have been adapted to run on multi-threaded and multi-
processor compute farms. Longer term the industry will adopt methodologies such as OLA library models (a library
with a built in golden timing calculator supplied by the silicon vendor) and OpenAccess common databases such
that extraction, delay calculation and STA analysis can be accomplished in a much more efficient manner. Using a
single database into which all tools can plug will completely avoid the need for many intermediate files of varying
formats, each with differing limitations [Ref 1].
Fig 9: File and database issues

In general, the management task of generating and controlling a machine, software and human resource
infrastructure to enable SoC design within time-to-market constraints could end up being the biggest challenge of
all. This is especially true as it involves the cross-industry collaboration of silicon vendors, EDA vendors and system
houses.

Q.2) What role processor plays as basic building block for SOC?
The arrival of affordable multiprocessing in embedded systems provides engineering teams with much more
flexibility than they have previously had. It is often easier to allocate tasks to different processors than try to
schedule all operations on just one fast CPU.
The move toward multiple processor system-on-chip (SoC) designs is very real. Multiple processors are used in
consumer devices ranging from low-cost ink-jet printers to mobile phones. Most of the newest network processors
are based on multiple processor designs. The CRS-1, a router designed by Cisco Systems, employs 188 processors
on a single chip, and multiple chips within the system.

A task-based analysis can show how multiple processors can be used efficiently in a system. Tasks that are mostly
independent can be allocated to different processors, with intertask communication handled using message
passing and shared-memory data structures (Figure 1). Each individual task that runs on a particular processor can
be accelerated through the use of custom instructions that are dedicated to the most common operations needed
by the task.
Figure 1: Simple heterogeneous system partitioning

If more performance is needed, the task can be decomposed into a set of parallel tasks (Figure 2) running on a set
of optimised, inter-communicating processors. Conversely, multiple low-bandwidth tasks can be run on one
processor by time-slicing them. This approach degrades parallelism, but may improve SoC cost and efficiency if the
processor has enough available computation cycles.
Figure 2: Parallel task system partitioning

The first stage is to determine the performance required of the system. If the tasks are represented as algorithms
in a programming language such as C, early system modelling can verify the functionality and measure the data
transfers between tasks. At this stage, tasks have not been allocated to processors, and communications among
tasks is still expressed abstractly.

An early abstract system simulation model serves as the basis for sizing the computational demands of each task.
This information is not exact, but can yield important insights into both computational and communication hot
spots.
Using system simulation throughout the design process has two advantages. First, an early start to simulation
provides insight into bottlenecks. Second, the model’s role as a performance predictor gradually evolves into a role
as a verification test bench. To test a subsystem, a designer replaces the subsystem’s high-level model with a lower-
level implementation model.
There are two guidelines for mapping tasks to processors. The first is that the processor must have sufficient
computational capacity to handle the task. The second guideline is that tasks with similar requirements should be
allocated to the same processor as long as the processor has the computational capacity to accommodate all of
the tasks.
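These two guidelines can be sketched as a greedy allocator: largest tasks first, each placed on the first processor with spare capacity. The task loads and processor capacities below are invented for illustration; real sizing would come from the system simulation model described earlier:

```python
def map_tasks(tasks, processors):
    """Greedy task-to-processor allocation (illustrative, not optimal):
    assign each (name, load) task, largest first, to the first
    processor with enough remaining computational capacity."""
    remaining = dict(processors)          # processor -> spare cycles
    mapping = {}
    for name, load in sorted(tasks, key=lambda t: -t[1]):
        for proc, cap in remaining.items():
            if cap >= load:
                mapping[name] = proc
                remaining[proc] -= load
                break
        else:
            raise ValueError(f"no processor can fit task {name}")
    return mapping

tasks = [("video", 600), ("audio", 200), ("ui", 100)]
procs = {"dsp": 700, "risc": 400}
mapping = map_tasks(tasks, procs)
```

A production mapping would also weigh task affinity (DSP-like vs control-like work), which the second guideline captures and this sketch omits.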

The choice of processor type is important. A control task needs substantially more cycles if it is running on a simple
DSP rather than a Risc processor. A numerical task usually needs more cycles running on a Risc CPU than a DSP.
The combination of Risc processors and DSPs calls for the use of multiple software tools, which complicates
development.
Developers would prefer to use multiple instances of the same general-purpose processor. But many standard,
general purpose 32bit Risc processors are not fast enough to handle critical parts of some applications. The
standard approach to providing greater performance partitions the application between software running on a
processor and a hardware accelerator block, but this has serious limitations.

The methods for designing and verifying large hardware blocks are labour-intensive, error-prone and slow. If the
requirements for the portion of the application running on the accelerator change late in the design or after the
SoC is built, a new silicon design may be needed, further adding to cost. There may even be a performance hit.

Moving data back and forth among the processor, accelerator, and memory may slow total application throughput,
offsetting much or all of the benefit derived from hardware acceleration.

Ironically, the promise of concurrency between the processor and the accelerator is also often unrealised because
the application, by nature of the way it is written, may force the processor to sit idle while the accelerator performs
necessary work. In addition, the accelerator will be idle during application phases that cannot exploit it.

Configurable and extensible processors offer several advantages for accelerator design. First, they incorporate the
accelerator function into the processor, eliminating the processor-accelerator communication overhead. The
configurable approach makes the accelerator functions far more programmable and significantly simplifies
integration and testing of the total application. It also allows the acceleration hardware to have intimate access to
all of the processor's resources.
Further, converting the accelerator to a separate processor configured for application acceleration allows the
second task to run in parallel with the general-purpose processor, receiving commands through registers or
through shared data memory.

Once the rough number and types of processors are known and tasks are tentatively assigned to the processors,
basic communication structure design starts. The goal is to discover the least expensive communications structure
that satisfies the bandwidth and latency requirements of the tasks.
When low cost and good flexibility are the most important considerations, a shared-bus architecture, in which all
resources are connected to one bus, may be the most appropriate. The liability of the shared bus is long and
unpredictable latency, particularly when a number of bus masters contend for access to different shared resources.
A parallel communications network provides high throughput with flexibility. The most common example is a
crossbar connection with a two-level hierarchy of busses (Figure 3).
Figure 3: General-purpose parallel communications style: on-chip mesh network

Traditional processor cores provide only the block-oriented, general-bus interface. Configurable and extensible
processors allow faster, more flexible communications using direct processor-to-processor connections to reduce
cost and latency (Figure 4).
The memory used by a system introduces a further set of trade-offs. Off-chip RAM is much cheaper than on-chip
RAM, at least for large memories. The designer needs to look at the memory-transfer requirements of each task to
ensure that the memory used can handle the traffic. When on-chip RAM requirements are uncertain, caches can
improve performance, and shared memories aid inter-processor communication. But it is important to watch for
contention latency in memory access. Increasing the memory width or increasing the number of memories that can
be active may be used to overcome contention bottlenecks.

Figure 4: Optimised direct parallel communications

Even though processors make a potent alternative to hardwired logic blocks, often RTL blocks have already been
designed and verified, so it is important to re-use them if appropriate. Two interface mechanisms to RTL blocks are
generally used. The first mechanism maps hardware registers into local memory space. This makes the hardware
block look much like an I/O device, and makes the controlling software look much like a standard device driver.
The alternative hardware interface mechanism that can be used is to extend the instruction set to directly stimulate
hardware functions. With configurable processors, the designer can specify new processor instructions that take
hardware block outputs as instruction-source operands and use hardware block inputs as instruction-result
destinations. This avoids the use of intermediate registers and greatly accelerates the task by eliminating I/O
overhead.
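The first, memory-mapped mechanism can be sketched in software terms as a toy register-mapped block driven in device-driver style. The register offsets and the "start"/"done" bit semantics below are hypothetical, invented purely to show the access pattern:

```python
class MappedBlock:
    """Toy model of an RTL block whose control/status registers are
    mapped into the processor's address space (offsets hypothetical)."""
    CTRL, STATUS, DATA = 0x00, 0x04, 0x08   # illustrative register map

    def __init__(self):
        self.regs = {self.CTRL: 0, self.STATUS: 0, self.DATA: 0}

    def write(self, offset, value):
        self.regs[offset] = value
        if offset == self.CTRL and value & 0x1:       # "start" bit set
            self.regs[self.DATA] ^= 0xFF              # stand-in for real work
            self.regs[self.STATUS] = 0x1              # "done" flag

    def read(self, offset):
        return self.regs[offset]

blk = MappedBlock()
blk.write(MappedBlock.DATA, 0x0F)   # load operand
blk.write(MappedBlock.CTRL, 0x1)    # driver-style kick-off
```

The instruction-extension mechanism removes exactly these write/read round trips by moving the operand and result directly into instruction operands.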

As designers get comfortable with a processor-based approach, processors have the potential to become the even
more powerful building blocks for next-generation SoC designs, and SoC designers will turn to a processor-centric
design methodology that has the potential to solve the ever-increasing hardware/software integration dilemma.

Q.3) Explain in detail various components of new SoC design flow?

SoC, the acronym for system on chip, is an IC which integrates all the components into a single chip. It may
contain analog, digital, mixed-signal and other radio frequency functions, all lying on a single chip substrate. Today,
SoCs are very common in the electronics industry due to their low power consumption. Also, embedded
system applications make great use of SoCs.

An SoC consists of:

 Control Units: In SoCs, the major control units are microprocessors, microcontrollers, digital signal
processors etc.
 Memory Blocks: ROM, RAM, Flash memory and EEPROM are the basic memory units inside an SoC chip.

 Timing Units: Oscillators and PLLs are the timing units of the System on chip.

 Other peripherals of the SoCs are counter timers, real-time timers and power on reset generators.

 Analog interfaces, external interfaces, voltage regulators and power management units form the basic
interfaces of the SoCs.

SoC Structure: Design Flow


The design flow of an SoC covers the development of both the hardware and software of the design. In general, the
design flow of an SoC consists of:

 Hardware and Software Modules: Hardware blocks of SoCs are developed from pre-qualified hardware
elements, and software modules are integrated using a software development environment. Hardware
description languages like Verilog, VHDL and SystemC are used for the development of the modules.
 Functional Verification: The SoC is verified for logic correctness before it is given to the
foundry.

 Verify hardware and software designs: For the verification and debug of hardware and software of SoC
designs, engineers have employed FPGA, simulation acceleration, emulation and other technologies.

 Place and Route: After the debugging of the SoC, the next step is to place and route the entire design on
the integrated circuit before it is given to fabrication. In the fabrication process, full custom,
standard cell and FPGA technologies are commonly used.

Advantages of SoC

 Low power.

 Low cost.

 High reliability.

 Small form factor.

 High integration levels.

 Fast operation.

 Greater design flexibility.

 Small size.

Disadvantages of SoC

 Fabrication cost.

 Increased complexity.

 Time to market demands.

 More verification.

SoC Varieties

 NVIDIA Tegra 3

NVIDIA Tegra 3 is an SoC of the Tegra family and is used in various Android devices. Devices like the Asus Eee
Pad, HTC One X and Google Nexus tablet use the Tegra 3. It comes with a quad-core CPU plus a fifth companion
core. Each main core is an ARM Cortex-A9, while the fifth core is made on a low-power silicon process and runs at
500MHz.

 Qualcomm Snapdragon S4

Qualcomm's Snapdragon S4 is important wherever Android smartphones and tablets are concerned. It has a
processor which is similar in capability to the ARM Cortex-A15 CPU.

 Samsung Exynos 4 Quad

This SoC is based on the ARM architecture. It has a 1.4GHz quad-core ARM Cortex-A9 CPU and an ARM Mali-400
MP4 quad-core GPU. This processor supports many applications like 3D gaming, multi-tasking, and video recording
and playback.

 Intel Medfield
Medfield SoCs are not based on the ARM architecture; Intel uses its x86 technology to make these SoCs. Medfield
SoCs can offer OEMs a 1.6-2GHz single-core processor and PowerVR's SGX540 GPU.

 Texas Instruments OMAP 4

It is the fourth generation of OMAP, in which the ARM Cortex-A9 on a 45nm process is used. Some Android devices
that use this SoC are the Motorola Atrix 2, Motorola Droid RAZR, LG Optimus 3D and LG Optimus Max.

SoC Design Challenges

The different SoC design challenges are given below:

1. Architecture Strategy
2. Design for Test Strategy

3. Validation Strategy

4. Synthesis and Backend Strategy

5. Integration Strategy

6. On chip Isolation

 Architecture Strategy

The kind of processor used to design the SoC is an important factor to be considered, as is the kind of bus to be
implemented.

 Design for Test Strategy


Most of the common physical defects are modeled as faults here, and the necessary circuits included in the SoC
design help in checking for these faults.

 Validation Strategy
The validation strategy of SoC designs involves two major issues. The first is that we have to verify the IP cores;
the second is that we need to verify the integration of the system.

 Synthesis and Backend Strategy


There are many physical effects that have to be considered while designing the SoC synthesis and backend strategy:
effects like IR drop, crosstalk, 3D noise, antenna effects and EMI. In order to tackle these issues, chip planning,
power planning, DFT planning, clock planning, and timing and area budgeting are required in the early stage of the design.

 Integration Strategy
In the integration strategy, all the above listed factors have to be considered and assembled to bring out a smooth
overall strategy.

 On chip Isolation

In on-chip isolation, many effects, like the impact of process technology, grounding effects, guard rings, shielding and
on-chip decoupling, are to be considered.

ARM Holdings and SoC


System on Chip devices became popular because of some major breakthroughs provided by ARM Holdings, a British
company that has contributed significantly to the field of embedded systems. ARM developed and licensed
processor designs that could be used by other companies to develop chips. This enabled greater flexibility in design
and manufacture of chips. Chip manufacturers could build upon these CPU designs and add other necessary
components to come up with an SoC.
IP Cores

IP Cores, or Intellectual Property Cores, are fundamental building blocks of an SoC. An IP core is a reusable unit of
IC design provided by companies like ARM to chip manufacturers, subject to license agreements. IP cores can be soft
cores or hard cores. Soft cores are generally RTL descriptions written in some hardware description language; they
are called so because they can be subjected to small changes to suit the design. Hard cores are mostly analog
components and certain digital cores whose function cannot be changed by designers.
Open Cores

One of the advantages of having reusable IC design layouts is that it facilitates a more open approach to design.
With philosophies similar to the Free Software movement, an open-source hardware community exists that
develops digital open-source hardware. This community, called OpenCores, publishes core designs under the Lesser
General Public License (LGPL). Its aim is to develop tools and standards for open-source cores and platforms and
provide documentation for the same.

Software and Protocol Stacks

Hardware is not the only focus during SoC design. The chips developed must be supported by software drivers that
control the operation of the hardware. Since an SoC often has to manage networking, protocol stacks have to be
written along with the drivers. These stacks are software implementations of networking protocols.
Functional Verification

Functional verification is a very important task in SoC manufacturing. It is the process of verifying that the hardware
developed follows the logic intended by the designer. This involves testing the behaviour of the hardware against
the various permutations and combinations of situations. The sheer number of such possibilities makes this process
extremely challenging. Experts use various methods to reduce this number and take the help of software tools, such
as those from Aldec, at this stage.
On-chip debugging

On-chip debugging is emerging as a cheap alternative to simulation- and emulation-based verification techniques.
Simulations are not very close to physical hardware, and methods to improve simulation capabilities can be very
costly. The cost can be brought down, and more effective debugging ensured, by the use of instrumentation
techniques for on-chip debugging. On-chip debugging differs from one SoC component to another. We may
look at on-chip debugging of a processor as an example.

On-Chip Debugging of a Processor

A typical On-Chip Debug System (OCDS) uses a JTAG interface. It consists of three blocks: the OCDS module, the core
debug port and the JTAG module. The debugging operation is controlled using breakpoints. Breakpoints are triggers
that alter the sequential operation of a processor. The processor is driven into the required modes using breakpoints
for performing analysis.

Instruction Breakpoints

Instruction breakpoints are triggers set against the instruction value of the processor. This helps in tracking the
occurrence of instructions in the processor. A processor OCDS can monitor concurrent instructions by having multiple
instruction breakpoints. Instructions are evaluated by comparing the data or register values set by instructions.
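A value/mask comparator of the kind such breakpoints typically use can be sketched as follows. The instruction encodings and the mask scheme are arbitrary examples, not tied to any particular processor's OCDS:

```python
class InstructionBreakpoint:
    """Toy OCDS-style breakpoint: fires when the fetched instruction
    word matches a value/mask pair (encodings are illustrative)."""
    def __init__(self, value, mask=0xFFFFFFFF):
        self.value, self.mask = value, mask

    def hit(self, instr):
        return (instr & self.mask) == (self.value & self.mask)

# Match on the upper 16 bits only, i.e. any instruction with this
# opcode/register field regardless of its immediate operand.
bps = [InstructionBreakpoint(0x8C010000, mask=0xFFFF0000)]
halted = any(bp.hit(0x8C01002A) for bp in bps)
```

Having several such comparators in parallel is what lets the OCDS watch for multiple instructions concurrently.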

Fabrication of SoC
The most common methods for fabricating SoCs are as standard cells, by full custom design, or using FPGAs. Full
custom designing involves specifying the layout of every component of the hardware design. Due to the labor-intensiveness
of this method, it is preferred only when a large number of repetitions is needed. A more common method is the use
of standard cells, which are libraries already created by full custom design. FPGAs allow implementation of complex
combinational logic functions by a user with the help of programmable logic blocks and interconnects.
Applications of SoCs

The most common use of SoCs has been found in the mobile devices industry. The use of SoCs has enabled
manufacturers of such devices to come up with devices of very small form factor that offer ample
performance. It also enables them to focus on the features they project to their target customers rather than relying
on the capabilities of chips provided by some other company. SoCs also brought about a revolution in embedded
systems by paving the way for very small and portable single-board computers.

Examples of SoCs
Most of the SoCs available in the market today are ARM based. Some examples among SoCs in smartphone industry
are Qualcomm's Snapdragon SoCs, the Apple A4, and the Nvidia Tegra series. The Raspberry Pi 2 comes with the
Broadcom BCM2836 SoC. Several SoCs have been developed by the OpenCores community.

