Emulation
Price/Gate
Initialization and Dedicated Support
Capacity
Primary Target Designs
Speed Range
Partitioning
Compile Time
Visibility
Debug
Virtual Platform API
Transactor Availability
Verification Language and Native Support
Number of Users
Memory Capacity
---- ---- ---- ---- ---- ---- ----
1. Price/Gate
The actual cost of an emulator typically runs from 1 to 5 cents
per gate for higher-capacity emulators, both processor-based and
FPGA-based.  There is usually some recurring cost for software
and maintenance.
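
As a rough illustration of the cents-per-gate range above, the sketch below converts a gate capacity into a hardware price range. The function name and the 100 M gate example are hypothetical, and the recurring software and maintenance costs are not modeled.

```python
def emulator_price_range(gates, low_cents=1.0, high_cents=5.0):
    """Return (low, high) hardware price in dollars for a gate capacity.

    The 1 to 5 cents-per-gate defaults come from the text above;
    everything else here is an illustrative assumption.
    """
    if gates < 0:
        raise ValueError("gates must be non-negative")
    return gates * low_cents / 100.0, gates * high_cents / 100.0


# Hypothetical example: a 100 M gate emulator.
low, high = emulator_price_range(100_000_000)
print(f"${low:,.0f} to ${high:,.0f}")
```

At these rates the spread between the low and high end of the range is a factor of five, so the per-gate price is worth pinning down before comparing vendors on other metrics.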
- Category 1. [...] processor is implemented in an
  ASIC-structured custom fabric.  Mentor Veloce's processor is
  implemented in a custom FPGA fabric.

- Category 2. Emulation with a standard FPGA product at its
  core.  Synopsys-EVE is currently the player in this sector.

- Category 3. Other emulators with HW based on a standard FPGA
  product.  The primary differentiator between category 2 and 3
  is capacity; however, there are other differences as shown
  below.

  Aldec, Bluespec, Cadence RPP, Dini Group, S2C, and
  HyperSilicon are primary vendors in this segment.
The emulator best suited to a designer is defined by the problem
they are trying to solve.
Dedicated Support
  Category 1 or 2:  yes
  Category 3:       mixed

Design Capacity
  Category 1:  Claims up to 2 billion.  Typical usage 100 M to
               1 B gates.
  Category 2:  Claims up to 1 billion.  Typical usage 25 M to
               200 M gates.
  Category 3:  Claims up to 50+ million.  Typical usage 2 M to
               25 M gates.

Primary Target Designs
  Category 1:  SoCs 100 M to 1 B gates.  Large CPUs, GPUs,
               multi-chip systems, application processors.
  Category 2:  SoCs from 25 M to 200 M gates.
  Category 3:  IP blocks, sub-system, and SoCs from 2 M to 25 M.

Speed range (cycles/sec)
  Category 1:  100 K to 2 M
  Category 2:  500 K to 5 M
  Category 3:  500 K to 20 M

Compile time
  Category 1:  10-30 M gates/hour.  Single workstation
               (Palladium).  PC farm (Veloce).  Includes
               automated partitioning time.  Parallelizable: Yes
  Category 2:  25 M - 100 M gates/hr for PC farm.  Proprietary
               software for fast FPGA partitioning, synthesis
               and P&R.  Parallelizable: Yes
  Category 3:  1 M - 15 M gates/hr for PC farm.  Constrained by
               FPGA vendor synthesis and P&R times.  Doesn't
               include partitioning time.  Parallelizable: Yes

Partitioning
  Category 1:  automated
  Category 2:  automated
  Category 3:  semi-automated.  Partitioning depends on # of
               FPGAs.  Time range 30 min to 4 hours.

Visibility
  Category 1 or 2:  static, dynamic probes.  At-speed probe
                    capture.
  Category 3:       static, dynamic probes (vendor dependent).
                    At-speed probe capture (vendor dependent).

Debug
  Category 1:  Breakpoints, assertions, simulation hot-swap,
               SW debug.
  Category 2:  Breakpoints, assertions, simulation hot-swap,
               SW debug.
  Category 3:  Breakpoints, assertions, simulation hot-swap,
               SW debug.

Virtual platform API
  Category 1:  Yes
  Category 2:  Yes
  Category 3:  varies by vendor

Transactor Availability
  Category 1:  Standard/off-the-shelf: Good.  Custom: developed
               ad hoc
  Category 2:  Standard/off-the-shelf: Good.  Custom: developed
               ad hoc
  Category 3:  Standard/off-the-shelf: Mixed.  Custom: developed
               ad hoc

Verification Language Native support
  Category 1:  C++, SystemC, Specman e, SystemVerilog, OVM, SVA,
               PSL, OVL
  Category 2:  Synthesizable Verilog, VHDL, System Verilog
  Category 3:  Synthesizable Verilog, VHDL, System Verilog

Memory
  Category 1:  up to 1 TB
  Category 2:  up to 200 GB
  Category 3:  up to 32 GB

Users
  Category 1:  1 to 512 users
  Category 2:  1 to 49
  Category 3:  1 user
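
To make the compile-time figures above concrete, the following sketch turns a throughput range in gates/hour into a best-case and worst-case compile time. The 20 M gate design size and all names are hypothetical. The throughput ranges are the ones quoted above, and for Category 3 the separately quoted 30 min to 4 hour partitioning time is added on top, since its compile figure excludes partitioning.

```python
# Compile throughput in gates/hour (slow, fast), from the figures above.
COMPILE_RATE = {
    "category_1": (10e6, 30e6),
    "category_2": (25e6, 100e6),
    "category_3": (1e6, 15e6),
}

# Category 3 compile time does not include partitioning (30 min to 4 hours).
EXTRA_PARTITION_HOURS = {"category_3": (0.5, 4.0)}


def compile_hours(design_gates, category):
    """Return (best, worst) turnaround in hours for one compile."""
    slow, fast = COMPILE_RATE[category]
    extra_lo, extra_hi = EXTRA_PARTITION_HOURS.get(category, (0.0, 0.0))
    best = design_gates / fast + extra_lo
    worst = design_gates / slow + extra_hi
    return best, worst


# Hypothetical 20 M gate design, small enough to fit every category.
for cat in COMPILE_RATE:
    best, worst = compile_hours(20e6, cat)
    print(f"{cat}: {best:.1f} to {worst:.1f} hours")
```

This is only a linear estimate; it assumes throughput stays constant with design size, which the figures above do not guarantee.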
Here is my quick summary of the different emulation vendors for
2013.

Category 1:

- Cadence Palladium.  Hats off to Cadence for being pioneers in
  emulation and sustaining innovation to maintain a very
  competitive product year-over-year.

- Mentor Veloce.  Their revenue numbers show emulation is a
  growing segment for them.  (See ESNUG 510 #7.)  Clearly Wally
  and Greg have been investing heavily in emulation.

Category 2:

- Synopsys EVE Zebu.  This has been the choice for companies and
  design groups doing mid-size SoCs or blocks for emulation.  It
  is no secret that Intel was an EVE customer.  (See ESNUG 508
  #6.)  My expectation is that with the Synopsys acquisition,
  EVE will now move upstream to challenge Cadence and Mentor at
  the high end.

Category 3:
---- ---- ---- ---- ---- ---- ----
software simulation,
simulation acceleration,
FPGA prototyping, and
emulation.
---- ---- ---- ---- ---- ---- ----

SOFTWARE SIMULATORS
A simulator is a software program that simulates an abstract
model of a particular system by taking an input representation
of the product or
---- ---- ---- ---- ---- ---- ----

SIMULATION ACCELERATION
Simulation acceleration runs a design written in a hardware
description language, such as Verilog or VHDL, on dedicated
hardware according to a verification specification.  The results
are the same as in simulation, but are produced faster.

- Often simulation accelerators will use hardware such as GPUs
  (e.g. NVIDIA Kepler) or FPGAs with embedded processors.

- Simulation acceleration involves mapping the synthesizable
  portion of the design into a hardware platform specifically
  designed to increase performance by evaluating the HDL
  constructs in parallel.  The remaining portions of the
  simulation are not mapped into hardware, but run in a software
  simulator on a PC/workstation.

- The software simulator works in conjunction with the hardware
  platform to exchange simulation data.  Acceleration removes
  most of the simulation events from the slow PC software
  simulator and
---- ---- ---- ---- ---- ---- ----

EMULATORS
An emulator maps an entire design into gates or Boolean macros
that are then executed on the emulator's implementation fabric
(parallel Boolean processors or FPGA gates) such that the
emulated behavior exactly matches the cycle-by-cycle behavior of
the actual system.

- Processor-based emulator.  The design under test is mapped to
  special purpose Boolean processors.

- FPGA-based emulator.  The design under test is mapped to FPGA
  gates as processing elements.

Elsewhere in this report, I go into more detail on emulation
including: emulation drivers; metrics to evaluate emulation; and
a top-level comparison chart of commercial emulation systems
against those metrics.
---- ---- ---- ---- ---- ---- ----

FPGA PROTOTYPING
An FPGA prototype is the implementation of the SoC or IC design
on an FPGA.  The prototype environment is real, with real input
and output streams.  Thus the FPGA prototyping platform can
provide full verification for hardware, firmware, and
application software design functionality.
Some problems associated with FPGA prototyping are:

- Debug Confusion: Because you mapped your design into an FPGA,
  you can expect to spend some extra time debugging it, to
  identify problems that are relevant ONLY to your prototype,
  but that are not necessarily bugs inside your actual design.

- Partitioning: Your design must be partitioned across multiple
  FPGAs.  Further, sometimes repartitioning may be necessary
  when design changes are made.  Partitioning challenges can
  also apply to emulation, so I discuss them in the emulation
  metrics section.

- Timing (Impedance) Mismatches: If your FPGA prototype connects
  to real world interfaces, such as Ethernet or PCIe, then you
  have to ensure that it is capable of supporting the interface.
  That is, mismatched timing can sometimes be a problem.  This
  can involve "speed bridging" to an FPGA.

If your design can fit into a few FPGAs, and you have adequate
support, then FPGA prototyping can be very effective --
especially when real-time performance is vital.
---- ---- ---- ---- ---- ---- ----

A BASIC COMPARISON
---- ---- ---- ---- ---- ---- ----
2012
---- ---- ---- ---- ---- ---- ----
Source:
Aldec
- The first board has 6 FPGAs; each FPGA is a Virtex-6 LX550T
  capable of supporting 4 M gates for a total board capacity of
  24 M gates.

- The second board has only 2 FPGAs; each FPGA is a Virtex-7
  7V2000T capable of supporting 14 M gates for a total board
  capacity of 28 M gates.

Cutting a design in half is much easier and safer than cutting a
design into 6 parts.  This lessens a major adoption obstacle for
FPGA-based emulators, helping them to close the gap with custom
processor based emulators; as well as making emulation more
attractive for smaller designs.
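
The board arithmetic above can be sketched as a simple capacity calculation. The function name and the 24 M gate example design are hypothetical, and the model assumes every FPGA can be filled to its quoted gate capacity, which ignores utilization and inter-FPGA pin limits.

```python
def fpgas_needed(design_gates, gates_per_fpga):
    """Smallest number of FPGAs whose combined capacity holds the design."""
    if gates_per_fpga <= 0:
        raise ValueError("gates_per_fpga must be positive")
    # Ceiling division on integers, so a partial FPGA counts as a whole one.
    return -(-design_gates // gates_per_fpga)


# Hypothetical design that exactly fills the first (6 FPGA) board.
DESIGN_GATES = 24_000_000

print(fpgas_needed(DESIGN_GATES, 4_000_000))   # 4 M gates per FPGA
print(fpgas_needed(DESIGN_GATES, 14_000_000))  # 14 M gates per FPGA
```

Under these assumptions the same design drops from six partitions to two, which is the effect the comparison above is describing.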

Higher capacity FPGAs also push down the cost of the emulation
systems; for example, some systems now use off-the-shelf FPGAs
at $4 K per FPGA, reducing their core hardware cost.  (Further,
lower capacity emulators can often add next generation FPGAs
shortly after they are released by Xilinx and Altera.)
---- ---- ---- ---- ---- ---- ----
SW Simulation
  Computational element:             X86 cores
  Granularity (# of comp elements):  under 16
  Speed per comp element:            3 GHz
  Cycles per sec (100 M gates):      10 to 1,000
  Vendors:  Cadence Incisive/NCSim, Synopsys VCS, Mentor Questa

Simulation acceleration
  Computational element:             GPU processing elements
  Granularity (# of comp elements):  100's, millions
  Speed per comp element:            under 1 GHz
  Cycles per sec (100 M gates):      NVIDIA: 17x; 40 to 10,000
  Vendors:  Rocketick, Cadence Palladium XP + RTL Sim,
            Mentor Veloce + RTL Sim, Synopsys Zebu Server +
            RTL Sim

Processor-based emulation
  Computational element:             custom processors
  Granularity (# of comp elements):  100s of 1000s to millions
  Speed per comp element:            under 1 GHz
  Cycles per sec (100 M gates):      100 K to 2 M, processor
                                     based scales better with
                                     design size than FPGA
  Vendors:  Cadence Palladium

FPGA-based emulation
  Computational element:             FPGA gates
  Granularity (# of comp elements):  millions
  Speed per comp element:            1 MHz to 100 MHz
  Cycles per sec (100 M gates):      500 K - 2 M, does not scale
                                     well with design size, so
                                     at 100MG to reach the M
                                     range is very unlikely.
                                     Debug causes further slow
                                     down.
  Vendors:  Mentor Veloce, Synopsys EVE-Zebu

FPGA prototyping
  Computational element:             FPGA gates
  Granularity (# of comp elements):  millions
  Speed per comp element:            1 MHz to 100 MHz
  Cycles per sec (100 M gates):      2M - 50 M, sometimes up to
                                     100M
  Vendors:  Synopsys HAPS, Cadence RPP, DINI, Aldec, S2C, HOENS,
            Hitech Global, ProDesign
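
As a rough way to read the cycles-per-second figures above, the sketch below converts them into wall-clock time for a fixed workload. The 1 billion cycle workload and all names are hypothetical; the rate ranges are the ones quoted above for a 100 M gate design, and the calculation ignores compile time and debug slowdown.

```python
def run_seconds(cycles, cycles_per_sec):
    """Wall-clock seconds needed to execute `cycles` at a given rate."""
    if cycles_per_sec <= 0:
        raise ValueError("cycles_per_sec must be positive")
    return cycles / cycles_per_sec


# Hypothetical workload, e.g. a long software bring-up sequence.
WORKLOAD_CYCLES = 1_000_000_000

# (slow, fast) cycles/sec ranges as quoted above for 100 M gates.
RATE_RANGES = {
    "SW simulation": (10, 1_000),
    "Processor-based emulation": (100_000, 2_000_000),
    "FPGA-based emulation": (500_000, 2_000_000),
    "FPGA prototyping": (2_000_000, 50_000_000),
}

for name, (slow, fast) in RATE_RANGES.items():
    best = run_seconds(WORKLOAD_CYCLES, fast)
    worst = run_seconds(WORKLOAD_CYCLES, slow)
    print(f"{name}: {best:,.0f} s to {worst:,.0f} s")
```

Even at the top of its range, software simulation needs on the order of a million seconds for this workload, while the hardware-based approaches finish in minutes to hours.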
                 Palladium P64              Veloce 2 Quattro

Granularity      4 MG to 256 MG             16 MG to 256 MG
                 resulting in much better
                 utilization

Number of users  64 users                   16 users

Speed            2 MHz                      1-1.5 MHz
                 (as per datasheet)         (as per datasheet),
                 scaling with design size   degrading with design
                                            size due to
                                            architecture

Capacity         256 MG nominal             256 MG nominal
                 90% to 100% utilization    60% to 75% utilization
                 => 256 MG actual           => 200 MG actual
                 capacity                   capacity
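
The capacity figures above follow from multiplying nominal capacity by utilization. The sketch below shows that arithmetic for the two utilization ranges quoted above; the function name is hypothetical. Note that 60% to 75% of 256 MG works out to roughly 154 to 192 MG, so the 200 MG figure quoted above appears to be a rounded upper bound.

```python
def effective_capacity_mg(nominal_mg, util_low, util_high):
    """Return (low, high) usable capacity in millions of gates (MG)."""
    if not 0.0 <= util_low <= util_high <= 1.0:
        raise ValueError("utilization must satisfy 0 <= low <= high <= 1")
    return nominal_mg * util_low, nominal_mg * util_high


# 256 MG nominal at 90% to 100% utilization.
print(effective_capacity_mg(256, 0.90, 1.00))

# 256 MG nominal at 60% to 75% utilization.
print(effective_capacity_mg(256, 0.60, 0.75))
```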