
A Formal Approach for PCI Express Validation

with IFV (Incisive Formal Verifier)


CDNLive 2005

Salem Emara, ATI Technologies Inc.


Lawrence Sasaki, ATI Technologies Inc.
Wayne Wu, ATI Technologies Inc.
semara@ati.com, 905-882-2600, X3774
lsasaki@ati.com, 905-882-2600, X3461
wwu@ati.com, 905-882-2600, X3077

Abstract

Since the introduction of PCI Express, design and validation have been both challenging and laborious. The
difficulty arises from the protocol's inherent complexity as well as the communication between the multiple design
teams involved.

The verification infrastructure is built on an Intel bus functional model (BFM) integrated into the design
verification (DV) environment. PCIE validation takes advantage of its three functional layers, namely the physical,
link and transaction layers, using directed tests, targeted random and random regressions. Compliance tests are also
used to ensure PCI-SIG compliance of the design. Coverage metrics have been used to measure our test coverage.

Despite these efforts, test escapes were still found on post-silicon parts in the lab. This paper
introduces an assertion-based verification approach tailored to PCIE. Using assertions in combination with
simulation and formal verification enhances design quality and reduces debug effort.

This paper demonstrates the advantage of IFV in enhancing the stability and quality of a PCIE core when used in
conjunction with a traditional verification flow. To demonstrate the benefits, part of the receiver block is
presented as an example design with the associated verification methodology. The Incisive Formal Verifier has
been beneficial for static functional coverage of random logic, state machines and interfaces of the PCIE
design early in the design flow.
1 Introduction
PCI Express is a high-speed, general-purpose differential serial link. It has been adopted by the personal computer
industry and system integrators as the general system interconnect replacing the legacy PCI and AGP busses. PCIE
is a scalable, packet-based protocol. Applications that require less bandwidth can use one lane; other, higher-bandwidth
applications such as graphics processors can use up to 16 lanes. The specification allows a maximum width of
32 lanes. The PCIE protocol is specified in [1], [2], and [3].

This paper presents a validation enhancement of a PCIE design using static functional verification. Section 2 presents
the PCIE verification challenges, section 3 describes the basic methodology, and section 4 presents some examples.

Figure 1 PCIE Functional Layers (TL, DL, PHY Digital and PHY Analog layers on each side of the link)


In this paper, the term “the tool” refers to Cadence IFV, and “PCIE” and “PCI Express” are used interchangeably.
A basic understanding of PCIE is assumed; some PCIE features are mentioned without full explanation.
Readers interested in PCIE technology can visit www.pcisig.com for further details.

2 PCIE Verification Challenges


The transition to PCIE presents a new challenge for both designers and DV engineers. The first challenge is the
sheer size of the specification; many features were not clearly defined. As a result, the verification effort was massive
and consumed a lot of CPU and test-writing time. Due to the complexity of the protocol, the problem was
determining how much verification is adequate to cover the feature set. The following are some of the obstacles
encountered during the PCIE verification process.

• Protocol Complexity and Change of Specification: Converting the PCIE spec to DV/test requirements was a
great challenge. Requirements came mainly in the form of a PCIE check list consisting of 1400 items [4]. The
PCIE check list suffered from a lack of detail and missing links to the specification. Changing specifications
introduced uncertainty into the validation effort, requiring constant churn: the PCIE spec has moved from pre-1.0 to
1.0, 1.0a, 1.1 and now 2.0.
• Undefined Interfaces between Protocol Layers: As shown in Figure 1, the PCIE protocol stack has three
defined layers. However, the interfaces between these layers were not clearly defined. An early effort was made
to define the interface between the PHY layers and the DL/TL layers [5]. As a result, the definition of the PIPE
interface has simplified third-party PHY integration and shortened the simulation time for DL (Data Link Layer) and TL
(Transaction Layer) validation.

• Clock Boundaries: The PCIE design has introduced many new clock domains into the existing design such as
the 100 MHz reference clock, the 250 MHz transmitter clock as well as the data recovery receiver clock per lane. In
addition, during reset, only the reference clock exists.

• Features Requiring Long Simulation Times: Many PCIE features and state transition validations require
excessively long simulation runs. Certain timeout conditions in the link training state machine alone can consume a
lot of simulation cycles.

Figure 2 PCIE Clock domains

• Link Training State Machine: The LTSSM has a large number of states and arcs. The specification defines
a limited number of states, but an actual implementation of the state machine can have more than 50.
All the state transition arcs require validation.

• Packet Corruptions: There are numerous ways a TLP can be corrupted. The relative start location
of the packet with respect to lane number, the packet type and the packet size are a few examples of the
parameters that can be varied to create corrupted packets.

• Link Noise: Some logic in the PHY logical layer interfaces directly with the analog macro outputs. There
are many situations in which the output from the PHY analog macros is outside the defined symbol space. For
example, when the receiver is in a low-power state, or during the lapse before the idle detector detects the idle
condition when an IDLE ordered set is missing, the PHY output is a stream of undefined, random 10-bit symbols.
Losing symbol lock due to a noisy link is another source of random signals from the PHY; the symbol alignment
logic is a good example of logic exposed to this. Dealing with random noise is an extremely difficult validation
task and requires a lot of effort.

• The Limitation of the Test Bench: Most PCIE validation is done using a third-party VIP (Verification
IP). Some verification IP has its own limitations, preventing it from triggering corner cases in the design
under test.

Figure 3 Typical PCIE test bench topology (Verification IP or Bus Functional Model (BFM) driving the design under test)

3 Validation Methodology
Regardless of how PCI Express is implemented, at a low level of abstraction the protocol can easily be
understood as a state machine. Due to the complexity of PCIE, the state space that needs to be verified is very large.
We began with a traditional approach using functional verification and random regression, but too many corner
cases were not being covered. IFV was added to provide additional coverage.

3.1 Functional Testing


Functional testing is done to establish that the implementation realizes all functions of the PCI Express specification
and that the implementation recovers from various erroneous inputs that are inconsistent with the specification.
For PCIE, functional tests include:
• Reset Test: Verifies that the data paths are cleared, the state machines are properly initialized, and so on,
and that the start-up initialization sequence is followed.

• Packet Transmission: Verifies that data are transmitted properly across the links in the various link formats.
Corner cases need to be evaluated, e.g. back-to-back packets, the Start symbol in any legal lane for multi-lane
configurations, various packet sizes, etc.

• Error Recovery: PCI Express packet syntax rules define a number of error cases resulting from corruption of
the data across the links. The design needs to detect and report these error conditions appropriately.

• Power Management: A complex part of the design is negotiating through the power save states. A lot of test
resource is expended verifying this aspect of the design.

The problem with functional testing is completeness: “Has sufficient testing been done?” It is impossible to
exhaustively test all behaviors and parameters; there is always a possibility that some untried combination of
parameters and sequences would reveal a new, unacceptable behavior. A two-tiered approach is used here,
where exhaustive tests are carried out at the functional block level by the sub-block designers, but even this is
not exhaustive.

Due to the complexity outlined above, another type of test is used here to establish that the control structure of the
implementation conforms to the structure of the specification. Implementation and specification have the same
structure if, given the same sequence of inputs, they model equivalent sets of states and allow the same state
transitions.

3.2 Structural Testing


No data or parameters are considered in this type of testing. The emphasis is on the control structure of the PCI
Express protocol. The structural test can be achieved with the following requirements.

1. In any state, the layer under test (LUT) can accept and respond to all input symbols in the vocabulary of the
PCI Express protocol pertaining to that particular layer.
2. When a status message is received, the LUT responds with an output message that uniquely identifies its
current state, without changing state.
3. When a reset message is received, the LUT responds by making a transition to a known initial state,
independent of its current state.
4. When a set message is received in the initial system state, the LUT responds by transitioning to the state
specified in the message.
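As an illustration, the four requirements above can be modeled as a small Python stub. The state and message names here (INIT, TRAINING, ACTIVE, STATUS, RESET, SET) are invented for the sketch and are not taken from the design:

```python
# Illustrative model of the four structural-testing requirements.
# State and message names are made up for this sketch.
class LayerUnderTest:
    STATES = {"INIT", "TRAINING", "ACTIVE"}

    def __init__(self):
        self.state = "INIT"

    def message(self, msg, arg=None):
        if msg == "STATUS":        # req. 2: identify state, no transition
            return self.state
        if msg == "RESET":         # req. 3: known initial state, from anywhere
            self.state = "INIT"
            return self.state
        if msg == "SET":           # req. 4: jump directly to a requested state
            assert arg in self.STATES
            self.state = arg
            return self.state
        # req. 1: every input symbol must be accepted and answered;
        # anything unhandled is a modeling error, not silently ignored
        raise ValueError(f"unhandled input {msg!r}")

lut = LayerUnderTest()
lut.message("SET", "ACTIVE")       # e.g. skip straight past link training
assert lut.message("STATUS") == "ACTIVE"
lut.message("RESET")
assert lut.message("STATUS") == "INIT"
```

Requirement four is what lets a test cut link training time to nothing: one set message places the LUT directly in the state of interest.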

Items two and four above can easily be implemented as functional registers, accessible through register read
or write cycles. Aside from testing, item four can be used to expedite simulations as well. For instance, it can be
used to reduce the link training time to nothing at the beginning of a test.

The alternative to item four above is a sequence of transitions called a state signature or unique input/output (UIO)
sequence. A UIO sequence can determine whether the LUT is in a given state. Likewise, item three can be replaced
by a sequence of transitions, called the homing sequence, that brings the layer under test back to its initial state. The
best way to find UIO sequences is to derive all possible I/O sequences of a particular state and check them against
the UIO properties.
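That brute-force derivation can be sketched in a few lines of Python for a toy three-state Mealy machine (the machine itself is invented for illustration, not part of the PCIE design):

```python
from itertools import product

# Toy Mealy machine: trans[state][input] = (next_state, output).
trans = {
    "A": {"0": ("B", "x"), "1": ("A", "y")},
    "B": {"0": ("C", "x"), "1": ("A", "x")},
    "C": {"0": ("A", "y"), "1": ("C", "x")},
}

def io_response(state, inputs):
    """Output sequence produced when `inputs` is applied starting in `state`."""
    outs = []
    for i in inputs:
        state, o = trans[state][i]
        outs.append(o)
    return tuple(outs)

def find_uio(state, max_len=4):
    """Shortest input sequence whose output uniquely identifies `state`."""
    others = [s for s in trans if s != state]
    for n in range(1, max_len + 1):
        for seq in product("01", repeat=n):
            ref = io_response(state, seq)
            # UIO property: no other state produces the same outputs
            if all(io_response(s, seq) != ref for s in others):
                return seq
    return None  # no UIO of length <= max_len exists
```

Here `find_uio("A")` returns `("1",)`: only state A answers input 1 with output y, so observing that output identifies A.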

A DV environment that facilitates both functional and structural testing can be constructed as in the diagram
below. This allows dynamic UIO generation from the LUT emulator itself and greatly reduces the validation
effort involved.
Figure 4 PCI Express DV Environment (a Test Sequence and the IUT Spec drive the PCIE Tester, which exercises the IUT alongside an IUT Emulator)


The best way of validating a control structure is to find a sequence of state transitions that passes through every state
and every transition at least once. This is referred to as a transition tour. An ideal transition tour starts with a reset
message and exercises every transition exactly once, each time followed by a status message to verify the destination
state. Finding a perfect transition tour is also known as the Chinese Postman Problem [13] and is an
active research area for protocol design and validation. In short, the Chinese Postman Problem is to find a
minimum-length closed walk that traverses each edge at least once, where a walk is a path in which edges may be
repeated.
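When every state of the transition graph happens to have equal in- and out-degree, an exact tour (each arc exactly once) can be found with Hierholzer's algorithm. The sketch below uses an invented five-state fragment with LTSSM-flavored names, not the real state machine:

```python
from collections import defaultdict

def transition_tour(edges, start):
    """Hierholzer's algorithm: a closed walk using every edge exactly once.
    Assumes the graph is Eulerian (in-degree == out-degree at every state),
    as it is in this toy example."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
    stack, tour = [start], []
    while stack:
        v = stack[-1]
        if adj[v]:
            stack.append(adj[v].pop())  # follow an unused edge
        else:
            tour.append(stack.pop())    # dead end: emit state in reverse
    return tour[::-1]

# Invented fragment: Detect -> Polling -> Config -> L0 <-> Recovery, L0 -> Detect
edges = [("Detect", "Polling"), ("Polling", "Config"), ("Config", "L0"),
         ("L0", "Recovery"), ("Recovery", "L0"), ("L0", "Detect")]
tour = transition_tour(edges, "Detect")
```

The resulting tour starts and ends at Detect and covers each of the six arcs exactly once; a status message after each step would turn it into the ideal tour described above.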

3.3 Assertion Based Verification


Due to the challenges mentioned in the previous sections, assertion-based verification [6] becomes an enticing
validation technique for PCIE. PSL [7], [8] and SVA [9] are the most commonly used assertion languages, and
both are supported by many tool vendors. An earlier effort used OVL as the assertion library; PSL and SVA
have since broadened the platforms for assertions.

During the last two years, static functional verification tools have become more mature. Many ASIC companies
have adopted static functional verification and an assertion-based process as part of their design flow. Static
functional verification of PCIE with other tools has been reported in [10], [11], [12].

We now focus on a few case studies of validating the PCIE design using Cadence IFV and PSL.

3.4 Static Versus Dynamic Validation


Static functional verification is not capable of testing a design alone; we use it in conjunction with dynamic
simulation.

• Trade-off between capacity and completeness
• Static verification is hampered by capacity constraints that limit its application to small functional blocks of a
device
• Static methods yield a complete, comprehensive verification of the proven property, and are ideal for small,
complex units such as arbiters, bus controllers, etc.
• Dynamic simulation, on the other hand, suffers no capacity limitations
• Many functional requirements have state spaces beyond the ability to simulate in a lifetime
o N inputs and M flops require (2^N)^M stimulus vectors
o A 10-input, 100-flop device requires 2^1000 vectors, or roughly 1.07×10^301
• Dynamic methods cannot yield a complete verification solution because they do not perform a proof
3.5 Hybrid Method
A semi-formal method combines static and dynamic techniques in order to overcome the capacity limits imposed by
static methods while addressing the inherent completeness limitations of dynamic methods. This is particularly
useful for validating large state machines.

Example:
• Postulate a rare, cycle-distant LTSSM state to be explored by simulating forward from this state.
• The validity of this state can be proven by a suite of assertions.
• Once the state is fully specified, the device may be placed in the state using the simulator’s broadside-load
capability. Simulation may then start from this point, as if we had simulated to it from the beginning.

3.6 Coverage Metrics


The completeness of validation is measured by coverage metrics collected from the various validation techniques
mentioned above. Functional coverage is derived from the PCIE specification, which defines an explicit specification
coverage space. Code coverage metrics are defined by the RTL and hence extracted from the device implementation.
Assertion coverage is simply functional coverage implemented using coverage assertions.

In contrast to a simulation assertion, an assertion run through IFV is reported either as proven or with a failure
reason. Coverage from fully proven assertions can be reused; however, each new RTL release invalidates
previously proven assertions, so the proofs must be rerun.

Temporal assertions can be added or synthesized into a large state machine, for instance the LTSSM, for static or
dynamic evaluation. The IFV proof engine can report the states visited and arcs traversed while attempting to
prove the assertions.

The following coverage table indicates how the coverage metrics are partitioned at the different stages of validation,
moving towards the design closure target of 99%-plus coverage.

Validation Stage            Coverage
Directed Tests              75%
Targeted Random Tests       85%
Total Random                90%
Dynamic Assertion           95%
Static Assertion (IFV)      99%

4 PCIE Validations Example


Two examples are presented here: one from the receiver block and the other from the FIFO controller. These two
examples were chosen because they are difficult to validate using classic techniques. Using IFV and
PSL improves testability and the robustness of the design.
4.1 Receiver block

Part of the receiver is presented here as an example of using IFV for validation. The CRC and the packet parser
are statically verified together using IFV. Figure 5 shows the test bench used; many of the interface signals are
not shown. These two blocks were selected because classical validation techniques fail to fully debug them
for the following reasons:

• Top level test benches using validation IP (VIP) could fail to trigger some corner cases.
• The number of required scenarios to be tested is enormous.

Figure 6 shows correct behavior and Figure 7 shows incorrect behavior. The number of directed test cases
required to cover all the possible input conditions is enormous. A local test bench is also used to verify this cluster
of blocks.

The following assertions verify the interface using IFV and PSL.

• No more than one header for each packet.
• No CRC status appears without a header.
• No unknown CRC status: the same packet cannot have both a valid and an invalid CRC status.
• No data after a CRC status without a new header.
• No data before a header.
• For each header there should be a CRC status.
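Before being written as PSL, interface rules like these can be prototyped as a simple reference monitor. In this sketch the symbol names (HEADER, DATA, CRC_VALID, CRC_INVALID) are invented stand-ins for the actual interface signals:

```python
# Hypothetical reference monitor for the header/data/CRC interface rules.
class RxInterfaceMonitor:
    def __init__(self):
        self.in_packet = False   # header seen, CRC status still pending

    def step(self, symbol):
        if symbol == "HEADER":
            # No more than one header per packet
            assert not self.in_packet, "second header before CRC status"
            self.in_packet = True
        elif symbol == "DATA":
            # No data before a header, and none after CRC status
            # until a new header arrives
            assert self.in_packet, "data outside a packet"
        elif symbol in ("CRC_VALID", "CRC_INVALID"):
            # No CRC status without a header
            assert self.in_packet, "CRC status without header"
            self.in_packet = False

    def end_of_stream(self):
        # For each header there must eventually be a CRC status
        assert not self.in_packet, "header without CRC status"

mon = RxInterfaceMonitor()
for s in ["HEADER", "DATA", "DATA", "CRC_VALID", "HEADER", "CRC_INVALID"]:
    mon.step(s)              # the legal sequence of Figure 6 passes
mon.end_of_stream()
```

Feeding the monitor the illegal sequence of Figure 7 (a header with no CRC result) would trip the end-of-stream check.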

Figure 5 IFV test-bench for part of the receiver under test


Figure 6 Correct Condition, Header and Data have CRC results

Figure 7 Invalid Conditions, Header without CRC results.


Assertions were added to cover both correct and incorrect behavior of the design. This part of the design had been
thoroughly tested using directed test cases, local test benches, and random regressions. As a result, no new bugs
were found using IFV, but we gained confidence in the design. This case of multiple modules driving a single block
is hard to verify, since each of the driving blocks has its own state machine and multiple concurrent state machines
could fall out of sync.

4.2 FIFO controller

The FIFO controller is another candidate where the advantage of IFV can be demonstrated. Many FIFOs are used
in our PCIE design, mainly for clock domain crossing, latency hiding, etc. Some are standard FIFOs, while others
are more complicated; for example, one used in the receiver has a roll-back function. In a roll-back FIFO, the write
pointer is rolled back when certain conditions arise, such as a bad CRC. For packets with a valid CRC, the FIFO’s
write pointer advances; in the case of a bad CRC, the write pointer rolls back.

The following are some of the checks added using PSL and verified using IFV.

• FIFO full and empty conditions.
• Overflow and underflow mechanisms.
• Overflow should not overwrite any other location of the FIFO.
• Check the roll-back mechanism.
• Check the FIFO configuration settings.
• Check all the gray-code pointers at the clock crossing boundary.
• No missing cycles between the write and read interfaces: for each single write there should be exactly one
newly occupied location.
• When overflow and rollback happen in the same cycle, the FIFO controller should prevent the write.
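A reference model makes the roll-back behavior behind these checks concrete. The Python sketch below (names and interface invented, not the RTL's) keeps a committed pointer alongside the speculative write pointer; pointers are free-running counters and memory is addressed modulo the depth:

```python
# Illustrative roll-back FIFO reference model (not the actual RTL interface).
class RollbackFifo:
    def __init__(self, depth):
        self.depth = depth
        self.mem = [None] * depth
        self.rd = 0          # read pointer
        self.wr = 0          # speculative write pointer
        self.committed = 0   # write pointer as of the last good-CRC packet

    def full(self):
        return self.wr - self.rd == self.depth

    def empty(self):
        # The reader only sees committed data
        return self.committed == self.rd

    def write(self, data):
        # Overflow must not overwrite occupied locations
        assert not self.full(), "overflow"
        self.mem[self.wr % self.depth] = data
        self.wr += 1

    def commit(self):        # good CRC: expose the packet to the reader
        self.committed = self.wr

    def rollback(self):      # bad CRC: discard the speculative writes
        self.wr = self.committed

    def read(self):
        assert not self.empty(), "underflow"
        data = self.mem[self.rd % self.depth]
        self.rd += 1
        return data
```

A good-CRC packet is written and committed; a bad-CRC packet is written and then rolled back, leaving the read side untouched.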

At the time the assertions were added, simulation revealed a new bug. IFV found this bug and other
corner-case bugs as well. The new bug needed a specific sequence of events to show up; such bugs are difficult to
uncover with directed tests and may be exposed only by random regression. Static verification greatly expands the
explored state space and exposes such bugs with ease.
Another functional feature suited to static verification is the gray coding used for clock domain crossing. In our
roll-back FIFO controller, maintaining gray coding is difficult, and it is best verified using the static functional
method.
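The gray-code property that those pointer assertions check is simply that successive pointer values differ in exactly one bit, including at the wrap-around. A minimal sketch of the property:

```python
def bin_to_gray(b):
    """Standard binary-to-reflected-gray conversion."""
    return b ^ (b >> 1)

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

width = 4
codes = [bin_to_gray(i) for i in range(2 ** width)]
# Every pointer increment, including the wrap-around back to zero,
# changes exactly one bit of the gray-coded value -- which is what
# makes the pointer safe to sample in another clock domain.
for prev, cur in zip(codes, codes[1:] + codes[:1]):
    assert hamming(prev, cur) == 1
```

A roll-back complicates this because the pointer can jump backwards by more than one position, which is exactly why the property is worth proving statically rather than sampling in simulation.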

5 Evaluating IFV for PCIE Validation


IFV was initially evaluated using PCIE IP. The tool helped identify corner cases in the design. The adoption of static
functional validation has significantly improved design robustness. IFV was used to verify both old and new
modules.

1) IFV verification of old design modules revealed:
a) No new major bugs or issues.
b) Many new corner cases that have a low probability of showing up with traditional methods.
c) The tool resolved some issues and suggested better design solutions. By running the tool to hunt for a bug, it
suggested multiple scenarios that could cause bugs to manifest themselves; the tool not only uncovered the bug
but also influenced a better design.
2) Verification of new code:
a) A few fundamental bugs were discovered at an earlier stage.
b) Designers became more sensitive to coding style and practices.
c) Some of the interfaces and designs were re-architected to be assertion-friendly.

The assertions were planned by senior staff and then implemented and validated by junior engineers. A summary of
the results after using the tool for almost six months is as follows:

• Not all the assertions in the design were validated by IFV.
• Around 40% of the new assertions added to fresh designs were tested by IFV.
• More than 50% of the debug time was spent filtering out unrealistic cases. This is not an issue, as it forces
designers to write better assertions.

6 IFV Usage Model Recommendations and Suggestions

We offer a few recommendations and suggestions based on our experience using the tool on PCIE core validation.

6.1 Project Schedule

We found that 10-20% extra time was added to the schedule, with most designers spending around half a day per
week using the tool. This extra time and effort is needed partly because designers are still more familiar with
traditional debugging techniques. The benefit is accrued later through the improved quality of the design.

6.2 Assertions Test Plan

An assertion test plan is mandatory and should be added to the design flow. Early planning of assertions is good
practice, and this preplanning helps in estimating the extra time required for the project.

6.3 Selecting the Assertions to be Used by IFV

Not all assertions are tested by IFV, as some are tested dynamically in simulation. Assertions for IFV should be
carefully selected; avoid protocol-level assertions, since some of them involve many blocks for validation.
Examples of protocol assertions that should not be tested by IFV are:
• “For each received packet, an ACK/NACK packet should be transmitted.”
• “For each received packet, there are eventually FC updates.”
• Any assertion involving multiple IPs, models, or the full chip; e.g. “for each packet transmitted by the IP, a
Flow Control update is required.”

These assertions are best used with dynamic simulations.

6.4 Candidates for IFV

IFV is best at verifying the operation of control blocks, so not all PCIE blocks are good candidates. The following
are some suggested modules for validation with IFV:
• LTSSM: the link training state machine is one of the best candidates for IFV. The most straightforward
assertions are dead-state detection and one-hot checking; many protocol-checking assertions can also be added.
• Modules with data streams of an almost random nature, such as the PHY analog interface.
• Clusters of blocks with multiple state machines controlling another module, such as the example presented in
Figure 8.
• FIFO controllers.
• Receiver framers and packet-parsing modules.
• Avoid blocks dealing mainly with the data path, such as CRC blocks: it is a good idea to check the control logic,
but avoid checking the generated CRC values. The rule of thumb is no data-path checks for IFV.

6.5 IFV for designers

The best usage model for IFV is for the designers themselves to run it. Involving the DV team in validating the
internal signals of modules is not good practice; the DV team can best help at the protocol or interface level.

Figure 8 Multiple State Machines controlling other module

7 Summary
PCIE validation is a challenging task for many. The protocol and its underlying design are complex, and the
validation effort is extremely high. IFV helps ensure that the various PCIE building blocks are stable and well
validated at every stage of the design cycle.

Normally, random regressions try to discover corner cases in the design. However, random regressions are usually
run at a later stage of the design process, often too late to take full advantage of. Using IFV, we do see some
key modules achieve improved random regression scores compared to other projects without any IFV
runs.

Lastly, the hybrid approach has proven advantageous for validating large state machine designs, in which a
distant state can be formally proven via IFV, and a targeted test can then validate a particular feature or corner case
starting from that state.
Author biographies

Salem Emara, (M.Sc.), ATI Technologies Inc., Toronto, Canada.

Staff Engineer. Worked on ATI’s first-generation PCIE core, which successfully delivered the first discrete PCIE
graphics ASIC. During his career at ATI he has worked on many successful ASIC projects. Before his
involvement with the Bus IP group he was engaged with the display group to implement the Video Scaler and
VGA.

Lawrence Sasaki, (Ph.D.), ATI Technologies Inc., Toronto, Canada.

Staff Engineer. Working on ATI’s next-generation PCIE core, which is being used in many product lines. Prior to
joining ATI, Lawrence was with Nortel for 20 years, working on a variety of projects: voice circuits, high-speed
SERDES, logic design, architecture, and SoC.

Wayne Wu, (M.A.Sc.), ATI Technologies Inc., Toronto, Canada.

Design Manager. Worked on ATI’s PCIE core, which was the first to demo a cutting-edge PCIE solution at the fall
2003 IDF. Prior to joining ATI, Wayne was with Hewlett-Packard for many years, working on Fibre Channel as
well as X-Terminal-related chipset projects.

References:
[1] PCI Express Base Specification, Rev 1.1, March 28, 2005. PCI-SIG.
[2] Ravi Budruk, Don Anderson, Tom Shanley, “PCI Express System Architecture”, Addison-Wesley, 2004.
[3] Adam Wilen, Justin P. Schade, Ron Thornburg, “Introduction to PCI Express: A Hardware and Software
Developer's Guide”, Intel Press, 2002.
[4] Endpoint Compliance Checklist for the PCI Express Base 1.0a Specification, Rev 1.0a, Sep 2004.
[5] PHY Interface for the PCI Express Architecture, Ver 1.0, June 2003.
[6] Harry D. Foster, Adam C. Krolnik, David J. Lacey, “Assertion-Based Design”, Kluwer Academic
Publishers, 2003.
[7] Property Specification Language Reference Manual 1.01, April 2003.
http://www.accellera.org/pslv101.pdf.
[8] Ben Cohen, Srinivasan Venkataramanan, Ajeetha Kumari, “Using PSL/Sugar for Formal and Dynamic
Verification, 2nd Edition”, VhdlCohen Publishing, 2004.
[9] Ben Cohen, Srinivasan Venkataramanan, Ajeetha Kumari, “SystemVerilog Assertions Handbook”,
VhdlCohen Publishing, 2005.
[10] L. Loh, H. Wong-Toi, C.N. Ip, H. Foster, D. Perry, “Overcoming the Verification Hurdle for PCI Express”,
DesignCon 2004.
[11] DeJian Li, Jenny Yongmei Zhang, “Tips on Verifying a PCI-Express Design with Hybrid Formal
Verification – Magellan”, SNUG05.
[12] Ravindra Viswanath, Srikanth Vijayaraghavan, “Using OpenVera Assertions to Verify the PCI Express
Protocol”, SNUG05.
[13] Mei-Ko Kwan, “Graphic Programming Using Odd or Even Points”, Chinese Math., 1:273-277, 1962.

Acknowledgment:
Special thanks to Axel Scherer and Claude Beauregard from Cadence for their help during the work on this paper.
