QL5032 User's Guide

(Preliminary Draft)
March 9, 1999

TABLE OF CONTENTS

Setting up a QL5032 Project
    Step-by-step Project Setup
        Step 1: Create a QL5032 Project Folder
        Step 2: Copy Template Files to the Project Folder
        Step 3: Rename the Top-Level Design Block
        Step 4: Select Mandatory Device Defaults
        Step 5: Open the top-level design block and start designing!

Creating FIFOs in the QL5032
    Synchronous or Asynchronous
    Building a Synchronous FIFO
    Building an Asynchronous FIFO
        COMPARATORS
        GREY COUNTERS
        REGISTERS
    Adding a FIFO to your Schematic Design
        Creating a Schematic Symbol for the FIFO
        Running the Symbol Creation Tool

QL5032 Design Flow
    Pre-Layout Simulation
        Step 1: Create an HDL Netlist or an HDL Source File
        Step 2: Create a Test Fixture (or Test Bench)
        Step 3: Set up the Simulation Project File
        Step 4: Run the Simulation and Check Results
    Compilation
        Step 1. Synthesize the design
        Step 2. Set up your Design Constraints
        Step 3. Logic Optimization, Placement, and Routing
        Step 4. Checking Design Timing
    Post-Layout (Timing) Simulation
    Programming a QL5032 Device

Appendix A: QL5032 PCI Interface Functional Description
    Internal Port Descriptions
        Master
        Target
        PCI
    Waveforms
        Single-Dword Configuration Read
        Burst Configuration Read
        Single-Dword Configuration Write
        Burst Configuration Write
        Target Read
        Target Write
        Master DMA Read
        Master DMA Write

Appendix B: DMA Controller Reference Design
    Summary of DMA Features
    DMA Operation Overview
    Description of Inputs and Outputs
        Inputs
        Outputs
    Description of Registers
    Detailed Description of the DMACTRL.V file (Verilog)

Appendix C: Target Configuration Space and Address Register
    PCI Configuration Space
    Using I/O Address Space
    Adding Base-Address Registers
    Address Register/Counter
    Command Decode Logic

Appendix D: Technical Support and FAQ
    Online Resources
    Technical Support via email
    Telephone Support
    FAQ for the QL5032

Glossary


Setting up a QL5032 Project


When designing for a QL5032 device, it is important to set up the top-level block of your design project properly, to avoid problems in the simulation, synthesis, or compilation phases of the design. This chapter explains the process you would use to begin any QL5032 design.

QuickLogic recommends a design flow built around a top-level schematic. This flow integrates more seamlessly with QuickLogic's toolset, and it provides a graphical view of how the top-level design blocks are interconnected. The flow is not proprietary, because the top-level schematic can be netlisted as Verilog or VHDL. The top-level schematic is simply a graphical representation of the QL5032 PCI Interface core and the other design components. If you want to design in VHDL or Verilog, you may simply add a block or blocks to this top-level schematic to represent your Verilog or VHDL file(s). Of course, you may also use a Verilog or VHDL file for your top-level design, and QuickLogic provides templates for that purpose as well.

Step-by-step Project Setup


Regardless of whether you intend to design with schematics, VHDL, Verilog, or a mix of schematics and HDLs, you should set up your QL5032 project according to the following process.

Step 1: Create a QL5032 Project Folder


This step consists simply of creating a unique folder on the hard drive for the QL5032 project. You can use Windows Explorer (Start > Programs > Windows Explorer), or the My Computer icon on your Windows Desktop, to create the project folder on your computer's hard drive. One suggestion is to make a master folder for all projects in the root folder of your hard drive (such as C:\DESIGNS), and then make a specific project folder for each project or project revision you are working on (such as C:\DESIGNS\MYPROJ1).

Step 2: Copy Template Files to the Project Folder


For a QL5032 design, the files you should copy into your project folder are located in the installation directory for the QuickWorks software. You should copy all the files from the C:\PASIC\DESIGN\TEMPLATE\QL5032\VERILOG (or VHDL) folder into your recently created project folder. Choose the subfolder (VERILOG or VHDL) which corresponds to the HDL language you are most familiar with. If you prefer to design with schematics, choose the VERILOG folder. These files will give you a jump start on your QL5032 design, and minimize your design cycle.

Step 3: Rename the Top-Level Design Block


In order to reduce the number of QL5032 top-level designs called TOP, we recommend that you rename the TOP.SCH, TOP.V, or TOP.VHD file which you copied from the template folder to a name specific to your project. A good suggestion is to use the same name that you used for your project directory. You can rename all three of these files, or just the file pertinent to your design flow. For example, if you intend to use a pure VHDL design flow (no top-level schematic), then you could just rename the TOP.VHD file to your project name, such as MYPROJ1.VHD.

Step 4: Select Mandatory Device Defaults


When designing a QL5032 device, there are a few device parameters that must be selected before beginning the design. These correspond to the unique identifying information within the PCI Configuration Space. For help concerning these Configuration Space parameters, you will need to consult the PCI Specification, version 2.2. This can be obtained for a fee from the PCI Special Interest Group, or your company may already have copies available. The Web site for the PCI Special Interest Group is www.pcisig.com. QuickLogic cannot provide copies of the PCI Specification. The parameters you will need to change can be found in the Target Configuration and Addressing Reference block (CFGSPACE). This can be opened with the Turbo Writer editor, or any text editor.

Schematic or Verilog designers will be editing a Verilog file (CFGSPACE.V), while VHDL users will be editing a VHDL file (CFGSPACE.VHD). The appropriate section of the Verilog (.V) file is shown here.

// *********** beginning of user-modifiable parameters ************
// PCI registers                           offset into config space
wire [15:0] DeviceID     = 16'h0001;     // 00h
wire [15:0] VendorID     = 16'h11E3;     // 00h
wire [23:0] ClassCode    = 24'h020000;   // 08h
wire [7:0]  RevisionID   = 8'h01;        // 08h
wire [15:0] SubsysID     = 16'h0001;     // 2Ch
wire [15:0] SubsysVendID = 16'h11E3;     // 2Ch
wire [7:0]  MaxLat       = 8'h05;        // 3Ch
wire [7:0]  MinGnt       = 8'h02;        // 3Ch
wire [7:0]  IntPin       = 8'h01;        // 3Ch
parameter BAR0_size = 24;  // Sets the size of the requested memory space.
                           // Default value is 24, corresponding to 16MB.
                           // (# of bits to tie off in the BAR)
// *********** end of user-modifiable parameters ************

You must edit the values in the Vendor ID, Subsystem Vendor ID (SubsysVendID), and Class Code fields. You can obtain a Vendor ID from the PCI SIG. Anyone creating a PCI Interface Card must obtain this ID from the PCI SIG, so that all Vendor IDs are unique.

IMPORTANT: You must not use a Vendor ID (or Subsystem Vendor ID) of 16'h0000 or 16'h11E3 (QuickLogic's PCI SIG assigned Vendor ID). Doing so would violate the PCI Specification.

The Class Code field simply identifies the type of PCI agent you are creating. For a complete description, see the PCI Specification, revision 2.2. For information on setting default values for other fields in the PCI configuration space, refer to the PCI Specification.
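The BAR0_size parameter in the template can also be changed. As a minimal sketch, assuming (as the template comment states) that BAR0_size is the number of low-order BAR bits tied off, the requested memory window is 2 to the power BAR0_size bytes. The module name and the value 20 below are hypothetical and are used only to show the arithmetic in simulation.

```verilog
// Sketch only: relates BAR0_size to the size of the requested memory window.
// Assumption: BAR0_size = number of BAR bits tied off, so the window is
// 2**BAR0_size bytes (the template default of 24 gives 16 MB).
module bar0_size_example;
  parameter BAR0_size = 20;   // hypothetical choice: 2**20 bytes = 1 MB

  initial
    $display("BAR0 requests %0d bytes", 1 << BAR0_size);
endmodule
```

In CFGSPACE.V itself only the parameter value would change; the $display is here just to check the size before committing to it.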

Step 5: Open the top-level design block and start designing!


If you prefer to see your top-level design block graphically, open the top-level block in the schematic editor. First run the SpDE program from the Start menu: Start > Programs > QuickLogic > SpDE. Then, from SpDE, click Design > Edit Schematic. You can then browse to your project folder and open the top-level schematic (which you named in Step 3).

Pure HDL designers, who would rather work with the language than with a top-level schematic, should open the top-level .V (Verilog) or .VHD (VHDL) file (which you named in Step 3) in the Turbo Writer editor. Open the Turbo Writer editor from the Start menu: Start > Programs > QuickLogic > Turbo Writer. Or open Turbo Writer from SpDE with Design > Text Editor. From the Turbo Writer menu, click on File > Open. Then choose VHDL or VERILOG from the List Files of Type drop-down box in the lower left of the Open File window. Then you can browse to your project folder and select the top-level file name.

Important note to Verilog and VHDL Designers: When you are designing your top-level block in Verilog or VHDL, you will be using a .SC file to specify the pinout of the device. The QuickWorks User's Guide provides all the information about the use of the .SC file. However, it is very important to add one line at the top of the .SC file that tells the Synplify-Lite synthesis tool not to buffer the internal clock and reset signals. This line is typed:
option quicklogic ql_nobuffer = 1;

For help with the design process or design flow, refer to the QuickWorks User's Guide. This can be found on the QuickWorks CD in the BOOKS directory (open BOOKS.PDF), or you may have received a hard copy of this manual with your QuickWorks software.

Creating FIFOs in the QL5032


There are several reasons you might want to create FIFOs in the QL5032 device.
1. To buffer data off the PCI bus, so burst transfers can take place on the bus and maximum PCI performance can be achieved.
2. To provide a synchronization buffer between the PCI clock and a different local clock which is asynchronous to the PCI clock.
3. To provide scratch-pad storage for something like a list of DMA transfers, so that DMA transactions can be automatically executed by the chip in a linked or chained fashion.

Synchronous or Asynchronous
When building a FIFO in the QL5032 device, the first decision you must make is whether you want a synchronous or asynchronous FIFO. First, some definitions:

Synchronous FIFO: A FIFO with a synchronous read port and a synchronous write port which both use the SAME clock.

Asynchronous FIFO: A FIFO with a synchronous read port and a synchronous write port which use DIFFERENT (asynchronous) clocks.

As you see, both types of FIFOs use synchronous interfaces on the read and write ports. The difference lies in whether the read clock and write clock are the same or not.

Building a Synchronous FIFO


If you decide that you need a synchronous FIFO, the building process is simple. There is a synchronous FIFO Wizard within the QuickWorks toolkit. To open the FIFO Wizard, first run SpDE (Start > Programs > QuickLogic > SpDE). Then, from SpDE's menus, select Tools > RAM/ROM/FIFO Wizard. The window shown in Figure 1 will open.

Select a Module type of FIFO. The Part type list box is not used. Then click NEXT. You will see the window of Figure 2.

Here you need to select a Depth (from a list of choices) and a Width (any width). Be aware that each RAM block is a little over 1K bits (1152), so the larger the FIFO you choose, the more RAM blocks you will use in the QL5032. There are 14 RAM blocks available, and you may build one or more FIFOs.

The Model selection of Speed Optimized (LFSR Status Counter) is the default, and it is recommended for FIFOs which need to operate above 75 MHz on either the read or the write port. The trade-off is that the LFSR counter makes generating new FIFO flags more difficult, since you need to decode the value from the LFSR counter instead of from a normal binary counter. Contact QuickLogic Customer Engineering if you need assistance with creating different flags (other than empty or full) for a Speed Optimized FIFO. When you finish making selections, click NEXT again to bring up Figure 3.
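To get a feel for how quickly the 14 RAM blocks are consumed, the raw bit count gives a rough lower bound. The sketch below is an estimate only. It assumes nothing about how the wizard maps a given width and depth onto each 1152-bit block, so the real block count may be higher. The module name and the 64 x 36 example size are illustrative.

```verilog
// Rough lower bound on RAM blocks for a FIFO, from the raw bit count only.
// Assumption: the wizard may use more blocks than this, depending on how it
// maps the chosen width and depth onto each 1152-bit RAM block.
module ram_block_estimate;
  parameter DEPTH      = 64;    // illustrative FIFO depth
  parameter WIDTH      = 36;    // illustrative FIFO width
  parameter BLOCK_BITS = 1152;  // bits per RAM block (from the text above)
  parameter AVAILABLE  = 14;    // RAM blocks in the QL5032 (from the text above)

  integer blocks;

  initial begin
    // Ceiling division: total bits divided by bits per block, rounded up
    blocks = (DEPTH * WIDTH + BLOCK_BITS - 1) / BLOCK_BITS;
    $display("%0d x %0d FIFO: at least %0d of %0d RAM blocks",
             DEPTH, WIDTH, blocks, AVAILABLE);
  end
endmodule
```

For the 64 x 36 case this works out to 2304 bits, or at least two blocks.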

In this window, you select the netlist format for the FIFO. Choose the language you are most comfortable with, in case you later want to edit the FIFO to insert different status flags or change to a non-standard depth. If you are familiar with neither, we suggest Verilog, since it is a more compact format.

After selecting the output file format (VHDL, Verilog, or both), click FINISH to bring up the window in Figure 4.

This lists all the files which will be written to your project folder. Select your project folder with the Save As button on this window. The most important filename in this list is the one with the Description of FIFO, since this is the top-level FIFO block which you will instantiate into your Verilog, VHDL, or Schematic file.

Building an Asynchronous FIFO


If you have chosen to build an asynchronous FIFO, you will need to download an Asynchronous FIFO design example from the QuickLogic Web Site, since the FIFO Wizard does not currently create asynchronous FIFOs. Point your Internet Browser to http://www.quicklogic.com/support/QL5032. If you have trouble locating the online asynchronous FIFO examples, contact the QuickLogic Customer Engineering Hotline at 408-990-4100. A description of the QuickLogic Asynchronous FIFO Reference Design is given below.

Asynchronous FIFOs are widely used in the computer networking industry to receive data at one frequency and transmit it at another. An asynchronous FIFO has two different clocks: one for read and one for write. Issues arise when passing data across asynchronous clock boundaries. For example, when the write clock is faster than the read clock, data could be over-written and hence lost. In order to overcome these problems, control signals such as the almost-empty, almost-full, empty, and full flags are required.

When the FIFO is full (or not empty), the POP signal indicates that the data will be read out of the RAM location indicated by the read address register, and the read address register will be incremented. When the read address register reaches the write address register, the FIFO is empty and the empty flag is active. The empty flag stays active as long as this condition is true.

When the FIFO is empty (or not full), the PUSH signal indicates that the data will be written to the RAM location indicated by the write address register, and the write address register will be incremented. When the write address register reaches the read address register, the FIFO is full and the full flag is active. The full flag stays active as long as this condition is true.

The almost-empty flag is activated two locations before the FIFO is actually empty. Similarly, the almost-full flag is activated three locations before the FIFO is actually full.
The user can initialize the read side and the write side counters to any desired value, e.g. if the counters are initialized to 7, the almost-full and almost-empty flags are activated 7 locations before the FIFO is full and empty respectively. The read side and the write side state machines receive the outputs of the comparators to determine the status of the flags. These state machines have two D flip-flops each in order to account for metastability.

Keeping in mind the need for asynchronous FIFOs, QuickLogic created a 32x32 asynchronous FIFO reference design. The current design is a 32 wide by 32 deep FIFO implemented using RAM blocks and logic cells. It is essentially a schematic (F32a32.sch), with Verilog code for the Grey code counters and registers. In order to configure the FIFO with a different width or depth, some modifications need to be made. The figure below shows the file names corresponding to the relevant building blocks in the schematic. All the blocks shown in the figure can be scaled to meet the user's width and depth requirements. Please note that this figure does not show the read side and write side state machines.

RAM Block

In the enclosed schematic, a 64X32 RAM block symbol is used. This block was created from Verilog code generated by the RAM/ROM/FIFO Wizard in SpDE. The user can specify the required width and depth of the RAM block in the wizard, which generates the Verilog/VHDL code. Using the New Block Symbol command in the Schematic Tools, you can create a schematic symbol for the Verilog code.
[Figure: top-level schematic of the 32x32 asynchronous FIFO. Building blocks shown: the R64X32 RAM block (wa[5:0], ra[5:0], din[31:0], dout[31:0], we, wclk, re, rclk); the 5-bit Grey counters GREY_COUNTER5-3, GREY_COUNTER5-2, and GREY_COUNTER5-0; the 5-bit registers REGISTER5-2 and REGISTER5-1; and the 5-bit equality comparators ECOMP5 comparing the Waddr and Raddr values. Inputs: push, pop, flush, topoff, wclk, wrst, rclk, rrst. Flag outputs: full, empty, almostfull, almostempty.]

COMPARATORS
The 5-bit comparators shown in the top-level schematic are purely schematic designs generated in the schematic editor. Note that the number of bits to be compared changes with the depth of the RAM block used. For example, if the user-defined FIFO depth is 256, 8 address bits are required and hence an 8-bit comparator must be used.

GREY COUNTERS
The Grey code counters shown in the top-level schematic are Verilog modules written for a 5-bit Grey counter. This code would have to be modified to 6, 7, 8, and 9 bits for FIFO depths of 64, 128, 256, and 512 respectively. Writing a Verilog module for a Grey counter can be cumbersome, so we have provided an example that can generate the Grey code for such wide counters.
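For orientation, a width-parameterized Grey counter can be sketched as a binary counter whose next value is converted to Grey code before it is registered. This is not the QuickLogic reference module. The module name, the clk port, and the WIDTH and START parameters are assumptions made for illustration, while the CLR, EN, and Q port names follow the schematic.

```verilog
// Minimal sketch of a parameterized Grey-code counter (not the reference code).
// WIDTH sets the address width; START is the binary count loaded on CLR,
// so an offset counter (for example address+2) can reuse the same module.
module gray_counter (clk, clr, en, q);
  parameter WIDTH = 5;   // 5 bits for a 32-deep FIFO, 8 bits for 256-deep
  parameter START = 0;   // hypothetical parameter: binary value after CLR

  input              clk, clr, en;
  output [WIDTH-1:0] q;

  reg  [WIDTH-1:0] bin;                      // internal binary count
  reg  [WIDTH-1:0] q;                        // registered Grey-code output
  wire [WIDTH-1:0] bin_next = bin + 1'b1;    // next binary count

  always @(posedge clk or posedge clr)
    if (clr) begin
      bin <= START;
      q   <= START ^ (START >> 1);           // Grey code of the start value
    end
    else if (en) begin
      bin <= bin_next;
      q   <= bin_next ^ (bin_next >> 1);     // binary-to-Grey conversion
    end
endmodule
```

Because consecutive Grey codes differ in exactly one bit, a comparator sampling q from the other clock domain sees at most one bit in transition at a time.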

REGISTERS
The registers are Verilog modules, which can be easily modified to fit the user's depth requirements. There are two registers for the write address and one for the read address. These are used to generate adjacent count values for the determination of the FIFO status, i.e. Almost-Full, Almost-Empty, Full, and Empty.

This Asynchronous FIFO reference design is provided for the purpose of reducing design cycle time when using the QL5032 device.
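The read-side and write-side state machines described earlier in this section each use two D flip-flops to account for metastability. The usual form of that arrangement is a two-stage synchronizer, sketched below. It is a generic illustration under assumed names (sync2, async_in, sync_out), not the state machine logic from the reference design.

```verilog
// Generic two-flip-flop synchronizer (illustrative, not the reference design).
// A signal from another clock domain is registered twice, so the first stage
// has a full clock period to settle before the second stage samples it.
module sync2 (clk, rst, async_in, sync_out);
  input  clk, rst, async_in;
  output sync_out;

  reg meta;       // first stage: may go metastable
  reg sync_out;   // second stage: safe to use in the clk domain

  always @(posedge clk or posedge rst)
    if (rst) begin
      meta     <= 1'b0;
      sync_out <= 1'b0;
    end
    else begin
      meta     <= async_in;
      sync_out <= meta;
    end
endmodule
```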

Adding a FIFO to your Schematic Design


If you are designing in all schematics, or with a top-level schematic, then you may wish to include the FIFO in a schematic page, instead of instantiating the FIFO into an HDL file. In order to do this, you first need to create a symbol for the FIFO.

Creating a Schematic Symbol for the FIFO


There is a tool within the Schematic Editor to help you create the symbol you need for your FIFO. First, however, you will need to know the FIFO name and all the FIFO port names. To find them, open the VHDL or Verilog file that was created in the last step of the FIFO Wizard. You will want to open the top-level FIFO, which will have a name like F64S36.V. Skip down to the line that starts with the word module and you will see something like the text below, which is taken from a Verilog file created by the FIFO Wizard.

module f64s36 (pop, push, din, dout, emptyn, fulln, clk, rst);
// inputs: =din[35:0]=,pop,push,clk,rst
// outputs: =dout[35:0]=,emptyn,fulln
input pop, push, clk, rst;
input [35:0] din;
output emptyn, fulln;
output [35:0] dout;

The key lines here are the lines that start with //inputs and //outputs. You will copy text from these lines to create the symbol.
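If you instantiate the FIFO in an HDL file rather than on a schematic page, the same port list applies. The wrapper below is a sketch using named port connections. The module name fifo_wrapper and the wrapper's signal names are hypothetical, and the file must be compiled together with the wizard-generated F64S36.V, which supplies the f64s36 module.

```verilog
// Hypothetical wrapper showing a named-port instantiation of the wizard FIFO.
// Compile together with the wizard-generated F64S36.V (module f64s36).
module fifo_wrapper (clk, rst, wr_en, rd_en, wr_data, rd_data, emptyn, fulln);
  input         clk, rst;
  input         wr_en, rd_en;     // illustrative names for push and pop
  input  [35:0] wr_data;
  output [35:0] rd_data;
  output        emptyn, fulln;    // status flags passed straight through

  f64s36 u_fifo (
    .clk    (clk),
    .rst    (rst),
    .push   (wr_en),
    .pop    (rd_en),
    .din    (wr_data),
    .dout   (rd_data),
    .emptyn (emptyn),
    .fulln  (fulln)
  );
endmodule
```

Named connections are used so the instance stays correct even if the port order in the generated file differs from what you expect.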

Running the Symbol Creation Tool


First, you need to open the Schematic Editor. This is done from the SpDE menus by selecting Design > Edit Schematic. Then select the schematic to which you want to add the FIFO (such as the top-level project schematic). From the Schematic Editor's menus, select Add > New Block Symbol. This will open the window shown in Figure 5.

As you can see, in this example we have already copied the appropriate information from the FIFO Verilog file into the fields in the New Block Symbol window. The module name is entered for the Block Name, and the text after //inputs and //outputs is copied into the Input Pins and Output Pins fields respectively. You can use the Windows Copy and Paste commands to minimize typing errors.

Do not use the Use Data From This Block button. This copies all the inputs and outputs of the current schematic into the input and output fields, which is not what you want to do in this case. If you make a mistake, you can always click CANCEL and re-enter the Symbol Creation Tool with the Add > New Block Symbol menu command.

When you have entered all data correctly, simply click the RUN button to create the symbol. The new symbol will then be attached to your cursor, and you can click to drop it into an appropriate area of the schematic. The resulting symbol is shown below.

[Figure: block symbol F64S36 with inputs din[35:0], pop, push, clk, rst and outputs dout[35:0], emptyn, fulln.]

If you want to add almost empty and almost full flags to the FIFO, you must first change the Verilog or VHDL file manually, then re-create or edit the symbol with the Symbol Editor.

QL5032 Design Flow


This chapter walks you through the design flow for a QL5032 design. The steps that are discussed include:
1. Pre-Layout Simulation
2. Compilation
3. Timing Verification
4. Post-Layout Simulation
5. Programming a QL5032 Device

Pre-Layout Simulation
When you are designing with a QL5032 device, you may choose to simulate a small portion of your design, or the design as a whole. When you are simulating a small portion of the device, you may not want to include the PCI32 core within the simulation, in order to simplify the simulation vectors you need to create. Later, once you have created more design blocks, you may want to simulate the design as a whole, including the PCI32 core. This section describes how you would perform these types of functional (pre-layout) simulation.

Step 1: Create an HDL Netlist or an HDL Source File


QuickWorks contains two simulators: the Silos simulator for Verilog simulation, and the Veribest VHDL simulator for VHDL simulation. In order to use one of these simulators, you need to create a Verilog or VHDL netlist (as appropriate).

If the function you wish to simulate is in an HDL file that does not include the PCI32 core, then you may skip this step completely and move on to Step 2.

If you are simulating an HDL file, but you wish to include the PCI32 core in your simulation, then you need to perform these steps before going to Step 2:
1. Copy pci32.v from \pasic\spde\data to your project folder.
2. Rename it to your project name with a .CV file extension.
3. Add a line at the very beginning of your top-level HDL file (or wherever the PCI32 module is instantiated) which reads: `include "project_name.cv" (where project_name is your top-level design name).
Then go to Step 2.

If the function you wish to simulate exists at the top level as a schematic, then the first step is to convert that top-level schematic to an HDL file, so that the simulator can read it. This is an automatic process, which uses the Hierarchy Navigator tool. First, open the Hierarchy Navigator with the Design > Navigate Hierarchy menu command in SpDE. You will be asked to select a .TRE file. If you have never used the Hierarchy Navigator before, you won't have a .TRE file, so you should click the NEW button, which will change the window options so that you can select the top-level schematic (.SCH) file. Once you have selected this file, the design will be loaded into the Hierarchy Navigator. From the Hierarchy Navigator, you create an HDL netlist with the Tools > Export QuickLogic menu command. This will bring up a dialog box like the one below.

Make sure you select a netlist format appropriate for the simulator you are using to simulate the design. If (as in this example) you can ONLY select Verilog, that is because there are Verilog blocks located in the hierarchy of the design. In that case, the Silos (Verilog) simulator must be used for simulation.

The device selection is important at this stage. If you intend to simulate with the PCI32 core, then you MUST select the QL5032-33 device. If you are simulating a small function within the design and you do not want to use the PCI32 core, then select a device from the QuickRAM (QL4xxx) family, such as the QL4058. Also, choose the appropriate device package now, especially if you are ready to target the QL5032 with a full design. These settings will be used later (after simulation) in the compilation process.

The first check box (Display busses as [L:H] in Waveform tools) is an option which you should never need to use, and is provided only for backward compatibility with older QuickLogic designs.

The second check box (Preserve schematic structure through synthesis) has no effect on simulation, but it does affect the synthesis process for the design (described later in this chapter). This option need only be selected if you have carefully hand-crafted schematic functions using knowledge of the QuickLogic logic cell structure. See the QuickWorks User's Guide for more details.

The final check box allows you to run a syntax check on the netlist you are creating. If you have HDL blocks within the hierarchy of your design that you have written yourself, you should check this box so that a syntax check can find problems before you simulate the design.

Once you have selected all appropriate settings, click the OK button to create an HDL netlist. Once the process is finished, you will see a window like this:

You may click Done or View Messages if notes or warning messages were created during the netlisting process. Next, select File > Save from the Hierarchy Navigator so that all of your netlisting options are remembered for the next time.

Step 2: Create a Test Fixture (or Test Bench)


In order to simulate your design, you need to have an HDL file (called a test bench or a test fixture) which stimulates the inputs of your design in order to create a desired response. If you are simulating a piece of your QL5032 design which does not contain the PCI32 module at this time, then you will need to create this test fixture manually. If you are testing the entire QL5032 including the PCI32 module, then you can start from the PCI test-fixture template provided by QuickLogic, which simulates the PCI bus and allows you to create PCI commands from a separate (simulation) Master device on the bus. Contact QuickLogic for the availability of the PCI test-fixture template.

You may also create a Verilog test fixture with a graphical waveform generation tool, if desired. This will allow you to create simple waveforms. You can access the Waveform Editor with the Simulation > Waveform Editor command. See the QuickWorks User's Guide for more information on using the Waveform Editor.

Step 3: Set up the Simulation Project File


If you are performing a Verilog simulation with the Silos simulator, you can use an automated interface to set up your simulation project. VHDL simulation projects must be set up differently. For more information about VHDL simulation, consult the Veribest VBVHDL User's Guide.

To set up your Silos Verilog simulation, go to the Simulator Setup window. This can be accessed from the Hierarchy Navigator (if you have a top-level schematic) with the menu command Simulation > Run Simulation. If your top-level simulation file is not a schematic but a Verilog file, open this window from the SpDE menus with the command Design > Run Simulation. The Simulation Setup Options window looks like the window below:

The first two text boxes allow you to specify the location of your top-level design file and test fixture (both in Verilog format). When using the QuickWorks tools, it will help to name your test fixture with the file extension .TF. In this case, if a file is found with the same name as the design loaded into SpDE or the Hierarchy Navigator, with the .TF extension, then the first field in this window is automatically filled for you. The same is true for the second field (top-level module), if there is a file with a .V extension. You may browse your hard drive to specify the files if needed.

The Add Verilog Library option allows you to specify one QuickLogic-specific library to load for the simulation. If you are simulating with the PCI32 core within the design, then the pci3233m.v file should be selected. If you chose the QL5032-33 device when you exported your Verilog netlist from the Hierarchy Navigator in Step 1 (and if you are using a top-level schematic), then this selection is already made for you, as in the window shown.

Make sure that the Pre-Layout option is selected. If you have never run a post-layout simulation with the current project, then Pre-Layout will be the only option available. Once you have made all selections, click OK to open the Silos Simulator.

Step 4: Run the Simulation and Check Results


The simulation is performed with the Silos Simulator. Results can be verified by looking at waveforms using the Silos Data Analyzer tool, or by creating test cases within your Verilog test fixture that look for the correct response of your design and indicate with a pass or fail whether the response is valid (along with diagnostic information). For help with the Silos Simulation Tools, consult the QuickWorks Users Manual.

Compilation
Now that you have completed functional simulation, the next step is to compile the QL5032 design into a QL5032 device. This Users Guide will take you through the basic steps you should follow to compile a QL5032 design. For a more comprehensive discussion of design flows and using the QuickWorks software, refer to the QuickWorks Users Guide. Since the QL5032 design contains at least two HDL blocks (the DMA Controller and the Configuration Space and Addressing block if you used the QuickLogic reference designs), you will need to synthesize the design before placing and routing. This is accomplished automatically when using QuickWorks.

Step 1. Synthesize the design


SpDE is the primary design management tool within the QuickWorks product. Open SpDE from the Start Menu in Windows: Start > Programs > QuickLogic > SpDE.

Note: SpDE is an acronym, pronounced Spee-dee, which stands for the Seamless pASIC Design Environment. pASIC is a trademarked acronym which stands for programmable ASIC, and is pronounced p-ace-ik. To start the synthesis tool, called Synplify-Lite, you simply need to select the menu command: File > Import Verilog, or File > Import VHDL (depending on which language you selected to use for your project). Then select the top-level design name in your project, such as myproject1.v. This will open the Synplify-Lite window. Within the Synplify-Lite synthesis window you should see fields for the device name and package. Please verify that the device name is QL5032 and the package is the package you require for your project (PQ208 for a 208 PQFP package, and PB256 for a 256 pin plastic BGA package). Then click the RUN button to compile your design. If you get any errors or warnings, you can review the errors or warnings by clicking the View Log button, which will open the Turbo Writer Editor with your error file. With the error file open, you can click on the ERR button in the Turbo Writer toolbar to scroll through your errors, while the editor takes you to the line of your source file where the problems were found. If you get warnings, and you decide after reviewing the warnings that you want to continue the compilation without changing the source files, then you should manually close the Synplify-Lite window.

Step 2. Set up your Design Constraints


Inside the SpDE tool, you can specify Design Timing, Placement, and Pinout Constraints (although Placement Constraints are rarely if ever needed). The Constraint Editor can be opened with the menu command: Tools > Constraints > Timing (for timing constraints), and Tools > Constraints > Placement (for placement and pinout constraints). Note: When entering pinout constraints, it is important to constrain pins only on compatible pin types. There are two types of user pins on the QL5032: normal I/O pins (I/O), and global network pins (GCLK/I). Global network pins can only be used as inputs, not outputs. If you wish to place a signal on a global network pin, then you must have used a CKPAD for that pin in the schematic drawing, or in the HDL instantiation. Only 6 global network (GCLK/I) pins are available in the QL5032. Please check the QuickWorks Users Guide for more information on these pins. Refer to the QuickWorks Users Guide for help with using the Constraint Editor.

Step 3. Logic Optimization, Placement, and Routing


Once the design is synthesized, and the netlist is automatically loaded into SpDE, the logic optimization, placement, and routing processing may be started. Before running these tools, open the Tools Options to verify that the tools have been set up properly. Do this with the Tools > Options command in the SpDE menu. First click on the Place and Route tab. Set the Placer Mode to Preliminary, for much faster placement time. Results are usually very good, and this saves a lot of time during development. When the project gets near completion, you can change the placer mode to Quality to see if the results improve. Quality Placement takes about 6 times longer than Preliminary Placement. Results can be from 0 to 20 percent faster, depending on the design. Before saving these options, click on the Delay Modeler tab. Select the appropriate Speed Grade (A, or the faster B speed grade). Most designs should be able to meet local clock speed requirements with the A speed grade. Either speed grade will meet all PCI specifications. Next, select the Back Annotation Tab and make sure that you have selected the appropriate simulator for timing back annotation. In the QL5032 design flow, you will choose either Silos III (for Verilog simulation) or Vital 3.0 Compliant (for VHDL simulation with the VBVHDL simulator).

Once you verify the tools options, click Save Settings, and then Close from the Tools Options window. Then from SpDE, you can start the logic optimization, placement and routing process by going to the Tools > Run Tools menu command (or click on the Hammer in the toolbar). Make sure all tools are selected and click RUN. SpDE will then run all tools. Depending on the Placer Mode you selected (Preliminary or Quality), the size of the design, the speed of the PC, and the memory of the PC, this process can take from a couple of minutes to nearly an hour.

Step 4. Checking Design Timing


When the tools have completed, you will get a dialog box asking you if you want to view the Report File. Click YES. The report file will open. You can then check the timing results in the Timing section of the report file, which will show you if you met your design requirements. Also, once the tools have finished running, you can get a detailed performance analysis with the Path Analyzer, which can be opened with the Tools > Path Analyzer menu command. For details on using the Path Analyzer, refer to the QuickWorks Users Guide.

Post-Layout (Timing) Simulation


The post-layout simulation process is very similar to the pre-layout simulation process. Since you already have a test fixture from the pre-layout simulation, you simply need to specify the simulation options so that the post-layout files are selected for simulation instead of the pre-layout files. This is a simple process. Go to the Simulation Setup Options window. Remember, this is done by clicking on the Simulation > Run Simulation menu command in the Hierarchy Navigator, or on the Design > Run Simulation menu command in SpDE. From this window, make sure you select the Post-Layout simulation option. If this option is greyed out, then make sure the filenames are correct for the top-level design and test fixture. If you need to enter the top-level design name, make sure you select the file with the extension .VQ for the top-level design, since this is the post-layout Verilog netlist created by SpDE. If this file is not present in your project directory, you need to make sure you select the correct simulator in Tools > Options under Back Annotation, and re-run the Back Annotation tool. No Verilog libraries need be specified for post-layout simulation of the QL5032. For more information or help with using the Silos simulator, refer to the QuickWorks Users Guide.

Programming a QL5032 Device


Once you have created a QL5032 design that meets functional and timing requirements, you may program a QuickLogic device. The programming file is created from SpDE (when you Save), and it has the extension .CHP. This file may be compressed with popular compression tools for easy emailing or transfer via floppy disk to other PCs. QuickLogic devices can only be programmed with a QuickLogic DeskFab programmer using the QuickPro programming software. Also, QuickLogic distributors can often arrange for programming larger quantities of devices. If you have questions, contact your local QuickLogic sales representative for programming options that are available to you.

Appendix A: QL5032 PCI Interface Functional Description


The QL5032 contains a PCI master/target interface. The master interface is capable of generating read or write transactions on the PCI bus, and is capable of zero wait-state burst operations. The internal interface of the QL5032 is designed to work with a user-created DMA controller for efficiently moving blocks of data. Since the DMA controller resides in the FPGA logic and can be created by the user, advanced features such as DMA chaining or a multi-channel DMA controller can be created in the QL5032 if needed. The target interface is slower and operates with two wait states. It is meant simply for accessing the PCI configuration registers and registers in the DMA controller. For quickly moving blocks of data, it is suggested the master interface be used via a DMA controller.

Internal Port Descriptions


In the table below, signals which end in the character N should be considered active-low (for example, Mst_IRDYN).

Master
Signal                 Type  Description
Mst_WrAd[31:0]         I     Address for master DMA writes. This address must be treated as valid from the beginning of a DMA burst write until the DMA write operation is complete. It must be incremented (by 4) each time data is transferred on the PCI bus.
Mst_RdAd[31:0]               Address for master DMA reads. This address must be treated as valid from the beginning of a DMA burst read until the DMA read operation is complete. It must be incremented (by 4) each time data is transferred on the PCI bus.
Mst_WrMode             I     DMA state machine in write mode.
Mst_RdMode             I     DMA state machine in read mode.
Mst_Burst_Req          I     Request use of the PCI bus.
Mst_One_Read           I     One data transfer remains in the burst.
Mst_Two_Reads          I     Two or less data transfers remain in the burst.
Mst_WrData[31:0]       I     Data for master DMA writes (to PCI bus).
Mst_WrData_Valid       I     Data valid on Mst_WrData[31:0].
Mst_WrData_Rdy         O     Data receive acknowledge for Mst_WrData[31:0].
Mst_WrBurst_Done       O     Master write pipeline is empty.
Mst_RdData[31:0]       O     Data for master DMA reads (from PCI bus).
Mst_RdData_Valid       O     Data valid on Mst_RdData[31:0].
Mst_RdBurst_Done       O     Master read pipeline is empty.
Mst_RdCmd[1:0]         I     Type of PCI read command to be used for DMA reads: 00 or 01 = Memory Read; 10 = Memory Read Line; 11 = Memory Read Multiple.
Mst_LatCntEn           I     Enable Latency Counter. Set to 1 to ignore the Latency Timer in the PCI configuration space (offset 0Ch).
Mst_Xfer_D1            O     Data was transferred on the previous PCI clock.
Mst_Last_Cycle         O     Active during the last data transfer of a PCI master transaction.
Mst_REQN               O     The PCI REQN signal generated by the QL5032 as PCI master.
Mst_IRDYN              O     The PCI IRDYN signal generated by the QL5032 as PCI master.
Mst_Tabort_Det         O     Target abort detected during master transaction.
Mst_TTO_Det            O     Target timeout detected (no response from target).
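The Mst_RdCmd[1:0] encoding listed in the Master table can be sketched as a small lookup. This is an illustrative Python model, not QuickLogic code. The 4-bit command values are the standard encodings from the PCI Local Bus Specification rather than values taken from this guide, and the function name master_read_command is hypothetical.

```python
# Standard PCI bus command encodings, as driven on the command/byte-enable
# lines during the address phase (from the PCI specification, not this guide).
PCI_MEMORY_READ = 0b0110
PCI_MEMORY_READ_LINE = 0b1110
PCI_MEMORY_READ_MULTIPLE = 0b1100


def master_read_command(mst_rdcmd):
    """Map the 2-bit Mst_RdCmd[1:0] value to the PCI read command it selects."""
    if not 0 <= mst_rdcmd <= 3:
        raise ValueError("Mst_RdCmd is a 2-bit field")
    if mst_rdcmd in (0b00, 0b01):
        return PCI_MEMORY_READ
    if mst_rdcmd == 0b10:
        return PCI_MEMORY_READ_LINE
    return PCI_MEMORY_READ_MULTIPLE
```

Note that both 00 and 01 select a plain Memory Read, so a DMA controller that never uses the line or multiple variants can tie the upper bit low.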

Target
Signal                 Type  Description
Usr_Addr_WrData[31:0]  O     Target address and data from target writes. During all target accesses, the address will be presented on Usr_Addr_WrData[31:0] and simultaneously, Usr_Adr_Valid will be active. During target write transactions, this port will present write data to the PCI configuration space or user logic.
Usr_CBE[3:0]                 PCI command and byte enables. During target accesses, the PCI command will be presented on Usr_CBE[3:0] and simultaneously, Usr_Adr_Valid will be active. During target read or write transactions, this port will present active-low byte-enables to the PCI configuration space or user logic.
Usr_Adr_Valid                Indicates the beginning of a PCI transaction, and that a target address is valid on Usr_Addr_WrData[31:0] and the PCI command is valid on Usr_CBE[3:0]. When this signal is active, the target address must be latched and decoded to determine if this address belongs to the device's memory space. Also, the PCI command must be decoded to determine the type of PCI transaction. On subsequent clocks of a target access, this signal will be low, indicating that data (not an address) is present on Usr_Addr_WrData[31:0].
Usr_Adr_Inc                  Indicates that the target address should be incremented, because the previous data transfer has completed. During burst target accesses, the target address is only presented to the back-end logic at the beginning of the transaction (when Usr_Adr_Valid is active), and must therefore be latched and incremented (by 4) for subsequent data transfers.
Usr_WrReq                    This signal will be active for the duration of a target write transaction, and may be used by back-end logic to turn on output-enables for transmitting the data off-chip.
Usr_RdDecode                 Active when a user read command has been decoded from the Usr_CBE[3:0] bus. This command may be mapped from any of the PCI read commands, such as Memory Read, Memory Read Line, Memory Read Multiple, I/O Read, etc.
Usr_WrDecode                 Active when a user write command has been decoded from the Usr_CBE[3:0] bus. This command may be mapped from any of the PCI write commands, such as Memory Write or I/O Write.
Usr_Select                   The address on Usr_Addr_WrData[31:0] has been decoded and determined to be within the address space of the device. Usr_Addr_WrData[31:0] must be compared to each of the valid Base Address Registers in the PCI configuration space. Also, this signal must be gated by the Memory Access Enable or I/O Access Enable registers in the PCI configuration space (Command Register bits 1 or 0 at offset 04h).
Usr_Write              O     Write enable for data on Usr_Addr_WrData[31:0] during PCI writes.
Cfg_Write              O     Write enable for data on Usr_Addr_WrData[31:0] during PCI configuration write transactions.
Cfg_RdData[31:0]       I     Data from the PCI configuration registers, required to be presented during PCI configuration reads.
Usr_RdData[31:0]       I     Data from the back-end user logic (and/or DMA configuration registers), required to be presented during PCI reads.
Cfg_CmdReg8, Cfg_CmdReg6  I  Data from the Command Register in the PCI configuration space (offset 04h).
Cfg_LatCnt[7:0]        I     Data from the Latency Timer in the PCI configuration space (offset 0Ch).
Usr_MstRdAd_Sel        I     Master Read Address from the DMA configuration registers.
Usr_MstWrAd_Sel        I     Master Write Address from the DMA configuration registers.
Cfg_PERR_Det           O     Parity error detected on the PCI bus. When this signal is active, bit 15 of the Status Register must be set in the PCI configuration space (offset 04h).
Cfg_SERR_Sig                 System error asserted on the PCI bus. When this signal is active, the Signalled System Error bit, bit 14 of the Status Register, must be set in the PCI configuration space (offset 04h).
Cfg_MstPERR_Det              Data parity error detected on the PCI bus by the master. When this signal is active, bit 8 of the Status Register must be set in the PCI configuration space (offset 04h).
Usr_TRDYN              O     Copy of the TRDYN signal as driven by the PCI target interface.
Usr_STOPN              O     Copy of the STOPN signal as driven by the PCI target interface.
Usr_Devsel             O     Inverted copy of the DEVSELN signal as driven by the PCI target interface.
Usr_Last_Cycle_D1      O     Last transfer in a PCI transaction is occurring.
Usr_Stop               I     Used to prematurely stop a PCI target access on the next PCI clock.
Usr_Interrupt          I     Used to signal an interrupt on the PCI bus.
PCI_clock              O     PCI clock.
PCI_reset              O     PCI reset signal.
PCI_IRDYN_D1           O     Copy of the IRDYN signal from the PCI bus, delayed by one clock.
PCI_FRAMEN_D1          O     Copy of the FRAMEN signal from the PCI bus, delayed by one clock.
PCI_DEVSELN_D1         O     Copy of the DEVSELN signal from the PCI bus, delayed by one clock.
PCI_TRDYN_D1           O     Copy of the TRDYN signal from the PCI bus, delayed by one clock.
PCI_STOPN_D1           O     Copy of the STOPN signal from the PCI bus, delayed by one clock.
PCI_IDSEL_D1           O     Copy of the IDSEL signal from the PCI bus, delayed by one clock.
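The three error-detect signals described above each set one bit of the Status Register. The following Python sketch models that bookkeeping under the assumption that the Status Register is held as a 16-bit value and that bits stay set once raised; the constant and function names are illustrative only.

```python
# Status Register bit positions named in the port descriptions.
STATUS_PARITY_ERROR_DETECTED = 1 << 15     # set when Cfg_PERR_Det is active
STATUS_SIGNALLED_SYSTEM_ERROR = 1 << 14    # set when Cfg_SERR_Sig is active
STATUS_MASTER_DATA_PARITY_ERROR = 1 << 8   # set when Cfg_MstPERR_Det is active


def update_status(status, perr_det, serr_sig, mst_perr_det):
    """Return the 16-bit Status Register value after sampling the detect signals.

    Bits are only ever set here; clearing them is left to the configuration
    space logic and is outside this sketch.
    """
    if perr_det:
        status |= STATUS_PARITY_ERROR_DETECTED
    if serr_sig:
        status |= STATUS_SIGNALLED_SYSTEM_ERROR
    if mst_perr_det:
        status |= STATUS_MASTER_DATA_PARITY_ERROR
    return status & 0xFFFF
```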

PCI
Signal                 Type  Description
AD[31:0]               B     PCI Address/Data bus.
CBEN[3:0]              B     PCI Command/Byte Enable bus.
INTAN                  O     PCI Interrupt.
SERRN                  O     PCI System Error.
PERRN                  B     PCI Parity Error.
PAR                    B     PCI Parity signal for AD[31:0] and CBEN[3:0].
REQN                   O     PCI bus request.
DEVSELN                B     PCI device select.
TRDYN                  B     PCI target ready.
STOPN                  B     PCI stop.
FRAMEN                 B     PCI frame.
IRDYN                  B     PCI initiator (master) ready.
IDSEL                  I     PCI ID select, for configuration accesses.
GNTN                   I     PCI bus grant.
CLK                    I     PCI clock.
RSTN                   I     PCI reset.

Waveforms
Single-Dword Configuration Read

On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI host controller (master) has asserted IDSEL and provided a PCI configuration register address on AD[31:0] and a configuration read command on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has deasserted FRAMEN to indicate that it only wishes to perform one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction. TRDYN is deasserted, indicating a wait-state inserted by the target. Clock edges 5 through 7 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Cfg_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 8 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to only read one register location, FRAMEN is seen deasserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer has occurred on the PCI bus. Clock edge 9 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.
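The rules used in the walkthrough above to name each clock edge can be written down as a small classifier. This is a minimal Python sketch, assuming the three control signals are sampled as active-low values (0 = asserted) on each rising clock edge. The function name classify_edges and the label strings are hypothetical. A turn-around cycle, where all control signals are deasserted, is reported as "idle" because this sketch looks only at signal levels.

```python
def classify_edges(samples):
    """Label each sampled clock edge of a PCI transaction.

    samples is a list of (FRAMEN, IRDYN, TRDYN) tuples, all active-low.
    """
    labels = []
    prev_framen = 1  # bus assumed idle before the first sample
    for framen, irdyn, trdyn in samples:
        if prev_framen == 1 and framen == 0:
            label = "address phase"          # FRAMEN newly asserted
        elif framen == 1 and irdyn == 1:
            label = "idle"                   # FRAMEN and IRDYN both deasserted
        elif irdyn == 0 and trdyn == 0:
            # Data moves only when master and target are both ready.
            label = "last data transfer" if framen == 1 else "data transfer"
        else:
            label = "wait state"
        labels.append(label)
        prev_framen = framen
    return labels
```

Feeding it the single-dword read described above (IRDYN is assumed asserted from clock edge 3 onward, which the text does not state explicitly) gives idle, address phase, five wait states, last data transfer, and idle for clock edges 1 through 9.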

Burst Configuration Read

On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI host controller (master) has asserted IDSEL and provided a PCI configuration register address on AD[31:0] and a configuration read command on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction. TRDYN is deasserted, indicating a wait-state inserted by the target. Clock edges 5 through 7 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Cfg_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 8 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to read more than one register location, FRAMEN is seen still asserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer has occurred on the PCI bus. Clock edges 9 through 12 are wait-states since TRDYN is deasserted. Since FRAMEN is deasserted on the PCI bus, only one more data transfer will take place. Prior to clock 12, the next double-word of read data must be presented to Cfg_RdData[31:0]. Clock edge 13 represents the last data transfer in this transaction. FRAMEN is deasserted, and both IRDYN and TRDYN are asserted. On the Interface Ports, Usr_Adr_Inc is active. Clock edge 14 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. 
On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.

Single-Dword Configuration Write

On clock edge 1, the PCI bus is idle, because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. IDSEL is active, indicating that this will be a configuration access. The value on CBEN[3:0] represents the configuration write command. The value on AD[31:0] is 10h, representing Base Address Register 0 in the PCI configuration space. Clock edge 3 is a wait state inserted by the QL5032. The QL5032 contains a medium-speed target, so it isn't able to claim PCI transactions on the clock after the address cycle (clock edge 2). Here the PCI host controller is seen driving the data that it wishes to write to the configuration register at offset 10h, along with the byte enables on CBEN[3:0]. It has also deasserted FRAMEN, indicating that it only wants to perform one data transfer. On the interface ports to the FPGA logic, the PCI core presents the address and command from clock edge 2 on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. It also asserts Usr_Adr_Valid to indicate that the value on Usr_Addr_WrData[31:0] is an address, and that a new PCI transaction is beginning. On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte-enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Cfg_Write on the internal interface of the PCI core is asserted, indicating that a write to the PCI configuration space is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted. 
This is the last data transfer in the transaction since FRAMEN is deasserted. Clock edge 7 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction.

Burst Configuration Write

On clock edge 1, the PCI bus is idle, because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. IDSEL is active, indicating that this will be a configuration access. The value on CBEN[3:0] represents the configuration write command. The value on AD[31:0] is 04h, representing the Status/Command Register in the PCI configuration space. Clock edge 3 is a wait state inserted by the QL5032. The QL5032 contains a medium-speed target, so it isn't able to claim PCI transactions on the clock after the address cycle (clock edge 2). Here the PCI host controller is seen driving the data that it wishes to write to the configuration register at offset 04h, along with the byte enables on CBEN[3:0]. It has also kept FRAMEN active, indicating that it wants to perform more than one data transfer. On the interface ports to the FPGA logic, the PCI core presents the address and command from clock edge 2 on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. It also asserts Usr_Adr_Valid to indicate that the value on Usr_Addr_WrData[31:0] is an address, and that a new PCI transaction is beginning. On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte-enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Cfg_Write on the internal interface of the PCI core is asserted, indicating that a write to the PCI configuration space is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted. 
Since FRAMEN is still asserted, at least one more data transfer will occur. Clock edges 7 and 8 are wait states inserted by the PCI target in the QL5032, since TRDYN is deasserted. On clock edge 9 both IRDYN and TRDYN are asserted, indicating the target has accepted the second piece of data. Usr_Adr_Inc is asserted on the Interface Ports, indicating that the write to the PCI configuration register should take place. Note that the first write occurred to offset 04h in the PCI configuration space on clock edge 6, and that the address for that register was provided at the beginning of the PCI transaction. For this second data transfer, it is implied that the write occurs to the next double-word in the PCI configuration space, offset 08h (the Class Code/Revision ID register). For this reason, the address provided at the beginning of the PCI access must be stored and automatically incremented (by 4) each time Usr_Adr_Inc is active. Note that offset 08h in the PCI configuration space is a read-only register, and that the write operation will not overwrite any data. Clock edge 10 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction.
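The store-and-increment behavior required of the back-end logic can be modeled in a few lines. This is a behavioral Python sketch only, not synthesizable logic; the class name TargetAddressCounter and its clock method are hypothetical, and each call stands for one PCI clock.

```python
class TargetAddressCounter:
    """Latches the target address and advances it by 4 per completed transfer."""

    def __init__(self):
        self.address = None  # nothing latched until the first address phase

    def clock(self, adr_valid, adr_inc, addr_wrdata):
        """Process one clock; return the address of a completed transfer, or None."""
        completed = None
        if adr_valid:
            # Usr_Adr_Valid: the bus carries an address, so latch it.
            self.address = addr_wrdata
        elif adr_inc and self.address is not None:
            # Usr_Adr_Inc: the transfer at the current address has completed.
            completed = self.address
            self.address = (self.address + 4) & 0xFFFFFFFF
        return completed
```

Replaying the burst above, the address 04h is latched when Usr_Adr_Valid is active, the first Usr_Adr_Inc reports a transfer at 04h, and the second reports a transfer at 08h.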

Target Read

On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI master has provided a PCI memory address on AD[31:0] and a read command on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. As soon as Usr_Adr_Valid is active, the PCI address needs to be decoded and Usr_Select asserted if the address belongs to the QL5032's memory space. Also, Usr_RdDecode must be asserted for any target read command appearing on Usr_CBE[3:0]. Usr_RdDecode may be mapped from any of the PCI read commands, which include memory read, memory read line, and memory read multiple for memory base addresses, and I/O read for I/O base addresses. On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction, since Usr_Select and Usr_RdDecode are seen asserted before this clock edge. TRDYN is deasserted, indicating a wait-state inserted by the target.

Clock edges 5 and 6 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Usr_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 7 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to read more than one register location, FRAMEN is seen still asserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer has occurred on the PCI bus. Clock edges 8 through 10 are wait-states since TRDYN is deasserted. Since FRAMEN is deasserted on the PCI bus, only one more data transfer will take place. Prior to clock 10, the next double-word of read data must be presented to Usr_RdData[31:0]. Clock edge 11 represents the last data transfer in this transaction. FRAMEN is deasserted, and both IRDYN and TRDYN are asserted. On the Interface Ports, Usr_Adr_Inc is active. Clock edge 12 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.
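The decode that the user logic performs when Usr_Adr_Valid is active can be sketched as follows. This is an illustrative Python model under several assumptions: each Base Address Register is described by a hypothetical (base, size, is_io) tuple, the command values are the standard PCI encodings rather than values from this guide, and the function name decode_target_access is not a QuickLogic name.

```python
# Standard PCI bus command encodings (from the PCI specification).
PCI_IO_READ = 0b0010
PCI_IO_WRITE = 0b0011
PCI_MEMORY_READ = 0b0110
PCI_MEMORY_WRITE = 0b0111
PCI_MEMORY_READ_MULTIPLE = 0b1100
PCI_MEMORY_READ_LINE = 0b1110

IO_ACCESS_ENABLE = 1 << 0      # Command Register bit 0
MEMORY_ACCESS_ENABLE = 1 << 1  # Command Register bit 1

READ_COMMANDS = (PCI_IO_READ, PCI_MEMORY_READ,
                 PCI_MEMORY_READ_LINE, PCI_MEMORY_READ_MULTIPLE)
WRITE_COMMANDS = (PCI_IO_WRITE, PCI_MEMORY_WRITE)


def decode_target_access(address, command, command_reg, bars):
    """Return (Usr_Select, Usr_RdDecode, Usr_WrDecode) for one address phase.

    bars is a list of (base, size, is_io) tuples, one per valid BAR.
    """
    select = False
    for base, size, is_io in bars:
        enable_bit = IO_ACCESS_ENABLE if is_io else MEMORY_ACCESS_ENABLE
        # A BAR only matches while its access type is enabled.
        if (command_reg & enable_bit) and base <= address < base + size:
            select = True
    return select, command in READ_COMMANDS, command in WRITE_COMMANDS
```

With a single 256-byte memory BAR at 1000h and memory access enabled, a Memory Read of 1004h yields (True, True, False); clearing Command Register bit 1 drops Usr_Select for the same access.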

Target Write

On clock edge 1, the PCI bus is idle, because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. The first PCI address for this transaction is present on AD[31:0] and the memory write command is present on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. As soon as Usr_Adr_Valid is active, the PCI address needs to be decoded and Usr_Select asserted if the address belongs to the QL5032's memory space. Also, Usr_WrDecode must be asserted for any target write command appearing on Usr_CBE[3:0]. On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte-enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Usr_WrReq has been asserted. This signal will stay asserted for the duration of the PCI transaction, and may be used to turn on output-enables if the data will be transmitted off-chip. Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Usr_Write on the internal interface of the PCI core is asserted, indicating that a write to the user logic is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted. Since FRAMEN is still asserted, at least one more data transfer will occur. Clock edges 7 and 8 are wait states inserted by the PCI target in the QL5032, since TRDYN is deasserted. On clock edge 9 both IRDYN and TRDYN are asserted, indicating the target has accepted the second piece of data. Usr_Adr_Inc is asserted on the Interface Ports, indicating that the write to the memory location should take place. Note that the first write occurred to memory location 100h on clock edge 6, and that the address for that location was provided at the beginning of the PCI transaction. For this second data transfer, it is implied that the write occurs to the next double-word in the memory space, at 104h. 
For this reason, the address provided at the beginning of the PCI access must be stored and automatically incremented (by 4) each time Usr_Adr_Inc is active. Clock edge 10 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction. Usr_WrReq is deasserted after clock edge 10.

Master DMA Read

Previous to clock edge 1 Mst_Burst_Req and Mst_RdMode have been asserted, and a PCI address provided to the Mst_RdAd[31:0] interface port. The QL5032 responds by asserting its REQN output to request use of the PCI bus. At clock edge 2 the arbiter has granted the PCI bus to the master by asserting its GNTN input. At clock edge 3 the master is seen driving the PCI address and command on AD[31:0] and CBEN[3:0]. It is performing address stepping, so it has not yet asserted its FRAMEN signal. Clock edge 4 represents the beginning of a PCI transaction. FRAMEN is seen asserted and AD[31:0] and CBEN[3:0] are valid. Clock edge 5 is a wait-state inserted by the target because it has not yet claimed the transaction by asserting DEVSELN. The first double-word of data is read at clock edge 6 as the data is driven on AD[31:0] and both IRDYN and TRDYN are active. At clock edge 7 the second double-word of data is transferred on the PCI bus. On the interface ports, Mst_Xfer_D1 is active, indicating that data was transferred on the PCI bus on the previous clock, and that the read address should be incremented by the DMA controller. This value is shown on Mst_RdAd[31:0]. It is not used except at the very beginning of the PCI transaction, but must be kept current in case the DMA operation is interrupted and must be split across multiple PCI transactions. Clock edges 8 and 9 represent double-word transfers on the PCI bus. At clock edge 9 on the interface ports, the read data from the first data transfer is now present on Mst_RdData[31:0]. Mst_RdData_Valid is active to indicate that the data is valid.


Clock edges 10 through 13 represent more data transfers in this PCI transaction. At clock edge 14 Mst_Two_Reads is seen active, indicating that the DMA controller needs to perform two or fewer transfers. This causes FRAMEN to be deasserted on the PCI bus after clock edge 14. On clock edge 15 the last data transfer on the PCI bus occurs because FRAMEN is inactive while IRDYN and TRDYN are active. Mst_One_Read is active to indicate that the DMA controller needs to perform only one more transfer. This last transfer occurred on clock edge 15 on the PCI bus. Mst_Last_Cycle is active to indicate that this PCI transaction is ending. Clock edge 16 represents the turn-around cycle on the PCI bus. The read pipeline continues to be cleared through clocks 17 and 18, as data is present on Mst_RdData[31:0] and Mst_RdData_Valid is active.


Master DMA Write

Prior to clock edge 1, Mst_Burst_Req and Mst_WrMode have been asserted, and a PCI address has been provided to the Mst_WrAd[31:0] interface port. The QL5032 responds by asserting its REQN output to request use of the PCI bus. At clock edge 2 the arbiter has granted the PCI bus to the master by asserting its GNTN input. At clock edge 3 the master is seen driving the PCI address and command on AD[31:0] and CBEN[3:0]. It is performing address stepping, so it has not yet asserted its FRAMEN signal.

Clock edge 4 represents the beginning of a PCI transaction. FRAMEN is seen asserted and AD[31:0] and CBEN[3:0] are valid. Clock edge 5 is a wait-state inserted by the target because it has not yet claimed the transaction by asserting DEVSELN. The first double-word of write data is presented on AD[31:0]. On the interface ports, a double-word of data is placed into the write pipeline since data is valid on Mst_WrData[31:0] and both Mst_WrData_Valid and Mst_WrData_Rdy are active. Note that in the waveforms shown above, this is the third double-word that was written to the write pipeline. The first two double-words were transferred at an earlier time.

On clock edge 6 the first double-word of data is transferred to the target, as IRDYN and TRDYN are both active. This causes Mst_Xfer_D1 to go active after clock edge 6. At clock edge 7 the second double-word of data is transferred on the PCI bus. On the interface ports, Mst_Xfer_D1 is active, indicating that data was transferred on the PCI bus on the previous clock, and that the write address should be incremented by the DMA controller. This value is shown on Mst_WrAd[31:0]. It is not used except at the very beginning of the PCI transaction, but must be kept current in case the DMA operation is interrupted and must be split across multiple PCI transactions.


Clock edges 8 and 9 represent double-word transfers on the PCI bus. On the interface ports, write data is sent into the write pipeline. Mst_WrData_Rdy is generated to acknowledge that data has been accepted, and that the next double-word of data can be sent. If there are wait-states on the PCI bus such that the write pipeline is full and is not ready to accept new data, Mst_WrData_Rdy will be inactive, as is the case on clock edge 6.
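The handshake described above can be sketched as a single qualifying term in the back-end logic. This is an illustration under stated assumptions: the module name wr_handshake and the Fifo_Pop output are hypothetical, and a real back-end FIFO may need additional pipelining to match its read latency.

```verilog
// Hypothetical glue between a back-end write FIFO and the write pipeline.
module wr_handshake (Mst_WrData_Valid, Mst_WrData_Rdy, Fifo_Pop);
  input  Mst_WrData_Valid;  // back end is presenting valid write data
  input  Mst_WrData_Rdy;    // write pipeline accepts data on this clock
  output Fifo_Pop;          // assumed FIFO control: advance to the next DWORD

  // A DWORD enters the write pipeline only when both signals are active,
  // so the FIFO advances on exactly those clocks.
  assign Fifo_Pop = Mst_WrData_Valid & Mst_WrData_Rdy;
endmodule
```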

Clock edges 10 through 13 represent more data transfers in this PCI transaction. At clock edge 14 the last double-word of write data is transferred to the write pipeline. On clock edges 15 and 16 the last two data transfers on the PCI bus occur. Mst_Last_Cycle is active on clock edge 16 to indicate that this PCI transaction is ending. Clock edge 17 represents the turn-around cycle on the PCI bus.


Appendix B: DMA Controller Reference Design


The PCI32 core built into the QL5032 device allows burst-level control of PCI Mastering. However, since either the target agent or a master burst time-out event may interrupt a PCI burst, a DMA Controller is necessary to set up long bursts, and keep track of the start address for successive bursts. Typically, a system-level software driver would set up DMA transactions by writing to DMA Registers within the DMA Controller. The QL5032 allows total flexibility to the designer in the area of DMA Control. In order to speed up the design cycle and to present minimal complications to the designer, QuickLogic has included a reference DMA Controller design. This design may be used as-is or modified according to the exact needs of the application.

Summary of DMA Features


The Reference DMA Design provided by QuickLogic contains the following features. By enhancing the reference DMA controller, the designer may add to this list of features, or take non-essential features out of the design to reduce size.

- PCI Memory Mapped Registers to set up DMA transfers via the Target Interface
- 32-bit DWORD Write Address and Read Address Registers (for DWORD-aligned start addresses)
- 16-bit DWORD Read and Write Length Registers for up to (2^16)-1 transfers
- 8-bit DWORD PCI burst counter, for up to 255 DWORD transfers per burst on PCI
- Latency Timer Ignore Option, for bypassing the PCI specification which requires a master to give up the bus when its Latency Timer expires and Grant is not asserted
- Support for setting up a Read and Write DMA transaction at the same time. The DMA Controller will arbitrate for both transfers until they are both complete.
- Support for the Memory Read Line and Memory Read Multiple commands for PCI Read DMA operations

DMA Operation Overview


A typical DMA session will be executed as follows:

1. The Driver will use a Target Write command to set up the DMA Read Address, DMA Write Address, DMA Read Length, and DMA Write Length. Read and Write Lengths can be up to (2^16)-1 DWORDs.
2. The Driver will use a Target Write to enable the DMA transaction by turning on the DMA Write Enable Bit and/or the DMA Read Enable Bit (and possibly the DMA Read Mode bits).
3. The DMA Controller will then drive the PCI bus connected to the QL5032 to execute the DMA Read or DMA Write transactions, depending on whether burst data is ready to be sent or received. DMA Read is given priority if both are ready.
4. DMA continues by executing PCI bursts of up to 255 DWORDs at a time, until both read and write transactions are completed.
5. The DMA Controller sets the DMA Write Enable and DMA Read Enable Bits to 0 when the DMA has completed, or an error has occurred (such as a Target Abort or Target Time Out). If an error has occurred, the DMA Controller sets the DMA Read or Write Error bits appropriately.
6. The Driver polls the QL5032 with Target Read transactions to check the DMA status.
7. When it sees the DMA Enable bit(s) set to zero, it recognizes the end of the DMA transaction.
8. When needed, the PCI Host (Driver) sets up the next DMA transaction.

Description of Inputs and Outputs


The inputs and outputs of the DMA Controller are described below. These signals are present in the DMACNTRL.V (Verilog) and DMACNTRL.VHD (VHDL) files.


Inputs
Name                 Description
PCI_reset            Asynchronous Active High Reset. Buffered version of PCI Reset
PCI_clk              PCI Clock Signal
Usr_CBE[3:0]         Byte Enables for DataIn[31:0]
Usr_Ad[8:2]          PCI Address for DMA Register Writes
Usr_WrData[31:0]     PCI Data for DMA Register Writes
Mst_Last_Cycle       Signals Last Cycle of a PCI Master Burst is in progress
Mst_Tabort_Det       Target Abort Detected on a Master Transaction (Error Condition)
Mst_Xfer_D1          Master has successfully transferred a DWORD to the Target
Mst_TTO_Det          Target did not respond to Master Transaction (Error Condition)
Usr_Write            Signals valid data on Usr_WrData for a DMA Register Write
BusMstEn             Bus Mastering has been enabled in the Configuration Space
RdRdy                Ready for a Read Burst (could be Read FIFO empty)
WrRdy                Ready for a Write Burst (could be Write FIFO full)
LastWr               The last DWORD has been passed for a Write (Burst must end)
Mst_WrBurst_Done     Indicates that the PCI Write transaction has completed
Mst_RdBurst_Done     Indicates that the PCI Read transaction has completed
Mst_WrData_Rdy       Indicates that new data is being loaded for PCI Write

Outputs
Name                 Description
Mst_RdAd[31:0]       Current Address for DMA Reads
Mst_WrAd[31:0]       Current Address for DMA Writes
Usr_RdData[31:0]     DMA Register output for PCI Target Reads
Mst_RdCmd[1:0]       Specifies type of read command to use on PCI Master Read Cycles
Mst_WrData_Valid     Ready to Write new Data
DMA_WrEn             DMA Write Enable Register and Bus Master Enabled
DMA_RdEn             DMA Read Enable Register and Bus Master Enabled
LocalEn              DMA Register Bit for Local Enable (not used in the controller)
LatCntEn             Register used to drive the Latency counter enable on the PCI core
Mst_WrMode           DMA Write is in progress
Mst_RdMode           DMA Read is in progress
Mst_Burst_Req        DMA Controller Requests the PCI bus
Mst_One_Read         One Read Remains in the DMA Read Operation
Mst_Two_Reads        Two Reads Remain in the DMA Read Operation

Description of Registers
The DMA Reference Design contains many registers that have different functions. In order to aid in the understanding of the DMA Controller, and to allow edits to be made to change functionality, a detailed description of the registers and their purposes is given in Table 1.

Table 1

Register Name: RdCnt[7:0]
Description: Burst Read DWORD Counter for PCI Bursts.
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: WrCnt[7:0]
Description: Burst Write DWORD Counter for PCI Bursts.
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: DMAWrCnt[15:0]
Description: Current DWORD Counts remaining in the DMA Write Transaction. Loaded by Driver to initial count, and decremented when new Write Data is written to the PCI core (Mst_WrData_Valid is high).
Default PCI Target Memory Mapped Address: 0x100 bits 15 to 0

Register Name: DMARdCnt[15:0]
Description: Current DWORD Counts remaining in the DMA Read Transaction. Loaded by Driver to initial count, and decremented when a Read transaction occurs on the PCI Bus.
Default PCI Target Memory Mapped Address: 0x100 bits 31 to 16

Register Name: DMAWrBase[31:18] + DMAWrAdr[17:2]
Description: Current Write Address for PCI Write Transactions. Start Address loaded as a DWORD address, but Read as a Byte Address. Start Address must be aligned to size of DMA transfer. Incremented by 1 DWORD when new write data is written to the PCI core (Mst_WrData_Valid).
Default PCI Target Memory Mapped Address: 0x104

Register Name: DMARdBase[31:18] + DMARdAdr[17:2]
Description: Current Read Address for PCI Read Transactions. Start Address loaded as a DWORD address, but Read as a Byte Address. Start Address must be aligned to the size of the DMA transfer. Incremented by 1 DWORD when each PCI read transaction occurs (Mst_Xfer_D1).
Default PCI Target Memory Mapped Address: 0x108

Register Name: RdBCnt[7:0] (combinatorial value, not a register)
Description: Holds the next value to be loaded into RdCnt. Lower of the maximum burst length (Burst_Length) or remaining Reads in DMA (DMARdCnt).
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: WrBCnt[7:0] (combinatorial value, not a register)
Description: Holds the next value to be loaded into WrCnt. Lower of the maximum burst length (Burst_Length) or remaining Writes in DMA (DMAWrCnt).
Default PCI Target Memory Mapped Address: Not Applicable

Detailed Description of the DMACNTRL.V file (Verilog)


To follow along with this description, load the DMACNTRL.V file into the Turbo Writer editor. If you cannot find this file in your project directory, go to the chapter titled "Setting up a QL5032 Project". This description applies to version 1.0 of this file. You can check the version number at the top of the file. For a description of a later version of DMACNTRL.V, consult the latest Users Guide, or the notes in the comments located at the top of the file.
`timescale 1ns/1ns
// dmacntrl module: DMA Controller Reference Design
// Version 1.0
// See the QL5032 Users Guide for a detailed description
// of functionality.
// Copyright 1999: QuickLogic Corporation
// Last Edited 2/10/99 - QuickLogic Design Center

This first section starts with a timescale directive for the Simulator, and contains Verilog comments for the description of this file, as well as version information.
module dmacntrl (
    PCI_reset,          // PCI Reset, active high & asynchronous
    PCI_clk,            // PCI Clk (33 MHz)
    Usr_CBE,            // Registered PCI CBE signals [3:0]
    Usr_Ad,             // PCI Target Address [8:2]
    Usr_WrData,         // Registered PCI Data Signals [31:0]
    Usr_RdDataIn,       // Data for User Reads not decoded within this block [31:0]
    Mst_Last_Cycle,     // High only on the last data phase of a master transfer
    Mst_Tabort_Det,     // Target aborting transfer this cycle
    Mst_Xfer_D1,        // Delayed XFER Detected on PCI
    Mst_TTO_Det,        // Target did not assert DEVSEL in time
    Usr_Write,          // Write Data on PCI pins addressed to DMA Registers
    BusMstEn,           // PCI Config Command Bit 2 (bus mastering enabled)
    RdRdy,              // Read FIFO has room for data from PCI
    WrRdy,              // Write FIFO is ready to send data to PCI
    LastWr,             // End of Packet Signal from FIFO
    Mst_WrBurst_Done,   // Write Pipeline is clear (including output register)
    Mst_RdBurst_Done,   // Read Pipeline is clear
    Mst_WrData_Rdy,     // Write Pipeline is ready for new data
    Mst_RdAd,           // For Target Reads of DMA registers [31:0]
    Mst_WrAd,           // For Target Reads of DMA registers [31:0]
    Usr_RdData,         // For Target Reads of DMA registers [31:0]
    Mst_RdCmd,          // Specified the PCI Read Command to Use [1:0]
    Mst_WrData_Valid,   // In active write state (not flushing pipeline)
    DMAWrEn,            // Software controlled enable for the write FIFO
    DMARdEn,            // Software controlled enable for the read FIFO
    LocalEn,            // Software controlled enable for back-end target accesses
    Mst_WrMode,         // DMA Burst State Machine is in a Write (PCI Write) state
    Mst_RdMode,         // DMA Burst State Machine is in a Read (PCI Read) state
    Mst_Burst_Req,      // Tells master to assert REQN
    Mst_One_Read,       // one read remains in burst
    Mst_Two_Reads,      // two reads remain in burst
    MstRdAd_Sel,        // Address for Master Read Address has been selected
    MstWrAd_Sel,        // Address for Master Write Address has been selected
    Mst_LatCntEn        // Enables Latency Counter for Master Transactions
);

input         PCI_reset, PCI_clk, Mst_Xfer_D1, Mst_Last_Cycle;
input         BusMstEn, Usr_Write;
input  [8:2]  Usr_Ad;
input  [3:0]  Usr_CBE;
input  [31:0] Usr_WrData, Usr_RdDataIn;
input         Mst_Tabort_Det, Mst_TTO_Det, RdRdy, WrRdy;
input         LastWr, Mst_WrBurst_Done, Mst_RdBurst_Done, Mst_WrData_Rdy;

output        Mst_WrMode, Mst_RdMode, Mst_Burst_Req, Mst_WrData_Valid;
output [31:0] Mst_RdAd, Mst_WrAd, Usr_RdData;
output        DMAWrEn, DMARdEn, LocalEn, Mst_LatCntEn;
output        Mst_One_Read, Mst_Two_Reads, MstRdAd_Sel, MstWrAd_Sel;
output [1:0]  Mst_RdCmd;

This section contains the standard Verilog module declaration with a list of ports, followed by the port directions. Notice the way that bus ports are declared. Also notice that the module name matches the filename. This is a recommended practice when using QuickWorks.
parameter Burst_Length = 255; // set to desired PCI burst length (0-255)

// DMA Burst State Machine (one-hot)
parameter idle      = 5'b00001;
parameter dma_wr    = 5'b00010;
parameter dma_wr_wt = 5'b00100;
parameter dma_rd    = 5'b01000;
parameter dma_rd_wt = 5'b10000;

parameter idle_bt      = 0;
parameter dma_wr_bt    = 1;
parameter dma_wr_wt_bt = 2;
parameter dma_rd_bt    = 3;
parameter dma_rd_wt_bt = 4;

reg [4:0] DMASm, NxtDMASm;

This next section of parameters is the setup for the DMA Burst State Machine, which is located and described later in this file. The Burst_Length parameter may be set from 0 to the maximum value of the Burst_Cnt counter (255). It controls how many 0-wait state transfers make up a PCI burst while executing the DMA operation. The DMA Burst State Machine has five states. The parameters declared here set up five parameters to be used to set the next state, and five parameters used to check which state the state machine is in. Two registers are declared: DMASm and NxtDMASm. However, DMASm is a hardware register (made from flip-flops), while NxtDMASm is just a reg in Verilog terms, and is used to prepare the next state of the state machine. The DMA state machine is set up as a one-hot state machine (1 bit high per state). So in the idle state, only bit 0 is high.
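The two sets of parameters can be easier to follow in a reduced example. The sketch below is not part of the reference design: the module onehot_demo and its go, done, and busy signals are hypothetical, and it uses only two states. It shows the same convention of one parameter per state value and one parameter per state bit position.

```verilog
// Illustrative two-state one-hot machine (not part of DMACNTRL.V).
module onehot_demo (clk, reset, go, done, busy);
  input  clk, reset, go, done;
  output busy;

  // State values, used when assigning the next state
  parameter s_idle = 2'b01;
  parameter s_run  = 2'b10;
  // Bit positions, used when testing the current state
  parameter s_idle_bt = 0;
  parameter s_run_bt  = 1;

  reg [1:0] Sm, NxtSm;   // Sm becomes flip-flops; NxtSm is combinational

  always @(Sm or go or done) begin
    NxtSm = s_idle;                              // default assignment
    if (Sm[s_idle_bt]) NxtSm = go   ? s_run  : s_idle;
    if (Sm[s_run_bt])  NxtSm = done ? s_idle : s_run;
  end

  always @(posedge clk or posedge reset)
    if (reset) Sm <= s_idle;
    else       Sm <= NxtSm;

  assign busy = Sm[s_run_bt];   // decoding a state costs a single bit test
endmodule
```

Testing a single bit such as Sm[s_run_bt] is what makes one-hot decoding inexpensive, at the cost of one flip-flop per state.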
wire         PCI_reset, PCI_clk;
wire [8:2]   Usr_Ad;
wire [3:0]   Usr_CBE;
wire [31:0]  Usr_WrData, Mst_RdAd, Mst_WrAd, DMACntReg, DMACtrlStat;
wire [15:0]  DMARdCnt, DMAWrCnt;
wire [7:0]   Burst_Cnt, RdBCnt, WrBCnt, Init_BCnt;
wire         BCnt_eq_1, BCnt_eq_2, BCnt_eq_3, Mst_One_Read, Mst_Two_Reads;
wire         Usr_Write, PCIWrEn, PCIRdEn, Mst_RdMode;
wire         RdRdy, WrRdy, LastWr, Ld_BCnt, LdWrCnt, LdRdCnt;
wire         LdRdAdrCnt, LdWrAdrCnt, DecDMARdCnt, Dec_BCnt, IncWrAdr, IncRdAdr;

reg  [31:0]  Usr_RdData;
reg  [29:16] DMARdBase, DMAWrBase;
reg  [1:0]   Mst_RdCmd;
reg          Mst_WrData_Valid, MstRdAd_Sel, MstWrAd_Sel, WrCnt0, RdCnt0;
reg          DMAWrEn, DMARdEn, DMAWrErr, DMARdErr, LocalEn, Mst_LatCntEn;
reg          WrCtrl, WrRdAdr, WrWrAdr, WrDMACnt, Mst_WrMode, Mst_Burst_Req, DecDMAWrCnt;

This next section only consists of required wire and reg declarations according to the Verilog language syntax. All signals used in the file need to be declared as a reg or as a wire. See a Verilog reference book if you want to know more about these kinds of declarations.
// DMA Register Read Addressing
always @(Usr_Ad or DMACtrlStat or DMACntReg or Usr_RdDataIn)
    casex (Usr_Ad)
        7'b1xxx000: Usr_RdData <= DMACntReg;    // address 0x100
        7'b1xxx011: Usr_RdData <= DMACtrlStat;  // address 0x10C
        default:    Usr_RdData <= Usr_RdDataIn;
    endcase

always @(Usr_Ad) begin
    MstWrAd_Sel <= 1'b0;
    MstRdAd_Sel <= 1'b0;
    casex (Usr_Ad)
        7'b1xxx001: MstWrAd_Sel <= 1'b1;        // address 0x104
        7'b1xxx010: MstRdAd_Sel <= 1'b1;        // address 0x108
    endcase
end

The DMA Register Read Addressing section controls the outputs associated with PCI Target Reads of the DMA Registers. The first always block routes the correct register to the output (Usr_RdData) based on the current address (Usr_Ad[8:2]). If neither the Control/Status register nor the Read Count/Write Count register is addressed, the DMA block routes the Usr_RdDataIn bus to the output. The designer can therefore decode other addresses and provide data to the Usr_RdDataIn input port of the DMA block if additional Target Read addressing is required. A casex statement is used so that the x's in the specified address are treated as don't cares, which minimizes the logic. Care should be taken if you wish to add additional addresses in the 0x100-0x1FF range (i.e. the x's should be changed to 0's).

The second always block in this section sets the two outputs called MstWrAd_Sel and MstRdAd_Sel, based on the PCI Target Read Address. These outputs tell the PCI core to respond to the Target Read request with the Master Write or Master Read Address, which are present on the PCI32 input ports called Mst_WrAd and Mst_RdAd (respectively). In this reference design, these outputs are mapped to the addresses 0x104 and 0x108. For more information, see the PCI core technical description discussion of these ports.
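The effect of the don't-care bits can be checked with a small simulation. The following is a sketch for illustration only: the module casex_alias_demo and the function hits_cnt_reg are hypothetical names, and the function copies just the count-register pattern from the listing.

```verilog
// Illustrative simulation-only check of the partial address decode.
module casex_alias_demo;
  // Returns 1 when the DWORD address selects the count register,
  // using the same casex pattern as the reference design.
  function hits_cnt_reg;
    input [8:2] ad;
    begin
      casex (ad)
        7'b1xxx000: hits_cnt_reg = 1'b1;
        default:    hits_cnt_reg = 1'b0;
      endcase
    end
  endfunction

  reg [8:0] byte_ad;   // byte address within the 0x000-0x1FF window

  initial begin
    byte_ad = 9'h100; $display("%h -> %b", byte_ad, hits_cnt_reg(byte_ad[8:2]));
    byte_ad = 9'h120; $display("%h -> %b", byte_ad, hits_cnt_reg(byte_ad[8:2]));
    byte_ad = 9'h104; $display("%h -> %b", byte_ad, hits_cnt_reg(byte_ad[8:2]));
  end
endmodule
```

Addresses 0x100 and 0x120 both select the count register because bits 7 to 5 are ignored, while 0x104 does not. This aliasing is why the x's need to become 0's before other registers are added in the same range.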
// DMA Register Write Addressing
always @(posedge PCI_clk or posedge PCI_reset) begin
    if (PCI_reset) begin
        WrCtrl   <= 0;
        WrRdAdr  <= 0;
        WrWrAdr  <= 0;
        WrDMACnt <= 0;
    end else begin
        WrCtrl   <= 0;
        WrRdAdr  <= 0;
        WrWrAdr  <= 0;
        WrDMACnt <= 0;
        if (Usr_Write)
            casex (Usr_Ad)
                7'b1xxx011: WrCtrl   <= 1;  // address 0x10C
                7'b1xxx010: WrRdAdr  <= 1;  // address 0x108
                7'b1xxx001: WrWrAdr  <= 1;  // address 0x104
                7'b1xxx000: WrDMACnt <= 1;  // address 0x100
            endcase
    end
end

This always block, titled DMA Register Write Addressing, is used to create the internal control signals that will be used to write new data to the DMA Registers. WrCtrl is used to load the Control/Status Register, WrRdAdr is used to load a new Read Start Address, WrWrAdr is used to load a new Write Start Address, and WrDMACnt is used to load new DMA Read and Write Counts. A casex statement is used so that the x's in the specified address are treated as don't cares, which minimizes the logic. The write/load control signals created by this block are registered, since they will have a higher fanout.
// Load the upper (static) portion of the Write Address
always @(posedge PCI_clk or posedge PCI_reset)
    if (PCI_reset)
        DMAWrBase[29:16] <= 18'b0;
    else begin
        if (WrWrAdr && !Usr_CBE[3]) DMAWrBase[29:24] <= Usr_WrData[29:24];
        if (WrWrAdr && !Usr_CBE[2]) DMAWrBase[23:16] <= Usr_WrData[23:16];
    end

// Load the Upper (Static) portion of the Read Address
always @(posedge PCI_clk or posedge PCI_reset)
    if (PCI_reset)
        DMARdBase[29:16] <= 18'b0;
    else begin
        if (WrRdAdr && !Usr_CBE[3]) DMARdBase[29:24] <= Usr_WrData[29:24];
        if (WrRdAdr && !Usr_CBE[2]) DMARdBase[23:16] <= Usr_WrData[23:16];
    end

This area of the DMA Controller is responsible for loading new values into the Start Address Registers (both Read and Write). Also, the lower portion of the Read and Write Address Registers are also counters, which count up by 4 each time data is transferred on the PCI bus (32-bits of data = 4 bytes). The first always block loads the upper bits of the Write Address, which will not change during the DMA Write operation. These are loaded into a Register called DMAWrBase, which is later mapped to the Mst_WrAd output port. The second always block does the exact same function, but for the Read Address. See how the correct byte enables (Usr_CBE) are checked before writing the base addresses.
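The byte-enable check generalizes to any register written from the PCI target interface. The sketch below is a generic illustration, not code from DMACNTRL.V: the module be_reg32 and its port names are hypothetical. It assumes active-low byte enables, which is consistent with the !Usr_CBE tests in the listing.

```verilog
// Illustrative 32-bit register with one active-low byte enable per lane.
module be_reg32 (clk, reset, wr, cbe_n, d, q);
  input         clk, reset, wr;
  input  [3:0]  cbe_n;   // active-low byte enables, bit 0 = lowest byte
  input  [31:0] d;
  output [31:0] q;
  reg    [31:0] q;

  always @(posedge clk or posedge reset)
    if (reset)
      q <= 32'h0;
    else if (wr) begin
      // Each byte lane is updated only when its enable is low
      if (!cbe_n[0]) q[7:0]   <= d[7:0];
      if (!cbe_n[1]) q[15:8]  <= d[15:8];
      if (!cbe_n[2]) q[23:16] <= d[23:16];
      if (!cbe_n[3]) q[31:24] <= d[31:24];
    end
endmodule
```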
// Instantiate address incrementers
assign IncRdAdr   = Mst_Xfer_D1 & Mst_RdMode;
assign IncWrAdr   = Mst_Xfer_D1 & Mst_WrMode;
assign LdRdAdrCnt = (WrRdAdr & !Usr_CBE[0] & !Usr_CBE[1]);
assign LdWrAdrCnt = (WrWrAdr & !Usr_CBE[0] & !Usr_CBE[1]);

ucount16 RdAdrReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(IncRdAdr),
                   .LOAD(LdRdAdrCnt), .D(Usr_WrData[15:0]), .Q(Mst_RdAd[17:2]));
ucount16 WrAdrReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(IncWrAdr),
                   .LOAD(LdWrAdrCnt), .D(Usr_WrData[15:0]), .Q(Mst_WrAd[17:2]));

assign Mst_RdAd[31:18] = DMARdBase[29:16];
assign Mst_RdAd[1:0]   = 2'h0;
assign Mst_WrAd[31:18] = DMAWrBase[29:16];
assign Mst_WrAd[1:0]   = 2'h0;

The first four assign statements set up the control signals for loading and incrementing the Read and Write Address Counters. The IncRdAdr signal enables the Read Address Counter to increment. This signal goes active when data transfers on the PCI bus during Master Mode, while the DMA Controller is in a read state. The IncWrAdr signal performs the same function with the Write Address Counter. The LdRdAdrCnt and LdWrAdrCnt signals are used to load the Read and Write Address Counters. They are generated from the WrRdAdr and WrWrAdr signals explained earlier, qualified with the appropriate byte enables.

The next two ucount16 instantiations are the read and write address counters. The PCI data bits 15 to 0 (Usr_WrData[15:0]) are mapped to the Address Bits 17 to 2 (Mst_WrAd[17:2] or Mst_RdAd[17:2]). This is because the Read and Write Addresses are loaded as DWORD addresses, but read as byte addresses. Since the lowest bit of the counters is bit 2 of the output addresses, the addresses increment by 4 on each count.

The final assign statements in this section map the base addresses (DMARdBase and DMAWrBase) to the upper bits of the read and write address busses. Also, the lower two bits of the Read and Write Addresses are assigned to be always 0, since all the PCI transfers occur on DWORD address boundaries, although the address is represented as a byte address.
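The DWORD-to-byte address mapping amounts to a left shift by two bits. The sketch below is for illustration only: the module dword_addr_demo and the function byte_addr are hypothetical names. It reproduces the bit mapping used in the listing, where written bits 29 to 16 go to the base register and bits 15 to 0 go to the counter.

```verilog
// Illustrative simulation-only check of the DWORD-to-byte address mapping.
module dword_addr_demo;
  // Byte address that appears on Mst_WrAd or Mst_RdAd for a given value
  // written to the address register: bits [29:0] shifted left by two.
  function [31:0] byte_addr;
    input [31:0] written;
    begin
      byte_addr = {written[29:0], 2'b00};
    end
  endfunction

  initial begin
    $display("%h", byte_addr(32'h0000_0041));  // prints 00000104
    $display("%h", byte_addr(32'h0001_0000));  // prints 00040000
  end
endmodule
```

The second value sets the lowest bit held in the base register, which lands on address bit 18.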
// Instantiate Down Counters for DMA Read and Write Count
assign LdWrCnt     = WrDMACnt & !Usr_CBE[0] & !Usr_CBE[1];
assign LdRdCnt     = WrDMACnt & !Usr_CBE[2] & !Usr_CBE[3];
assign DecDMARdCnt = IncRdAdr;

always @(posedge PCI_clk)
    DecDMAWrCnt <= Mst_WrData_Rdy;

dcount16 RdCntReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(DecDMARdCnt),
                   .LOAD(LdRdCnt), .D(Usr_WrData[31:16]), .Q(DMARdCnt[15:0]));
dcount16 WrCntReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(DecDMAWrCnt),
                   .LOAD(LdWrCnt), .D(Usr_WrData[15:0]), .Q(DMAWrCnt[15:0]));

assign DMACntReg = {DMARdCnt, DMAWrCnt};

This section deals with the loading and decrementing of the DMA Count Registers. They are loaded with Target Writes to address 0x100. The upper 16 bits are loaded into the Read Counter and the lower 16 bits are loaded into the Write Counter. LdRdCnt and LdWrCnt handle the loading of these two counters. The appropriate byte enables must be active when writing to these registers, as you can see in the first two assign statements.

The third assign statement and the following always block create the decrement signals for the two counters. The Read Counter is decremented with the same signal used to increment the Read Address Counter, IncRdAdr. The Write Counter, however, is decremented differently. In the case of the Write Address Counter, the increment is handled by waiting for transactions to occur on the PCI bus. This is because the address must always be kept current to the PCI bus transaction. However, for writes, the data is considered committed once it is sent to the PCI core from the back end. Therefore, the decrement for the write counter uses the signal from the PCI core that indicates it requires new valid data on the current clock cycle: Mst_WrData_Rdy. See the description of the PCI32 module for more information on this signal. Two 16-bit down counters are instantiated for the Read Counter and Write Counter respectively.
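The value a driver writes to address 0x100 is the read length in the upper half and the write length in the lower half. The sketch below illustrates the packing only: the module dma_cnt_word_demo and the function dma_cnt_word are hypothetical names, and the lengths 600 and 16 are arbitrary example values.

```verilog
// Illustrative simulation-only packing of the DMA count register value.
module dma_cnt_word_demo;
  function [31:0] dma_cnt_word;
    input [15:0] rd_len;  // DWORDs to read, bits 31:16
    input [15:0] wr_len;  // DWORDs to write, bits 15:0
    begin
      dma_cnt_word = {rd_len, wr_len};
    end
  endfunction

  initial
    $display("%h", dma_cnt_word(16'd600, 16'd16));  // prints 02580010
endmodule
```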
// DMA Control Status Register Write
always @(posedge PCI_clk or posedge PCI_reset) begin
    if (PCI_reset) begin
        DMAWrEn      <= 0;
        DMARdEn      <= 0;
        DMAWrErr     <= 0;
        DMARdErr     <= 0;
        LocalEn      <= 0;
        Mst_LatCntEn <= 0;
        Mst_RdCmd    <= 2'b01; // 01 = Memory Read
    end else begin
        // default values
        DMAWrErr     <= DMAWrErr;
        DMARdErr     <= DMARdErr;
        DMAWrEn      <= DMAWrEn;
        DMARdEn      <= DMARdEn;
        LocalEn      <= LocalEn;
        Mst_LatCntEn <= Mst_LatCntEn;
        Mst_RdCmd    <= Mst_RdCmd;
        if (WrCtrl && !Usr_CBE[3]) begin
            Mst_LatCntEn <= Usr_WrData[31];
            LocalEn      <= Usr_WrData[30];
            Mst_RdCmd    <= Usr_WrData[27:26];
            DMARdErr     <= Usr_WrData[25];
            DMARdEn      <= Usr_WrData[24];
        end
        if (WrCtrl && !Usr_CBE[1]) begin
            DMAWrErr <= Usr_WrData[9];
            DMAWrEn  <= Usr_WrData[8];
        end
        if ((Mst_Tabort_Det || Mst_TTO_Det) && Mst_RdMode) begin
            DMARdEn  <= 0;
            DMARdErr <= 1;
        end
        if (RdCnt0 && DMARdEn == 1)
            DMARdEn <= 0;
        if ((Mst_Tabort_Det || Mst_TTO_Det) && Mst_WrMode) begin
            DMAWrEn  <= 0;
            DMAWrErr <= 1;
        end
        if (WrCnt0 && DMAWrEn == 1 && Mst_WrBurst_Done)
            DMAWrEn <= 0;
    end
end

assign DMACtrlStat = {Mst_LatCntEn, LocalEn, 2'h0, Mst_RdCmd[1:0], DMARdErr, DMARdEn,
                      8'h00,
                      4'h0, 2'h0, DMAWrErr, DMAWrEn,
                      8'h00};

This always block sets and initializes each bit in the Control/Status DMA Register. The bits included in this register are:

Register          Bits     Function
LocalEn           30       Unused in DMA Controller. Can be used as a local chip enable.
Mst_LatCntEn      31       Use to Enable or Disable the Master Latency Counter. PCI Compliance requires this bit to be enabled, but embedded Systems may clear this bit for better PCI performance. See the PCI32 block description for more information.
Mst_RdCmd[1:0]    27:26    Used to select which PCI Read command is used in Read DMA. 0x = Memory Read, 10 = Memory Read Line, 11 = Memory Read Multiple. See the PCI32 block description for more information.
DMARdErr          25       Set when a Read DMA is interrupted by a Target Abort, or Target Time Out. Cleared by writing a 0 to bit 25.
DMARdEn           24       Write a 1 to this bit to begin a Read DMA. It is reset when the DMA completes or an error occurs.
DMAWrErr          9        Set when a Write DMA is interrupted by a Target Abort, or Target Time Out. Cleared by writing a 0 to bit 9.
DMAWrEn           8        Write a 1 to this bit to begin a Write DMA. It is reset when the DMA completes or an error occurs.

The final assign statement merges these bits into the DMA Control/Status Register.
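The bit positions in the Control/Status Register can be exercised by building the value a driver would write to address 0x10C. The sketch below is an illustration only: the module dma_ctrl_word_demo and the function dma_ctrl_word are hypothetical names. The bit positions come from the always block above, and both error bits are written as 0, which clears them.

```verilog
// Illustrative simulation-only packing of a Control/Status write value.
module dma_ctrl_word_demo;
  function [31:0] dma_ctrl_word;
    input       lat_cnt_en;  // bit 31
    input       local_en;    // bit 30
    input [1:0] rd_cmd;      // bits 27:26
    input       rd_en;       // bit 24
    input       wr_en;       // bit 8
    begin
      dma_ctrl_word = {lat_cnt_en, local_en, 2'b00, rd_cmd, 1'b0, rd_en,  // 31:24
                       8'h00,                                             // 23:16
                       6'h00, 1'b0, wr_en,                                // 15:8
                       8'h00};                                            // 7:0
    end
  endfunction

  initial
    // Latency counter on, Memory Read Line, read and write DMA enabled
    $display("%h", dma_ctrl_word(1'b1, 1'b0, 2'b10, 1'b1, 1'b1));  // prints 89000100
endmodule
```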
// Create read/write enable control for state machine
assign PCIWrEn = DMAWrEn && BusMstEn;
assign PCIRdEn = DMARdEn && BusMstEn;

// Current Burst Count Decodes
assign BCnt_eq_3 = (Burst_Cnt == 3);
assign BCnt_eq_2 = (Burst_Cnt == 2);
assign BCnt_eq_1 = (Burst_Cnt == 1);

This section of the DMA Controller marks the boundary from the DMA register setup to the DMA Burst State Machine. The first two assignments set up the main enable signals, PCIWrEn and PCIRdEn, which are used in the DMA Burst State Machine. These equations AND together the DMA Control Register bits which enable the DMA transfers, DMAWrEn and DMARdEn, with the PCI Configuration Space parameter which enables Master Transfers, BusMstEn. The next three assign statements are decodes from the Burst Counter. These decodes are used within the DMA Burst State Machine and for creating outputs. The Burst Counter keeps track of how many transfers remain in the current DMA Burst, which is limited in length by the parameter Burst_Length.
// Check for end of DMA
assign RdBCnt = (DMARdCnt > Burst_Length) ? Burst_Length : DMARdCnt[7:0];
assign WrBCnt = (DMAWrCnt > Burst_Length) ? Burst_Length : DMAWrCnt[7:0];

always @(posedge PCI_clk) begin
    if (RdBCnt == 0) RdCnt0 = 1;
    else             RdCnt0 = 0;
    if (WrBCnt == 0) WrCnt0 = 1;
    else             WrCnt0 = 0;
end

The first two assign statements create the next value to be loaded into the Burst Counter. It is separated into a Read Burst Count (RdBCnt) and a Write Burst Count (WrBCnt). If the current DMA Counter is larger than the Burst_Length parameter, then the Burst_Length parameter is used. Otherwise, the Burst Count is loaded from the remaining value in the DMA Counter. The always block creates two signals, RdCnt0 and WrCnt0, which go active to indicate that the Read and Write DMA transfers are complete.
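A short simulation shows how a long DMA is divided into bursts by this selection. The sketch below is illustrative only: the module burst_split_demo is a hypothetical testbench, the length of 600 DWORDs is an arbitrary example, and each burst is assumed to run to completion without being interrupted by the target.

```verilog
// Illustrative simulation-only view of how a DMA length splits into bursts.
module burst_split_demo;
  parameter Burst_Length = 255;
  reg  [15:0] DMARdCnt;
  wire [7:0]  RdBCnt;

  // Same selection as the reference design: the lower of the two values
  assign RdBCnt = (DMARdCnt > Burst_Length) ? Burst_Length : DMARdCnt[7:0];

  initial begin
    DMARdCnt = 16'd600;
    while (DMARdCnt != 0) begin
      #1 $display("remaining=%0d burst=%0d", DMARdCnt, RdBCnt);
      DMARdCnt = DMARdCnt - RdBCnt;   // assume the whole burst completes
    end
  end
endmodule
```

With these values the bursts are 255, 255, and 90 DWORDs.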
// DMA State Machine
always @ (DMASm or WrRdy or PCIWrEn or BCnt_eq_1 or BCnt_eq_2 or LastWr or
          RdRdy or PCIRdEn or WrCnt0 or RdCnt0 or Mst_RdBurst_Done or
          Mst_Last_Cycle or DMAWrErr or DMARdErr or Mst_WrBurst_Done or
          Mst_Xfer_D1)
begin : StateEqns
    // default values to prevent loops
    NxtDMASm = idle;
    Mst_Burst_Req <= 0;
    Mst_WrData_Valid <= 0;

    if (DMASm[idle_bt]) begin
        // Back end ready, software enabled, and >0 transfers remain
        if (WrRdy && PCIWrEn && !WrCnt0)
            NxtDMASm = dma_wr;
        else if (RdRdy && PCIRdEn && !RdCnt0)
            NxtDMASm = dma_rd;
        else
            NxtDMASm = idle;
    end

    if (DMASm[dma_rd_bt]) begin
        if ((BCnt_eq_1 || (BCnt_eq_2 && Mst_Xfer_D1)) && Mst_Last_Cycle)
            // if 1 left, and we are in the last transfer cycle
            NxtDMASm = dma_rd_wt; // go to tx wait state to wait for last transfer
        else if (DMARdErr)
            NxtDMASm = idle;
        else
            NxtDMASm = dma_rd;
        Mst_Burst_Req <= 1'b1;
    end

    if (DMASm[dma_rd_wt_bt]) begin
        // wait for the read pipeline to clear
        if (Mst_RdBurst_Done || DMARdErr)
            NxtDMASm = idle;
        else
            NxtDMASm = dma_rd_wt;
    end

    if (DMASm[dma_wr_bt]) begin
        if (BCnt_eq_1 || LastWr)
            NxtDMASm = dma_wr_wt;
        else if (DMAWrErr)
            NxtDMASm = idle;
        else
            NxtDMASm = dma_wr;
        Mst_Burst_Req <= 1'b1;
        Mst_WrData_Valid <= 1'b1;
    end

    if (DMASm[dma_wr_wt_bt]) begin
        if (!DMAWrErr && !Mst_WrBurst_Done) begin
            NxtDMASm = dma_wr_wt;
            Mst_Burst_Req <= 1;
        end else begin
            NxtDMASm = idle;
            Mst_Burst_Req <= 0;
        end
    end
end

// State registers
always @(posedge PCI_clk or posedge PCI_reset)
    if (PCI_reset)
        DMASm <= idle;
    else
        DMASm <= NxtDMASm;

This is the DMA Burst State Machine. The first always block determines the value of the next state (NxtDMASm), based on the value of the current state (DMASm). The second always block transfers the next state value to the current state on the rising edge of the PCI clock. The DMA Burst State Machine consists of 5 states. The simplified state diagram is shown below.
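[Figure: simplified state diagram of the DMA Burst State Machine, with states idle, wr, wr_wt, rd, and rd_wt. Transition labels:]
idle -> wr: WrRdy and PCIWrEn and not(WrCnt0)
idle -> rd: RdRdy and PCIRdEn and not(RdCnt0)
wr -> wr_wt: BCnt_eq_1 or LastWr
wr -> idle: DMAWrErr
wr_wt -> idle: DMAWrErr or Mst_WrBurst_Done
rd -> rd_wt: (BCnt_eq_1 or (BCnt_eq_2 and Mst_Xfer_D1)) and Mst_Last_Cycle
rd -> idle: DMARdErr
rd_wt -> idle: DMARdErr or Mst_RdBurst_Done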


IDLE state: This is the default state, and the state to which the state machine initializes when PCI_reset is asserted. If the PCI Driver has enabled Master DMA Read transfers (PCIRdEn), the local design is ready to receive read data (RdRdy), and the DMA Read Counter is not zero (~RdCnt0), then the state machine moves to the READ state. If these conditions are false, but the equivalent conditions hold for a Write DMA transfer, then the state machine moves to the WRITE state. Read DMA therefore has priority. In some applications, it may be preferable for Write DMA to have priority. Changing the priority simply involves reversing the order of the if statements within this state.

READ state: In this state, the Mst_Burst_Req signal is asserted, which indicates to the PCI core that a PCI burst is requested (causes REQ# to be asserted on the PCI bus). If a DMA Read Error (DMARdErr) is detected, then the state machine moves into the IDLE state. If there is one read transfer left and the PCI core signals that the last transfer cycle has started (Mst_Last_Cycle), then the state machine moves into the READ_WAIT state. To detect that one read remains, the state machine looks at BCnt_eq_1 (a decode of the Burst Counter which indicates that one transfer remains), or BCnt_eq_2 AND Mst_Xfer_D1. The second term (BCnt_eq_2 && Mst_Xfer_D1) is needed because, when the PCI Core is bursting read transfers at zero wait states, the Burst Counter is always one cycle behind the PCI bus.

READ_WAIT state: In this state, the state machine waits for one of two events to occur. If a DMA Read Error (DMARdErr) occurs, then the state machine moves back to the IDLE state. Otherwise, the state machine waits until the Mst_RdBurst_Done signal goes active, which indicates that the last read data for the PCI burst has been read from the PCI bus and has been passed to the back end. This frees up the PCI core for a new DMA operation, so the state machine moves into the IDLE state.

WRITE state: In this state, the Mst_Burst_Req signal is asserted, which indicates to the PCI core that a PCI burst is requested (causes REQ# to be asserted on the PCI bus). The Mst_WrData_Valid signal is also asserted while in this state. This tells the PCI core that valid write data is now ready to be transferred to the PCI Core. If a DMA Write Error (DMAWrErr) is detected, then the state machine moves into the IDLE state. If there is one write transfer left in the Burst Counter (BCnt_eq_1), or the back end indicates that the Last Write (LastWr) data is now being sent to the PCI core, then the state machine moves to the WRITE_WAIT state.

WRITE_WAIT state: If a DMA Write Error (DMAWrErr) occurs, or the PCI core indicates that the last write data has been written to the PCI bus (Mst_WrBurst_Done), then the state machine moves from this state into the IDLE state. Otherwise, the state machine waits here for the final data elements in the PCI core write pipeline to be written to the PCI bus. This allows the PCI Core to free up its internal datapath before moving to a new PCI burst transfer. One important point in this state involves the Mst_Burst_Req signal. This signal must be kept active until the Mst_WrBurst_Done signal is detected, because the PCI Target may ask for a transaction to be retried, so the DMA Controller must continue to request burst transfers on the PCI bus until the last data has been transferred. Once Mst_WrBurst_Done has been detected, the Mst_Burst_Req signal must be set to 0 on the same clock cycle. It is important not to request a new PCI burst transaction at that point, because all DMA operations may be complete.
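The state descriptions above can be condensed into a next-state sketch. This is only an illustration under stated assumptions, not the DMACNTL.V source: the module name, the one-hot bit ordering, and the write-side condition names (PCIWrEn, WrRdy, WrCnt0) are hypothetical, and the sketch uses the Verilog-2001 `always @*` form.

```verilog
// Hypothetical sketch of the DMA burst state machine described above.
// Bit ordering and the PCIWrEn/WrRdy/WrCnt0 names are assumptions.
module dma_burst_sm_sketch (PCI_clk, PCI_reset,
                            PCIRdEn, RdRdy, RdCnt0,
                            PCIWrEn, WrRdy, WrCnt0,
                            DMARdErr, DMAWrErr,
                            BCnt_eq_1, BCnt_eq_2, Mst_Xfer_D1, Mst_Last_Cycle,
                            Mst_RdBurst_Done, Mst_WrBurst_Done, LastWr,
                            Mst_Burst_Req, Mst_WrData_Valid);
    input  PCI_clk, PCI_reset;
    input  PCIRdEn, RdRdy, RdCnt0;
    input  PCIWrEn, WrRdy, WrCnt0;          // assumed write-side equivalents
    input  DMARdErr, DMAWrErr;
    input  BCnt_eq_1, BCnt_eq_2, Mst_Xfer_D1, Mst_Last_Cycle;
    input  Mst_RdBurst_Done, Mst_WrBurst_Done, LastWr;
    output Mst_Burst_Req, Mst_WrData_Valid;

    // One-hot state bit indices (assumed ordering).
    parameter idle_bt = 0, dma_rd_bt = 1, dma_rd_wt_bt = 2,
              dma_wr_bt = 3, dma_wr_wt_bt = 4;

    reg [4:0] DMASm, NxtDMASm;

    // One read remains; the second term covers zero-wait-state bursts.
    wire one_read_left = BCnt_eq_1 | (BCnt_eq_2 & Mst_Xfer_D1);

    always @* begin
        NxtDMASm = 5'b00000;
        case (1'b1)
            DMASm[idle_bt]:      // read is tested first, so read has priority
                if (PCIRdEn & RdRdy & ~RdCnt0)      NxtDMASm[dma_rd_bt] = 1'b1;
                else if (PCIWrEn & WrRdy & ~WrCnt0) NxtDMASm[dma_wr_bt] = 1'b1;
                else                                NxtDMASm[idle_bt]   = 1'b1;
            DMASm[dma_rd_bt]:
                if (DMARdErr)                            NxtDMASm[idle_bt]      = 1'b1;
                else if (one_read_left & Mst_Last_Cycle) NxtDMASm[dma_rd_wt_bt] = 1'b1;
                else                                     NxtDMASm[dma_rd_bt]    = 1'b1;
            DMASm[dma_rd_wt_bt]:
                if (DMARdErr | Mst_RdBurst_Done) NxtDMASm[idle_bt]      = 1'b1;
                else                             NxtDMASm[dma_rd_wt_bt] = 1'b1;
            DMASm[dma_wr_bt]:
                if (DMAWrErr)                NxtDMASm[idle_bt]      = 1'b1;
                else if (BCnt_eq_1 | LastWr) NxtDMASm[dma_wr_wt_bt] = 1'b1;
                else                         NxtDMASm[dma_wr_bt]    = 1'b1;
            DMASm[dma_wr_wt_bt]:
                if (DMAWrErr | Mst_WrBurst_Done) NxtDMASm[idle_bt]      = 1'b1;
                else                             NxtDMASm[dma_wr_wt_bt] = 1'b1;
            default:             NxtDMASm[idle_bt] = 1'b1;
        endcase
    end

    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset) DMASm <= 5'b00001;   // initialize to IDLE
        else           DMASm <= NxtDMASm;

    // Request is held through WRITE_WAIT and dropped on the same clock
    // that Mst_WrBurst_Done is seen. Behavior in READ_WAIT is not stated
    // in the text, so the request is left deasserted there (assumption).
    assign Mst_Burst_Req    = DMASm[dma_rd_bt] | DMASm[dma_wr_bt] |
                              (DMASm[dma_wr_wt_bt] & ~Mst_WrBurst_Done);
    assign Mst_WrData_Valid = DMASm[dma_wr_bt];
endmodule
```

Under this sketch, giving Write DMA priority amounts to swapping the first two conditions in the IDLE branch.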
// DMA Burst Counter
assign Init_BCnt = (NxtDMASm[dma_wr_bt]) ? WrBCnt : RdBCnt;
assign Ld_BCnt = DMASm[idle_bt];
assign Dec_BCnt = (DMASm[dma_rd_bt] & Mst_Xfer_D1) || Mst_WrData_Rdy;

dcount8 BRSTCNTR ( .CLK(PCI_clk), .CLR(PCI_reset), .D(Init_BCnt), .EN(Dec_BCnt), .LOAD(Ld_BCnt), .Q(Burst_Cnt));

This area of the DMACNTL.V file describes the interface with the Burst Counter, which uses the Burst_Cnt bus as its output. The loading of new Burst Counts (controlled by Ld_BCnt) happens while the DMA Controller is in the IDLE state (DMASm[idle_bt]). The Init_BCnt bus holds the data to be loaded into the Burst Counter. It is chosen between the read (RdBCnt) and write (WrBCnt) counts by looking at the next state for the DMA burst state machine (NxtDMASm[dma_wr_bt]). The Burst Counter is decremented whenever a PCI transfer occurs (Mst_Xfer_D1) while in the READ state (DMASm[dma_rd_bt]), or whenever Mst_WrData_Rdy is asserted by the PCI Core (which happens when it is ready for new write data while the DMA Controller is in a WRITE state). Since the Burst Counter is decremented in read mode with the Mst_Xfer_D1 signal (which indicates that a PCI transfer occurred on the previous clock), the Burst Counter will be one count behind on reads while a 0-wait state burst is occurring on the bus.
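The internals of the dcount8 macro are not shown in this guide. As a rough mental model only, a loadable down counter with the same port names might look like the sketch below. The 8-bit width (inferred from the name), the asynchronous clear, the priority of LOAD over EN, and the decode module are all assumptions.

```verilog
// Hypothetical stand-in for the dcount8 macro: an 8-bit loadable down counter.
// Port names follow the BRSTCNTR instantiation; everything else is assumed.
module dcount8_sketch (CLK, CLR, D, EN, LOAD, Q);
    input        CLK, CLR, EN, LOAD;
    input  [7:0] D;
    output [7:0] Q;
    reg    [7:0] Q;

    always @(posedge CLK or posedge CLR)
        if (CLR)       Q <= 8'h00;        // asynchronous clear
        else if (LOAD) Q <= D;            // load takes priority over decrement
        else if (EN)   Q <= Q - 8'h01;    // count down by one
endmodule

// Assumed decodes of the counter output used by the state machine.
module bcnt_decode_sketch (Burst_Cnt, BCnt_eq_1, BCnt_eq_2, BCnt_eq_3);
    input  [7:0] Burst_Cnt;
    output       BCnt_eq_1, BCnt_eq_2, BCnt_eq_3;

    assign BCnt_eq_1 = (Burst_Cnt == 8'h01);
    assign BCnt_eq_2 = (Burst_Cnt == 8'h02);
    assign BCnt_eq_3 = (Burst_Cnt == 8'h03);
endmodule
```

With LOAD tied to the IDLE state bit and D driven from the next-state selection, the counter would hold the correct burst count on the first clock of a READ or WRITE state.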
// DMA Burst State Machine Outputs
assign Mst_One_Read = (BCnt_eq_1 | (BCnt_eq_2 && Mst_Xfer_D1)) && DMASm[dma_rd_bt];
assign Mst_Two_Reads = (BCnt_eq_2 | (BCnt_eq_3 && Mst_Xfer_D1)) && DMASm[dma_rd_bt];
assign Mst_RdMode = DMASm[dma_rd_bt] | DMASm[dma_rd_wt_bt];

always @(posedge PCI_clk or posedge PCI_reset)
  if (PCI_reset)
    Mst_WrMode <= 1'b0;
  else
    Mst_WrMode <= NxtDMASm[dma_wr_bt] | NxtDMASm[dma_wr_wt_bt]; // Pre-decoded Mst_WrMode

endmodule

The last section of the DMACNTL module generates the outputs of the DMA Controller, which are derived from the DMA State Machine and the Burst Counter. Mst_One_Read indicates that one read remains in the DMA Burst Read operation. Mst_Two_Reads indicates that two reads remain in the DMA Burst Read operation. The PCI Core needs these signals to properly end zero-wait state burst reads. The Mst_RdMode and Mst_WrMode signals indicate to the PCI core whether the burst operation is a read or a write. The Mst_WrMode signal is pre-decoded and registered in order to improve cycle time.


Appendix C: Target Configuration Space and Address Register


The QL5032 Target Configuration Space and Address Register is a required component of any QL5032 design. It is provided as a macro because it must be customized by the user, and therefore cannot be fixed in silicon. This Appendix describes the Target Configuration Space and Address Register block provided by QuickLogic. The macro provided by QuickLogic is called CFGTADDR.

CFGTADDR is a Verilog block that has three logical sections. The first section describes the PCI configuration space. This is a set of registers that define the features of the PCI interface and provide status information. The second section is the address register/counter. It latches the PCI address at the start of a PCI transaction, and automatically increments it at the completion of each data transfer. The last section is the command decode logic. It generates the user read and user write signals, which are mapped from the various PCI commands. Each section is described below.

PCI Configuration Space


The sections of Verilog code related to the PCI configuration space will be described section by section.
// config space ports
input [3:0] CBE;
input [31:0] WrData;
input Cfg_Write, PCI_clock, MstPERR_Det, PERR_Det, TTO_Det, PCI_reset, SERR_Sig, Tabort_Det;
output [31:0] CfgData;
output [15:0] CmdReg;
output [7:0] LatTimerReg;
reg [31:0] CfgData;
wire [15:0] CmdReg;
wire [7:0] LatTimerReg;

These port and signal declarations define the input and output ports of CFGTADDR that relate to the PCI configuration space.

Signal              Description
CBE[3:0]            Byte enables, used during PCI configuration writes.
WrData[31:0]        Write data, used during PCI configuration writes.
Cfg_Write           Configuration write-enable, active during PCI configuration write transactions.
PCI_clock           PCI clock.
MstPERR_Det         Parity error detected by the QL5032 bus master. When this signal is active, bit 8 of the Status register must be set.
PERR_Det            Parity error detected. When this signal is active, bit 15 of the Status register must be set.
TTO_Det             Received Master Abort. When this signal is active, bit 13 of the Status register must be set.
PCI_reset           PCI system reset.
SERR_Sig            Signalled system error. When this signal is active, bit 14 of the Status register must be set.
Tabort_Det          Received target abort. When this signal is active, bit 12 of the Status register must be set.
CfgData[31:0]       Configuration data output, multiplexed based on the current PCI address (stored in the address register/counter, which is described in the following section). Used during configuration read transactions.
CmdReg[15:0]        Copy of the Command register in the configuration space.
LatTimerReg[7:0]    Copy of the Latency Timer in the configuration space.
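To make the Status register requirements in the signal descriptions concrete, the sketch below shows one way the five error bits could be captured. It is not taken from CFGTADDR: the module name and the StatusWE strobe are hypothetical, and the write-one-to-clear behavior follows the PCI specification's rule for Status register bits rather than anything shown in this guide. All other Status bits are tied to zero here for brevity, which a real design would not do.

```verilog
// Hypothetical sketch: sticky Status register error bits (upper half of
// configuration offset 04h). StatusWE is an assumed strobe that is active
// during a configuration write to the Status register bytes.
module status_bits_sketch (PCI_clock, PCI_reset,
                           MstPERR_Det, PERR_Det, TTO_Det, SERR_Sig, Tabort_Det,
                           StatusWE, WrData, Status);
    input         PCI_clock, PCI_reset;
    input         MstPERR_Det, PERR_Det, TTO_Det, SERR_Sig, Tabort_Det;
    input         StatusWE;
    input  [31:0] WrData;
    output [15:0] Status;

    reg r8, r12, r13, r14, r15;

    // Status bit n sits on WrData[16+n] during a write to offset 04h.
    // A detect signal sets the bit; writing 1 to the bit clears it.
    // A new detection wins over a simultaneous clear.
    always @(posedge PCI_clock or posedge PCI_reset)
        if (PCI_reset) begin
            r8  <= 1'b0;
            r12 <= 1'b0;
            r13 <= 1'b0;
            r14 <= 1'b0;
            r15 <= 1'b0;
        end else begin
            r8  <= MstPERR_Det | (r8  & ~(StatusWE & WrData[24]));
            r12 <= Tabort_Det  | (r12 & ~(StatusWE & WrData[28]));
            r13 <= TTO_Det     | (r13 & ~(StatusWE & WrData[29]));
            r14 <= SERR_Sig    | (r14 & ~(StatusWE & WrData[30]));
            r15 <= PERR_Det    | (r15 & ~(StatusWE & WrData[31]));
        end

    // Bits 15..12, then bits 11..9 and 7..0 omitted (tied to zero).
    assign Status = {r15, r14, r13, r12, 3'b000, r8, 8'h00};
endmodule
```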


The following lines of Verilog declare the internal 32-bit busses that are generated from the individual PCI registers. These are multiplexed to drive the CfgData[31:0] output port, which is used during PCI configuration read transactions.
// *** Full 32-bit wide PCI registers      offset
wire [31:0] Dev_Vend;                   // 00h
wire [31:0] Stat_Cmd;                   // 04h
wire [31:0] Class_RevID;                // 08h
wire [31:0] BIST_Hdr_Lat_Cache;         // 0Ch
wire [31:0] BAR0;                       // 10h
wire [31:0] SubsysID_SubsysVendID;      // 2Ch
wire [31:0] Lat_Gnt_IntPin_IntLine;     // 3Ch

The next section of Verilog code represents values most likely to be modified by users. These represent numbers and values that will uniquely identify the device, along with the properties and capabilities of the device. Please refer to the PCI specification for detailed information about these registers.
// *********** beginning of user-modifiable parameters ************
// PCI registers                           offset into config space
wire [15:0] DeviceID = 16'h0001;        // 00h
wire [15:0] VendorID = 16'h11E3;        // 00h
wire [23:0] ClassCode = 24'h020000;     // 08h
wire [7:0] RevisionID = 8'h01;          // 08h
wire [15:0] SubsysID = 16'h0001;        // 2Ch
wire [15:0] SubsysVendID = 16'h11E3;    // 2Ch
wire [7:0] MaxLat = 8'h05;              // 3Ch
wire [7:0] MinGnt = 8'h02;              // 3Ch
wire [7:0] IntPin = 8'h01;              // 3Ch
parameter BAR0_size = 24; // Sets the size of the requested memory space.
                          // Default value is 24, corresponding to 16MB.
                          // (# of bits to tie off in the BAR)
// *********** end of user-modifiable parameters ************
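The guide does not show how the individual fields are combined into the full 32-bit registers declared earlier. A plausible sketch, assuming the field placement defined by the PCI configuration header (upper field in the upper bits), is given below. The module name and the IntLine input are hypothetical, and the actual CFGTADDR source may differ.

```verilog
// Hypothetical sketch: packing the user-modifiable fields into the
// 32-bit configuration registers. IntLine is an assumed 8-bit register
// written by system software.
module cfg_concat_sketch (IntLine, Dev_Vend, Class_RevID,
                          SubsysID_SubsysVendID, Lat_Gnt_IntPin_IntLine);
    input  [7:0]  IntLine;
    output [31:0] Dev_Vend, Class_RevID;
    output [31:0] SubsysID_SubsysVendID, Lat_Gnt_IntPin_IntLine;

    wire [15:0] DeviceID     = 16'h0001;
    wire [15:0] VendorID     = 16'h11E3;
    wire [23:0] ClassCode    = 24'h020000;
    wire [7:0]  RevisionID   = 8'h01;
    wire [15:0] SubsysID     = 16'h0001;
    wire [15:0] SubsysVendID = 16'h11E3;
    wire [7:0]  MaxLat       = 8'h05;
    wire [7:0]  MinGnt       = 8'h02;
    wire [7:0]  IntPin       = 8'h01;

    assign Dev_Vend               = {DeviceID, VendorID};              // offset 00h
    assign Class_RevID            = {ClassCode, RevisionID};           // offset 08h
    assign SubsysID_SubsysVendID  = {SubsysID, SubsysVendID};          // offset 2Ch
    assign Lat_Gnt_IntPin_IntLine = {MaxLat, MinGnt, IntPin, IntLine}; // offset 3Ch
endmodule
```

For the BAR size parameter, tying off the low 24 bits of the BAR requests 2^24 = 16,777,216 bytes, which is the 16 MB default noted in the listing. A value of 20 would request 1 MB on the same reasoning.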

The remainder of the Verilog code that describes the PCI configuration space will not need to be modified by the user in most cases. However, there are a few exceptions.

Using I/O Address Space


If any base-address registers represent I/O space instead of memory address space, some minor modifications need to be made. First, the line that defines the IOEnable signal needs to be rewritten from:
wire IOEnable = 0;

to:
reg IOEnable;

Furthermore, an additional always block must be added:


always @(posedge PCI_clock or posedge PCI_reset)
  if (PCI_reset)
    IOEnable <= 0;
  else if (!CommandLoWE)
    IOEnable <= IOEnable;
  else
    IOEnable <= WrData[0];

Adding Base-Address Registers


To add additional base-address registers, search through the Verilog code and identify locations where BAR0 is mentioned. The example configuration space contains only one base-address register, so only BAR0 (offset 10h) is used. BAR1 through BAR5 may be added, provided the appropriate changes are made to the Verilog source. The example below shows how to add an additional base-address register, BAR1.

Add a wire declaration for the new base-address:
wire [31:0] BAR0;
wire [31:0] BAR1; //added

Add a new parameter to determine the size of BAR 1:


parameter BAR0_size = 24;
parameter BAR1_size = 24; //added

Declare a new register for BAR 1 and map it to the BAR1 wire declared earlier:
reg [31:0] BAR0_reg;
reg [31:0] BAR1_reg; //added
assign #1 BAR0 = BAR0_reg;
assign #1 BAR1 = BAR1_reg; //added

Create the write enable for BAR 1. Be sure to properly decode the address (BAR 1 is at offset 14h, so UsrAddr[2] is high where it is low for BAR 0):
wire BAR0WE = (Cfg_Write & (!CBE[3]) &
               (UsrAddr[4] & !UsrAddr[2] & !UsrAddr[3] & !UsrAddr[5] &
                !UsrAddr[6] & !UsrAddr[7] & !UsrAddr[8]));
wire BAR1WE = (Cfg_Write & (!CBE[3]) &
               (UsrAddr[4] & UsrAddr[2] & !UsrAddr[3] & !UsrAddr[5] &
                !UsrAddr[6] & !UsrAddr[7] & !UsrAddr[8])); //added

Insert the following lines that describe how BAR 1 will be written to during PCI configuration write transactions:
always @(posedge PCI_clock or posedge PCI_reset)
begin
  if (PCI_reset)
    BAR1_reg <= 0;
  else if (!BAR1WE)
  begin
    BAR1_reg[31:BAR1_size] <= BAR1_reg[31:BAR1_size];
    BAR1_reg[BAR1_size-1:0] <= 0;
  end
  else
  begin
    BAR1_reg[31:BAR1_size] <= WrData[31:BAR1_size];
    BAR1_reg[BAR1_size-1:0] <= 0;
  end
end

Next, make sure the new base-address is properly muxed into CfgData[31:0]:
always @(posedge PCI_clock)
begin
  case (selcfg)
    7'b0000000: CfgData <= Dev_Vend;
    7'b0000001: CfgData <= Stat_Cmd;
    7'b0000010: CfgData <= Class_RevID;
    7'b0000011: CfgData <= BIST_Hdr_Lat_Cache;
    7'b0000100: CfgData <= BAR0;
    7'b0000101: CfgData <= BAR1; //added
    7'b0001100: CfgData <= 32'h0;
    7'b0001111: CfgData <= Lat_Gnt_IntPin_IntLine;
    default: CfgData <= 32'h0;
  endcase
end


The last step is to make sure that an address hit is properly determined, and now accounts for the new base-address register. The following line must be changed from:
assign Addr_Hit = (MemEnable && (WrData[31:24] == BAR0[31:24]));

to:
assign Addr_Hit = (MemEnable &&
                   ((WrData[31:BAR0_size] == BAR0[31:BAR0_size]) ||
                    (WrData[31:BAR1_size] == BAR1[31:BAR1_size])));

Address Register/Counter
The address register/counter is simply a loadable counter. It is capable of latching the PCI address and holding it, and it is also capable of incrementing the address by 4 at the completion of a PCI data transfer.
always @(posedge PCI_clock or posedge PCI_reset)
  if (PCI_reset)
    UsrAddr[23:0] <= 0;
  else if (LoadAddr)
    UsrAddr[23:0] <= WrData[23:0];
  else if (IncrAddr)
  begin
    UsrAddr[23:10] <= UsrAddr[23:10];
    UsrAddr[9:2] <= UsrAddr[9:2] + 1;
    UsrAddr[1:0] <= UsrAddr[1:0];
  end

Additionally, this section of Verilog code determines when the QL5032 should claim a PCI target transaction. If the address sent at the beginning of a PCI transaction belongs to one of the base-addresses implemented in the device, the Addr_Hit signal must go active. The following equation performs this function:
assign Addr_Hit = (MemEnable && (WrData[31:BAR0_size] == BAR0[31:BAR0_size]));

Command Decode Logic


The command decode logic is simply a pair of comparators. One comparator generates the Usr_RdCmd output, based on the PCI command sent at the beginning of a new PCI transaction. The other comparator generates Usr_WrCmd. Each of these commands may be mapped to any PCI commands that the user chooses. In the example CFGTADDR included, the Usr_RdCmd output is mapped to the Memory Read, Memory Read Line, and Memory Read Multiple PCI commands. The Usr_WrCmd is mapped to the Memory Write PCI command.
// map usr_read to mem read (0110), mem read mult (1100), or mem read line (1110)
assign Usr_RdCmd = (LoadAddr && ((CBE == 4'b0110) || ({CBE[3:2],CBE[0]} == {3'b110})));
// map usr_write to mem write (0111)
assign Usr_WrCmd = (LoadAddr && (CBE == 4'b0111));


Appendix D: Technical Support and FAQ


There are many options for technical support when working on a QL5032 design project. These consist of online resources, email, and telephone support. A brief breakdown of the resources and how to access them is listed below.

Online Resources
Help with QuickLogic Software and Devices     www.quicklogic.com/support
Help with PCI Specification and Protocol      www.pcisig.com
Online FAQ for the QL5032 device              www.quicklogic.com/support/ql5032

Technical Support via email


For QuickLogic Software and Devices support@quicklogic.com

Telephone Support
QuickLogic Customer Engineering Hotline (408) 990-4100

FAQ for the QL5032


This list of Frequently Asked Questions for the QL5032 was last updated on 2/3/1999. For a more recent version, visit the QL5032 technical support page on the Web at www.quicklogic.com/support/QL5032.

Question: How many logic cells are used by the reference DMA Controller, and Target Configuration and Addressing modules? What resources are available in the QL5032 after I use these reference modules?
Answer: In the default implementation, the DMA Controller module requires 130 logic cells, and the Target Configuration and Addressing module requires 100 logic cells. However, simple modifications can significantly change the size of these modules, if necessary. Decreasing the maximum DMA Length or the DMA Burst Length will decrease the size of the DMA Controller module. See the detailed descriptions of these modules in the QL5032 Users Guide appendixes. The QL5032 has approximately 392 logic cells available before the DMA Controller and Config space are considered. If the default reference modules are used, then approximately 162 logic cells and 14 RAM modules are available to the designer for additional back end glue logic.

Question: Can the QL5032 support DMA linking or chaining (also referred to as scatter-gather)?
Answer: The short answer is yes. The longer answer is that you will need to customize the DMA block to achieve this functionality. One strategy would be to create a 64 deep/32-wide synchronous FIFO on the QL5032, and a DMA register bit which identifies this FIFO as the target for the next DMA transaction. Then burst a set of addresses, size, and transaction type (read/write) info to this FIFO. If you will always be doing equal size transactions of the same type, you can omit the size and type information. Then put the DMA controller back in normal mode. When the DMA controller sees data in the chain FIFO, it will begin the chain DMA function by loading address and size info from the chain FIFO, until the FIFO is empty.

Question: Can the QL5032 support PCI Interrupts to signal the end of the DMA transaction, or data waiting to be sent to PCI memory?
Answer: Absolutely. The pin on the PCI core you would use is Usr_Interrupt. A register should drive this pin. The register can be set by the DMA controller on any condition you desire. You simply need to set up the software driver to write to a DMA register which will clear this bit once the interrupt has been detected.

Question: Can the QL5032 support Memory Read Line and Memory Read Multiple commands?
Answer: Yes. The DMA controller can set the type of read operation on the PCI bus with the Mst_Rd_Sel[1:0] pins. See the Appendix titled Functional Description of the QL5032 PCI Controller in the QL5032 Users Guide for how to use these pins.

Question: What is the maximum burst speed for the QL5032?
Answer: The QL5032 will send and receive data with zero wait states, assuming the target it is talking to can also achieve that speed. The QL5032 fully supports Memory Read, Memory Read Line, and Memory Read Multiple commands so optimal read performance can be achieved on a variety of motherboards. In Target Mode, the QL5032 is only used for DMA configuration and PCI Configuration, so full burst (0-wait state) operation is not needed in Target mode. All high performance reads and writes on the PCI bus should therefore be executed in PCI bus Master mode. Target transactions will operate with two to four wait states inserted by the QL5032 device.

Question: What is the Target Latency of the QL5032?
Answer: The table below shows the Target latency in terms of initial latency (assertion of FRAMEN by a master until TRDYN is asserted the first time by the QL5032 Target) and subsequent latency (wait states between subsequent assertions of TRDYN by the QL5032 Target). In high performance applications, the target interface of the QL5032 is primarily intended for PCI and DMA configuration.

Transaction Type       Target Initial Latency     Target Wait States
                       (FRAME to TRDY)            (Target Subsequent Latency)
Configuration Read     6                          4
Configuration Write    4                          2
Memory/IO Read         5                          3
Memory/IO Write        4                          2

Question: Does the QL5032 generate and/or receive Type 1 Configuration commands?
Answer: The QL5032 can be configured to recognize Type 1 Configuration commands (by default, however, it will ignore them), but it cannot generate Type 1 Configuration command cycles in Master mode. These commands are only required of applications which are a PCI bus host, or a PCI to PCI bridge.

Question: Can I control the Byte Enables on the PCI bus during Master transactions?
Answer: The 4 CBE signals are always active (set to 0) during PCI Master transactions, so you must transfer data on an aligned DWORD boundary. This was done to optimize the size and speed of the QL5032 device.

Question: Can the QL5032 act as a PCI host?
Answer: Since this would require the Master interface to generate PCI Configuration Read and Write commands (Type 0 and Type 1), the QL5032 does not have this capability. The Master Controller can only generate Mem Read, Mem Read Line, Mem Read Multiple, and Mem Write commands.

Question: What are the limitations of the local clock on the PCI add-in card, when I use the QL5032 as the PCI bus interface?
Answer: The QL5032 allows the user to generate the local clock by passing the PCI clock through to an output pin, or the user may create a local clock on the board. The synchronization between these clocks is best accomplished by asynchronous FIFOs created in the programmable logic region of the device. For a definition of asynchronous FIFOs in this context, see the FIFO chapter of the QL5032 Users Guide. Any local clock speed from 0 to 175 MHz can be used, depending on the complexity of the logic that the local clock drives in the QL5032 device. The local clock does not need to have any frequency or phase relationship to the PCI clock. The designer may choose to use the PCI clock as the local clock, or use an independent local clock.

Question: Does the QL5032 have any problems with Special Cycles or Master-Aborts on the PCI bus?
Answer: No, the QL5032 responds correctly to all Special Cycles and Master Aborts which occur on the PCI bus.

Question: With which motherboards has the QL5032 been tested?
Answer: The QL5032 device should work with all PCI compliant motherboards. It has currently been tested (not exhaustively) with a Pentium Pro 200 system running Windows NT 4.0, a Dell GXa (Pentium II), a CTX (Celeron), and a Compaq Server 1600 (500 MHz Pentium). For a more complete list of the motherboards, chipsets, and BIOS versions tested with the QL5032, see the QL5032 support page (www.quicklogic.com/support/QL5032).

Question: How was the QL5032 tested?
Answer: A comprehensive simulation was performed both functionally and with timing. This simulation modeled other agents on the bus and watched for bus protocol violations. A reference board was created and a thorough set of tests was performed with the HP E2925B PCI Analyzer. All tests in the PCI Compliance checklist were performed to verify PCI compliance. Also, the QL5032 was taken to a PCI Special Interest Group PCI Compliance Workshop, where it was tested with several other manufacturers' motherboards. [These comments will be valid upon completion of the internal verification plan, scheduled for completion by 4/9/99].

Question: Does the QL5032 support unlimited Master bursts?
Answer: Yes, in two ways. First, the PCI Master can continue bursts for as long as its latency timer does not time out and the bus is still granted to it. Second, the latency timer can be disabled (which would violate the PCI spec but may be OK for an embedded system), so that the burst may continue as long as the back end is ready to send/receive data. The DMA Controller reference design can be customized for any burst length and maximum DMA length, to optimize the bandwidth requirements of an application.

Question: What if my back end design is slow during Master Transactions? Will the Master insert wait states to compensate for the data being slow?
Answer: No. The Master will end the current burst as soon as the back end is no longer ready to send or receive data at the proper rate. With slow back ends, it is therefore recommended to fill a FIFO with data before initiating a Master Write, and to provide an empty FIFO before initiating Master reads, in order to provide the best performance on the PCI bus. 64 deep/32 wide FIFOs are relatively inexpensive to implement within the QL5032, so this should not be a limitation for any application.

Question: What values should I use for Device ID, Vendor ID, Subsystem ID, and Subsystem Vendor ID?
Answer: The Vendor ID and Subsystem Vendor ID are supplied by the PCI Special Interest Group (PCI SIG, www.pcisig.com) when you sign up to be a member. This is a necessary step for PCI device and board designers. The Vendor ID should be the Vendor ID assigned to the designer of the device. In the case of the QL5032, the Vendor ID should belong to the company which is designing and customizing the QL5032 device. The Subsystem Vendor ID should belong to the company which builds the PCI add-in board which uses the QL5032. In some cases, this may be the same company. In that case, the same Vendor ID may be used in both fields. Device ID and Subsystem ID are numbers which should be unique to each design, but which are managed by the company doing the design, not by the PCI SIG. In the PCI 2.1 Specification, the Subsystem ID and Subsystem Vendor ID were optional. In the PCI 2.2 Specification, these values in the configuration space are now mandatory.

Question: Can the QL5032 be used as a PCI to PCI Bridge?
Answer: No. Since the QL5032 will not respond to Type 1 Configuration Accesses or generate Type 0 or Type 1 Configuration Accesses, it cannot be used as a transparent PCI to PCI bridge. If your application just needs to send data from one PCI bus to another, then a limited PCI to PCI bridge function may be possible. The solution would require custom drivers, with each PCI bus having independent configurations. Two QL5032s would be needed, one for each bus, and each would have memory space assigned to it on its bus by the configuration master.

Question: How does the speed of the FPGA portion of the QL5032 compare with QuickRAM devices from QuickLogic?
Answer: The QL5032 has two speed grades: A and B. The A grade is similar in speed to a -3 device in the QuickRAM family. The B grade is similar in speed to a -4 device in the QuickRAM family.

Question: How many BARs (Base Address Registers) can be used in the QL5032?
Answer: Up to six 32-bit BARs may be used in the QL5032. The BAR configuration is handled within the Target Configuration and Addressing module, which is described in the Appendix of the QL5032 Users Guide. The default implementation is one BAR, with a 16 MByte address range.

Question: Does the QL5032 support mailbox queues and I2O?
Answer: Mailboxes are the PCI-interface level requirement to support I2O. Mailboxes can be used for a variety of other purposes as well. One way of describing mailboxes is as a set of registers that allow two-way communication between a back-end processor and the system software on the PCI bus. Mailbox registers and doorbells can be implemented in the user side of the FPGA using a state machine, logic cell registers, and RAM cells. For full I2O compliance, a processor is needed on the same card as the QL5032, with a significant amount of RAM. See the I2O specification for more information. You can obtain an I2O specification from the I2O SIG (Special Interest Group). Go to www.i2osig.org for more information.

Question: Does the QL5032 support the new PCI-X standard? What is PCI-X?
Answer: PCI-X, initially proposed by Compaq, Hewlett-Packard and IBM, is a proposed extension to the PCI local bus specification intended to deliver increased bandwidth and bus performance, running at speeds up to 133 MHz. This proposal is aimed at addressing I/O bandwidth for servers running enterprise applications such as Gigabit Ethernet, Fibre Channel and Ultra3 SCSI. QuickLogic is very interested in PCI-X and is continually monitoring the progress and information being released on PCI-X. Since there is limited information on PCI-X and no preliminary specification has been released to the public, we are unable to speak directly about PCI-X at this time. However, QuickLogic is committed to being the leading supplier of ESP products and strives to provide solutions for leading-edge system requirements.

Question: Does the QL5032 support Compact PCI or Hot Swapping?
Answer: The QL5032 can support Compact PCI. The main differences between PCI and CompactPCI are mechanical, and the fact that CompactPCI supports Hot Swapping. Contact QuickLogic for the availability of an application note and reference design.

Glossary
BAR
Base Address Register. Refers to the 32-bit registers in the PCI Configuration Space at offsets 0x10 through 0x27. Up to six Base Address Registers may be defined for a given PCI device. The Base Address Register is a writeable register. The PCI system software (BIOS) first determines how much memory space each base address register needs, and then assigns a memory address to each base address register. The PCI device must respond to addresses that fall within its base address range. The QL5032 may have from one to six BARs. The default is one BAR, with a 16 MB address range (in the Target Configuration and Addressing reference module).

BGA
Ball Grid Array. This refers to the Plastic Ball Grid Array (PBGA) packages which QuickLogic offers in many of its device families.

DMA
Direct Memory Access. DMA describes a hardware system, independent of the primary processor, that can carry out memory accesses. Usually, a DMA state machine will be configured with a source address, destination address, and size, and will move data from one location to another. In the case of the QL5032, it refers to the DMA Controller Reference module, which can be configured through the PCI Interface with a PCI Address, Size of Transfer, and Type of Transfer (read or write). The DMA controller will independently control the PCI Interface to complete the transfer of data to or from the PCI bus.

DWORD
Double-Word. Refers to a 32-bit piece of data.

FIFO
First-in, First-out. Refers to a memory that has no direct addressing. The user pushes data elements into the memory and pops data elements out of the memory. The internal addressing structure is designed so that the first element of data that is pushed into the memory is the first piece of data that is retrieved when a pop is performed (and so on).

FPGA
Field-programmable gate array. This is a device which consists of an array of general-purpose digital logic cells or modules interconnected by wires. The wires are designed with interconnections that can be programmed by a designer in the field. This is as opposed to ASICs, which must be programmed or masked within a factory. QuickLogic provides many FPGA families, and the programmable region of the QuickPCI family of devices is sometimes referred to as the FPGA region or FPGA side.

GCLK
Global clock. A GCLK buffer drives a pre-buffered network on QuickLogic devices, which offers very high fanout nets with very low latency and skew. Therefore, GCLKs are ideal for clock distribution. However, GCLKs also can be routed to I/O buffer output enables, the asynchronous sets and resets of all flip-flops, the F1 input of each logic cell, and both the read and write clocks of RAM modules. This makes them very flexible. QuickLogic devices offer between 4 and 8 GCLK networks. Often, a subset (usually 2) can be accessed directly from pins on the device (labeled GCLK/I pins).


HDL
Hardware Description Language. This term is most often used in this Users Guide to represent both Verilog and VHDL with one term.

I/O
Input/Output. This refers to a pin on the device.

LFSR
Linear Feedback Shift Register. This is a type of counter which is designed as a shift register where certain bits (called taps) are fed back into an XOR gate, often into the first bit of the shift register chain. If the correct taps are used, the LFSR will go through a long series of non-repeated states, and start again at the first state. Because of the simple design, LFSRs can be used as high-speed counters. However, because the states do not progress in an intuitive pattern (like a binary counter), the decode logic can be more complex. QuickLogic uses LFSRs for its high-speed FIFOs.

pASIC
Programmable ASIC. QuickLogic uses this term for its programmable devices. Since the Via-Link interconnect technology makes the QuickLogic devices look more like ASICs than traditional FPGAs, this acronym was created to describe QuickLogic programmable devices.

PBGA
Plastic Ball Grid Array. See BGA or Ball Grid Array.

PCI
Peripheral Component Interconnect.

PCI SIG
PCI Special Interest Group. See their Web site at www.pcisig.com.

PQFP
Plastic Quad Flat Pack. A device package offered by QuickLogic which is surface mounted on the PCB (printed circuit board).

RAM
Random Access Memory. QuickLogic's QuickRAM and QuickPCI families have modules of programmable RAM, which consist of up to 1152 bits of RAM per module. See the appropriate device family datasheet for details.

ROM
Read-only Memory. QuickLogic's RAM modules can be configured as ROMs by attaching a Serial EEPROM to the QuickLogic device. At power-up, the data from the Serial EEPROM will be loaded into the QuickLogic RAM modules, allowing them to be used in a ROM fashion by the design. See QuickNote 65 on the QuickLogic Web site for more information.

SpDE
Seamless pASIC Design Environment. This refers to the application which controls the design flow in the QuickWorks, QuickChip, QuickTools, and QuickTools Plus packages from QuickLogic.
