Datasheet
(Preliminary Draft)
March 9, 1999
TABLE OF CONTENTS
Setting up a QL5032 Project
  Step-by-step Project Setup
    Step 1: Create a QL5032 Project Folder
    Step 2: Copy Template Files to the Project Folder
    Step 3: Rename the Top-Level Design Block
    Step 4: Select Mandatory Device Defaults
    Step 5: Open the top-level design block and start designing!
  Compilation
    Step 1. Synthesize the design
    Step 2. Set up your Design Constraints
    Step 3. Logic Optimization, Placement, and Routing
    Step 4. Checking Design Timing
  Waveforms
    Single-Dword Configuration Read
    Burst Configuration Read
    Single-Dword Configuration Write
    Burst Configuration Write
    Target Read
  Description of Registers
  Detailed Description of the DMACTRL.V file (Verilog)
Glossary
Schematic or Verilog designers will edit a Verilog file (CFGSPACE.V), while VHDL users will edit a VHDL file (CFGSPACE.VHD). The appropriate section of the Verilog (.V) file is shown here.
// *********** beginning of user-modifiable parameters ************
// PCI registers                          offset into config space
wire [15:0] DeviceID     = 16'h0001;      // 00h
wire [15:0] VendorID     = 16'h11E3;      // 00h
wire [23:0] ClassCode    = 24'h020000;    // 08h
wire [7:0]  RevisionID   = 8'h01;         // 08h
wire [15:0] SubsysID     = 16'h0001;      // 2Ch
wire [15:0] SubsysVendID = 16'h11E3;      // 2Ch
wire [7:0]  MaxLat       = 8'h05;         // 3Ch
wire [7:0]  MinGnt       = 8'h02;         // 3Ch
wire [7:0]  IntPin       = 8'h01;         // 3Ch
parameter BAR0_size = 24; // Sets the size of the requested memory space.
                          // Default value is 24, corresponding to 16MB.
                          // (# of bits to tie off in the BAR)
// *********** end of user-modifiable parameters ************
You must edit the values in the Vendor ID (VendorID), Subsystem Vendor ID (SubsysVendID), and Class Code (ClassCode) fields. You can obtain a Vendor ID from the PCI SIG. Anyone creating a PCI Interface Card must obtain this ID from the PCI SIG, so that all Vendor IDs are unique. IMPORTANT: You must not use a Vendor ID (or Subsystem Vendor ID) of 16'h0000 or 16'h11E3 (QuickLogic's PCI SIG assigned Vendor ID). Doing so would violate the PCI Specification. The Class Code field identifies the type of PCI agent you are creating. For a complete description, see the PCI Specification, revision 2.2. For information on setting default values for other fields in the PCI configuration space, refer to the PCI Specification.
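The BAR0_size parameter in the listing above is described as the number of bits to tie off in the BAR, so the requested memory size is 2^BAR0_size bytes (24 gives 16MB). The following is a minimal sketch of that idea only. It is not taken from CFGSPACE.V: the module name bar0_sketch, its ports, and the synchronous active-high reset are assumptions made for illustration, and the low attribute bits of the BAR are simply left at zero.

```verilog
// Illustrative sketch, not the QuickLogic CFGSPACE.V implementation.
// Module name, port names, and reset style are hypothetical.
module bar0_sketch #(
    parameter BAR0_size = 24            // number of low bits tied off (24 -> 16MB)
) (
    input  wire        clk,
    input  wire        rst,             // synchronous, active high (assumption)
    input  wire        wr_en,           // configuration write to offset 10h
    input  wire [31:0] wr_data,
    output wire [31:0] bar0             // value returned when offset 10h is read
);
    // Only the upper (32 - BAR0_size) bits are writable.
    reg [31:BAR0_size] base;

    always @(posedge clk) begin
        if (rst)
            base <= {(32-BAR0_size){1'b0}};
        else if (wr_en)
            base <= wr_data[31:BAR0_size];
    end

    // The tied-off low bits always read back as zero.
    assign bar0 = {base, {BAR0_size{1'b0}}};
endmodule
```

With the default value of 24, a host that writes FFFFFFFFh to this register reads back FF000000h, which is how it learns that 16MB is being requested.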
For help with the design process or design flow, refer to the QuickWorks User's Guide. This can be found on the QuickWorks CD in the BOOKS directory (open BOOKS.PDF), or you may have received a hard copy of this manual with your QuickWorks software.
Synchronous or Asynchronous
When building a FIFO in the QL5032 device, the first decision you must make is whether you want a synchronous or asynchronous FIFO. First, some definitions:
Synchronous FIFO: A FIFO with a synchronous read port and a synchronous write port which both use the SAME clock.
Asynchronous FIFO: A FIFO with a synchronous read port and a synchronous write port which use DIFFERENT (asynchronous) clocks.
As you see, both types of FIFOs use synchronous interfaces on the read and write ports. The difference lies in whether the read clock and write clock are the same or not.
Select a Module type of FIFO. The Part type list box is not used. Then click NEXT. You will see the window of Figure 2.
Here you need to select a Depth (from a list of choices) and a Width (any width). Be aware that each RAM block is a little over 1K bits (1152), so the larger the FIFO you choose, the more RAM blocks you will use in the QL5032. There are 14 RAM blocks available, and you may build one or more FIFOs. The Model selection of Speed Optimized (LFSR Status Counter) is the default, and is recommended for FIFOs which need to operate at > 75 MHz on either the read or the write port. The trade-off is that the LFSR counter makes generating new FIFO flags more difficult, since you need to decode the value from the LFSR counter instead of from a normal binary counter. Contact QuickLogic Customer Engineering if you need assistance with creating different flags (other than empty or full) for a Speed Optimized FIFO. When you finish making selections, click NEXT again to bring up Figure 3.
In this window, you select the netlist format for the FIFO. Choose the language you are most comfortable with, in case you later want to edit the FIFO to insert different status flags or change to a non-standard depth. If you are familiar with neither, we suggest Verilog, since it is a more compact format.
After selecting the output file format (VHDL, Verilog, or both), click FINISH to bring up the window in Figure 4.
This lists all the files which will be written to your project folder. Select your project folder with the Save As button on this window. The most important filename in this list is the one with the Description of FIFO, since this is the top-level FIFO block which you will instantiate into your Verilog, VHDL, or Schematic file.
and the write side state machines receive the outputs of the comparators to determine the status of the flags. These state machines have two D flip-flops each in order to account for metastability.
Keeping in mind the need for asynchronous FIFOs, QuickLogic has created a design for a 32x32 asynchronous FIFO. The current asynchronous FIFO design is 32 bits wide by 32 deep and is implemented using RAM blocks and logic cells. It is essentially a schematic (F32a32.sch), with Verilog code for the Grey code counters and registers. To configure the FIFO with a different width or depth, some modifications need to be made. The figure below shows the file names corresponding to the relevant building blocks in the schematic. All the blocks shown in the figure can be scaled to meet the user's width and depth requirements. Please note that this figure does not show the read side and write side state machines.
RAM BLOCK
In the enclosed schematic, a 64X32 RAM block symbol is used. This block was created from Verilog code generated by the RAM/ROM/FIFO Wizard in SpDE. The user can specify the required width and depth of the RAM block in the wizard, which generates the Verilog/VHDL code. Using the New Block Symbol command in the Schematic Tools, you can create a schematic symbol for the Verilog code.
[Figure: top-level schematic of the 32x32 asynchronous FIFO. Blocks shown: R64X32 (RAM block), GREY_COUNTER5 (Grey code counters), REGISTER5 (address registers), ECOMP5 (5-bit comparators). Signals shown: push, flush, topoff, wclk, wrst, rclk, we, re, din[31:0], dout[31:0], almostfull, full, empty.]
COMPARATORS
The 5-bit comparators shown in the top level schematic are purely schematic designs generated in the schematic editor. Note that the number of bits to be compared changes with the depth of the RAM block used, because the comparators operate on the FIFO addresses. For example, if the user-defined FIFO depth is 256, 8 address bits are required and hence an 8-bit comparator must be used.
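For readers who prefer HDL to a redrawn schematic, the scaling described above can be sketched as a parameterized equality comparator. This is an illustration under stated assumptions, not the ECOMP5 schematic block: the module name ecomp_sketch and the WIDTH parameter are hypothetical, while the pin names A, B, and EQ follow the labels visible on the comparator in the schematic.

```verilog
// Illustrative sketch of a scalable address comparator.
// Not the QuickLogic ECOMP5 block; module and parameter names are hypothetical.
module ecomp_sketch #(
    parameter WIDTH = 5                 // address bits: 5 for depth 32, 8 for depth 256
) (
    input  wire [WIDTH-1:0] A,
    input  wire [WIDTH-1:0] B,
    output wire             EQ          // high when the two addresses match
);
    assign EQ = (A == B);
endmodule
```

A depth-256 FIFO would instantiate it as ecomp_sketch #(.WIDTH(8)).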
GREY COUNTERS
The Grey code counters shown in the top level schematic are Verilog modules written for a 5-bit Grey counter. This code must be widened to 6, 7, 8, or 9 bits for FIFO depths of 64, 128, 256, or 512, respectively. Writing a Verilog module for a wide Grey counter by hand can be cumbersome, so we have provided an example that can generate the Grey code for such wide counters.
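One common way to build a Grey counter of arbitrary width is to keep a binary count and register its Grey-coded equivalent, using the identity grey = bin XOR (bin shifted right by one). The sketch below takes that approach. It is not the QuickLogic-provided example mentioned above: the module name, the WIDTH parameter, and the synchronous clear are assumptions, and the CLR, EN, and Q pin names are borrowed from the counter symbol in the schematic.

```verilog
// Illustrative parameterized Grey code counter.
// Not the QuickLogic GREY_COUNTER5 source; names and reset style are assumptions.
module grey_counter_sketch #(
    parameter WIDTH = 5                 // 5 bits for depth 32, up to 9 for depth 512
) (
    input  wire             clk,
    input  wire             CLR,        // synchronous clear (assumption)
    input  wire             EN,         // count enable
    output reg  [WIDTH-1:0] Q           // Grey-coded count
);
    reg  [WIDTH-1:0] bin;                       // internal binary count
    wire [WIDTH-1:0] bin_next = bin + 1'b1;     // wraps naturally at 2^WIDTH

    always @(posedge clk) begin
        if (CLR) begin
            bin <= {WIDTH{1'b0}};
            Q   <= {WIDTH{1'b0}};
        end else if (EN) begin
            bin <= bin_next;
            Q   <= bin_next ^ (bin_next >> 1);  // binary-to-Grey conversion
        end
    end
endmodule
```

Because Q is registered, exactly one output bit changes per enabled clock, including at the wrap from the last count back to zero. The cost is a second WIDTH-bit register for the binary count; a counter that increments directly in Grey code avoids that register at the price of more complex next-state logic.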
REGISTERS
The registers are Verilog modules, which can be easily modified to fit the user's depth requirements. There are two registers for the write address and one for the read address. These are used to generate adjacent count values for determining the FIFO status, that is, Almost-Full, Almost-Empty, Full, and Empty.
This Asynchronous FIFO reference design is provided for the purpose of reducing design cycle time when using the QL5032 device.
The key lines here are the lines that start with //inputs and //outputs. You will copy text from these lines to create the symbol.
As you can see, in this example we have already copied the appropriate information from the FIFO Verilog file into the fields in the New Block Symbol window. The module name is entered for the Block Name, and the text after //inputs and //outputs is copied into the appropriate Input Pins and Output Pins fields. You can use the Windows Copy and Paste commands to minimize typing errors. Do not use the Use Data From This Block button. This copies all the inputs and outputs of the current schematic into the input and output fields, which is not what you want to do in this case. If you make a mistake, you can always click CANCEL and re-enter the Symbol Creation Tool with the Add > New Block Symbol menu command. When you have entered all data correctly, click the RUN button to create the symbol. The new symbol will then be attached to your cursor, and you can click to drop it into an appropriate area of the schematic. The resulting symbol is shown below.
If you want to add almost empty and almost full flags to the FIFO, you must first change the Verilog or VHDL file manually, then re-create or edit the symbol with the Symbol Editor.
Pre-Layout Simulation
When you are designing with a QL5032 device, you may choose to simulate a small portion of your design, or the design as a whole. When you are simulating a small portion of the device, you may not want to include the PCI32 core within the simulation, in order to simplify the simulation vectors you need to create. Later, once you have created more design blocks, you may want to simulate the design as a whole, including the PCI32 core. This section describes how to perform these types of functional (pre-layout) simulation.
If the function you wish to simulate exists at the top level as a schematic, then the first step is to convert that top-level schematic to an HDL file, so that the simulator can read it. This is an automatic process, which uses the Hierarchy Navigator tool. First, select the Hierarchy Navigator from the Design > Navigate Hierarchy menu command in SpDE. You will be asked to select a .TRE file. If you have never used the Hierarchy Navigator before, you won't have a .TRE file, so you should click the NEW button, which will change the window options so that you can select the top-level Schematic (.SCH) file. Once you have selected this file, the design will be loaded into the Hierarchy Navigator. From the Hierarchy Navigator, you create an HDL netlist with the Tools > Export QuickLogic menu command. This will bring up a dialog box like the one below.
Make sure you select a netlist format appropriate for the simulator you are using to simulate the design. If (as in this example) you can ONLY select Verilog, that is because there are Verilog blocks located in the hierarchy of the design. In that case, the Silos (Verilog) simulator must be used for simulation. The device selection is important at this stage. If you intend to simulate with the PCI32 core, then you MUST select the QL5032-33 device. If you are simulating a small function within the design and you do not want to use the PCI32 core, then select a device from the QuickRAM (QL4xxx) family, such as the QL4058. Also, choose the appropriate device package now, especially if you are ready to target the QL5032 with a full design. These settings will be used later (after simulation) in the compilation process.
The first check box (Display busses as [L:H] in Waveform tools) is an option which you should never need to use, and is provided only for backward compatibility with older QuickLogic designs. The second check box (Preserve schematic structure through synthesis) has no effect on simulation, but it does affect the synthesis process for the design (described later in this chapter). This option need only be selected if you have carefully hand-crafted schematic functions using knowledge of the QuickLogic logic cell structure. See the QuickWorks User's Guide for more details. The final check box allows you to run a syntax check on the netlist you are creating. If you have HDL blocks within the hierarchy of your design that you have written yourself, you should check this box so that a syntax check can find problems before you simulate the design. Once you have selected all appropriate settings, click the OK button to create an HDL netlist. Once the process is finished, you will see a window like this:
You may click Done or View Messages if notes or warning messages were created during the netlisting process. Next, select File > Save from the Hierarchy Navigator so that all of your netlisting options are remembered for the next time.
The first two text boxes allow you to specify the location of your top level design file and test fixture (both in Verilog format). When using the QuickWorks tools, it helps to name your test fixture with the file extension .TF. If a file with the .TF extension is found with the same name as the design loaded into SpDE or the Hierarchy Navigator, then the first field in this window is automatically filled for you. The same is true for the second field (top level module), if there is a file with a .V extension. You may Browse your hard drive to specify the files if needed. The Add Verilog Library option allows you to specify one QuickLogic-specific library to load for the simulation. If you are simulating with the PCI32 core within the design, then the pci3233m.v file should be selected. If you chose the QL5032-33 device when you exported your Verilog netlist from the Hierarchy Navigator in Step 1 (and if you are using a top-level schematic), then this selection is already made for you, as in the window shown. Make sure that the Pre-Layout option is selected. If you have never run a post-layout simulation with the current project, then Pre-Layout will be the only option available. Once you have made all selections, click OK to open the Silos Simulator.
Compilation
Now that you have completed functional simulation, the next step is to compile the QL5032 design into a QL5032 device. This User's Guide will take you through the basic steps you should follow to compile a QL5032 design. For a more comprehensive discussion of design flows and using the QuickWorks software, refer to the QuickWorks User's Guide. Since the QL5032 design contains at least two HDL blocks (the DMA Controller and the Configuration Space and Addressing block, if you used the QuickLogic reference designs), you will need to synthesize the design before placing and routing. This is accomplished automatically when using QuickWorks.
Note: SpDE is an acronym, pronounced Spee-dee, which stands for the Seamless pASIC Design Environment. pASIC is a trademarked acronym which stands for programmable ASIC, and is pronounced p-ace-ik. To start the synthesis tool, called Synplify-Lite, you simply need to select the menu command: File > Import Verilog, or File > Import VHDL (depending on which language you selected to use for your project). Then select the top-level design name in your project, such as myproject1.v. This will open the Synplify-Lite window. Within the Synplify-Lite synthesis window you should see fields for the device name and package. Please verify that the device name is QL5032 and the package is the package you require for your project (PQ208 for a 208 PQFP package, and PB256 for a 256 pin plastic BGA package). Then click the RUN button to compile your design. If you get any errors or warnings, you can review the errors or warnings by clicking the View Log button, which will open the Turbo Writer Editor with your error file. With the error file open, you can click on the ERR button in the Turbo Writer toolbar to scroll through your errors, while the editor takes you to the line of your source file where the problems were found. If you get warnings, and you decide after reviewing the warnings that you want to continue the compilation without changing the source files, then you should manually close the Synplify-Lite window.
Once you verify the tools options, click Save Settings, and then Close from the Tools Options window. Then from SpDE, you can start the logic optimization, placement and routing process by going to the Tools > Run Tools menu command (or click on the Hammer in the toolbar). Make sure all tools are selected and click RUN. SpDE will then run all tools. Depending on the Placer Mode you selected (Preliminary or Quality), the size of the design, the speed of the PC, and the memory of the PC, this process can take from a couple of minutes to nearly an hour.
Master
Mst_WrAd[31:0] (I): Address for master DMA writes. This address must be treated as valid from the beginning of a DMA burst write until the DMA write operation is complete. It must be incremented (by 4) each time data is transferred on the PCI bus.
Mst_RdAd[31:0] (I): Address for master DMA reads. This address must be treated as valid from the beginning of a DMA burst read until the DMA read operation is complete. It must be incremented (by 4) each time data is transferred on the PCI bus.
Mst_WrMode (I): DMA state machine in write mode.
Mst_RdMode (I): DMA state machine in read mode.
Mst_Burst_Req (I): Request use of the PCI bus.
Mst_One_Read (I): One data transfer remains in the burst.
Mst_Two_Reads (I): Two or fewer data transfers remain in the burst.
Mst_WrData[31:0] (I): Data for master DMA writes (to PCI bus).
Mst_WrData_Valid (I): Data valid on Mst_WrData[31:0].
Mst_WrData_Rdy (O): Data receive acknowledge for Mst_WrData[31:0].
Mst_WrBurst_Done (O): Master write pipeline is empty.
Mst_RdData[31:0] (O): Data for master DMA reads (from PCI bus).
Mst_RdData_Valid (O): Data valid on Mst_RdData[31:0].
Mst_RdBurst_Done (O): Master read pipeline is empty.
Mst_RdCmd[1:0] (I): Type of PCI read command to be used for DMA reads: 00 or 01 = Memory Read; 10 = Memory Read Line; 11 = Memory Read Multiple.
[signal name missing] (I): Enable Latency Counter. Set to 1 to ignore the Latency Timer in the PCI configuration space (offset 0Ch).
[signal name missing] (O): Data was transferred on the previous PCI clock.
[signal name missing] (O): Active during the last data transfer of a PCI master transaction.
[signal name missing] (O): The PCI REQN signal generated by the QL5032 as PCI master.
[signal name missing] (O): The PCI IRDYN signal generated by the QL5032 as PCI master.
[signal name missing] (O): Target abort detected during master transaction.
[signal name missing] (O): Target timeout detected (no response from target).
Target
Usr_Addr_WrData[31:0] (O): Target address and data from target writes. During all target accesses, the address will be presented on Usr_Addr_WrData[31:0] and simultaneously, Usr_Adr_Valid will be active. During target write transactions, this port will present write data to the PCI configuration space or user logic.
Usr_CBE[3:0]: PCI command and byte enables. During target accesses, the PCI command will be presented on Usr_CBE[3:0] and simultaneously, Usr_Adr_Valid will be active. During target read or write transactions, this port will present active-low byte-enables to the PCI configuration space or user logic.
Usr_Adr_Valid: Indicates the beginning of a PCI transaction, and that a target address is valid on Usr_Addr_WrData[31:0] and the PCI command is valid on Usr_CBE[3:0]. When this signal is active, the target address must be latched and decoded to determine if this address belongs to the device's memory space. Also, the PCI command must be decoded to determine the type of PCI transaction. On subsequent clocks of a target access, this signal will be low, indicating that data (not an address) is present on Usr_Addr_WrData[31:0].
Usr_Adr_Inc: Indicates that the target address should be incremented, because the previous data transfer has completed. During burst target accesses, the target address is only presented to the back-end logic at the beginning of the transaction (when Usr_Adr_Valid is active), and must therefore be latched and incremented (by 4) for subsequent data transfers.
Usr_WrReq: This signal will be active for the duration of a target write transaction, and may be used by back-end logic to turn on output-enables for transmitting the data off-chip.
Usr_RdDecode: Active when a user read command has been decoded from the Usr_CBE[3:0] bus. This command may be mapped from any of the PCI read commands, such as Memory Read, Memory Read Line, Memory Read Multiple, I/O Read, etc.
Usr_WrDecode: Active when a user write command has been decoded from the Usr_CBE[3:0] bus. This command may be mapped from any of the PCI write commands, such as Memory Write or I/O Write.
Usr_Select: The address on Usr_Addr_WrData[31:0] has been decoded and determined to be within the address space of the device. Usr_Addr_WrData[31:0] must be compared to each of the valid Base Address Registers in the PCI configuration space. Also, this signal must be gated by the Memory Access Enable or I/O Access Enable registers in the PCI configuration space (Command Register bits 1 or 0 at offset 04h).
[signal name missing] (O): Write enable for data on Usr_Addr_WrData[31:0] during PCI writes.
[signal name missing] (O): Write enable for data on Usr_Addr_WrData[31:0] during PCI configuration write transactions.
[signal name missing] (I): Data from the PCI configuration registers, required to be presented during PCI configuration reads.
[signal name missing] (I): Data from the back-end user logic (and/or DMA configuration registers), required to be presented during PCI reads.
[signal name missing] (I): Data from the Command Register in the PCI configuration space (offset 04h).
[signal name missing] (I): Data from the Latency Timer in the PCI configuration space (offset 0Ch).
[signal name missing] (I): Master Read Address from the DMA configuration registers.
[signal name missing] (I): Master Write Address from the DMA configuration registers.
[signal name missing] (O): Parity error detected on the PCI bus. When this signal is active, bit 15 of the Status Register must be set in the PCI configuration space (offset 04h).
Cfg_SERR_Sig: System error asserted on the PCI bus. When this signal is active, the Signalled System Error bit, bit 14 of the Status Register, must be set in the PCI configuration space (offset 04h).
Cfg_MstPERR_Det: Data parity error detected on the PCI bus by the master. When this signal is active, bit 8 of the Status Register must be set in the PCI configuration space (offset 04h).
Usr_TRDYN (O): Copy of the TRDYN signal as driven by the PCI target interface.
Usr_STOPN (O): Copy of the STOPN signal as driven by the PCI target interface.
Usr_Devsel (O): Inverted copy of the DEVSELN signal as driven by the PCI target interface.
Usr_Last_Cycle_D1 (O): Last transfer in a PCI transaction is occurring.
Usr_Stop (I): Used to prematurely stop a PCI target access on the next PCI clock.
Usr_Interrupt (I): Used to signal an interrupt on the PCI bus.
PCI_clock (O): PCI clock.
PCI_reset (O): PCI reset signal.
PCI_IRDYN_D1 (O): Copy of the IRDYN signal from the PCI bus, delayed by one clock.
PCI_FRAMEN_D1 (O): Copy of the FRAMEN signal from the PCI bus, delayed by one clock.
PCI_DEVSELN_D1 (O): Copy of the DEVSELN signal from the PCI bus, delayed by one clock.
PCI_TRDYN_D1 (O): Copy of the TRDYN signal from the PCI bus, delayed by one clock.
PCI_STOPN_D1 (O): Copy of the STOPN signal from the PCI bus, delayed by one clock.
PCI_IDSEL_D1 (O): Copy of the IDSEL signal from the PCI bus, delayed by one clock.
PCI
AD[31:0] (B): PCI Address/Data bus.
CBEN[3:0] (B): PCI Command/Byte Enable bus.
INTAN (O): PCI Interrupt.
SERRN (O): PCI System Error.
PERRN (B): PCI Parity Error.
PAR (B): PCI Parity signal for AD[31:0] and CBEN[3:0].
REQN (O): PCI bus request.
DEVSELN (B): PCI device select.
TRDYN (B): PCI target ready.
STOPN (B): PCI stop.
FRAMEN (B): PCI frame.
IRDYN (B): PCI initiator (master) ready.
IDSEL (I): PCI ID select, for configuration accesses.
GNTN (I): PCI bus grant.
CLK (I): PCI clock.
RSTN (I): PCI reset.
Waveforms
Single-Dword Configuration Read
On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI host controller (master) has asserted IDSEL and provided a PCI configuration register address on AD[31:0] and a configuration read command on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has deasserted FRAMEN to indicate that it only wishes to perform one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction. TRDYN is deasserted, indicating a wait-state inserted by the target. Clock edges 5 through 7 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Cfg_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 8 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to only read one register location, FRAMEN is seen deasserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer has occurred on the PCI bus. Clock edge 9 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.
Burst Configuration Read
On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI host controller (master) has asserted IDSEL and provided a PCI configuration register address on AD[31:0] and a configuration read command on CBEN[3:0]. Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction. TRDYN is deasserted, indicating a wait-state inserted by the target. Clock edges 5 through 7 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Cfg_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 8 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to read more than one register location, FRAMEN is seen still asserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer has occurred on the PCI bus. Clock edges 9 through 12 are wait-states since TRDYN is deasserted. Since FRAMEN is deasserted on the PCI bus, only one more data transfer will take place. Prior to clock 12, the next double-word of read data must be presented to Cfg_RdData[31:0]. Clock edge 13 represents the last data transfer in this transaction. FRAMEN is deasserted, and both IRDYN and TRDYN are asserted. On the Interface Ports, Usr_Adr_Inc is active. Clock edge 14 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. 
On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.
Single-Dword Configuration Write
On clock edge 1, the PCI bus is idle, because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. IDSEL is active, indicating that this will be a configuration access. The value on CBEN[3:0] represents the configuration write command. The value on AD[31:0] is 10h, representing Base Address Register 0 in the PCI configuration space. Clock edge 3 is a wait state inserted by the QL5032. The QL5032 contains a medium-speed target, so it isn't able to claim PCI transactions on the clock after the address cycle (clock edge 2). Here the PCI host controller is seen driving the data that it wishes to write to the configuration register at offset 10h, along with the byte enables on CBEN[3:0]. It has also deasserted FRAMEN, indicating that it only wants to perform one data transfer. On the interface ports to the FPGA logic, the PCI core presents the address and command from clock edge 2 on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. It also asserts Usr_Adr_Valid to indicate that the value on Usr_Addr_WrData[31:0] is an address, and that a new PCI transaction is beginning. On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte-enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Cfg_Write on the internal interface of the PCI core is asserted, indicating that a write to the PCI configuration space is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted.
This is the last data transfer in the transaction since FRAMEN is deasserted. Clock edge 7 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction.
Burst Configuration Write
On clock edge 1, the PCI bus is idle, because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. IDSEL is active, indicating that this will be a configuration access. The value on CBEN[3:0] represents the configuration write command. The value on AD[31:0] is 04h, representing the Status/Command Register in the PCI configuration space. Clock edge 3 is a wait state inserted by the QL5032. The QL5032 contains a medium-speed target, so it isn't able to claim PCI transactions on the clock after the address cycle (clock edge 2). Here the PCI host controller is seen driving the data that it wishes to write to the configuration register at offset 04h, along with the byte enables on CBEN[3:0]. It has also kept FRAMEN active, indicating that it wants to perform more than one data transfer. On the interface ports to the FPGA logic, the PCI core presents the address and command from clock edge 2 on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. It also asserts Usr_Adr_Valid to indicate that the value on Usr_Addr_WrData[31:0] is an address, and that a new PCI transaction is beginning. On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte-enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Cfg_Write on the internal interface of the PCI core is asserted, indicating that a write to the PCI configuration space is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted.
Since FRAMEN is still asserted, at least one more data transfer will occur. Clock edges 7 and 8 are wait states inserted by the PCI target in the QL5032, since TRDYN is deasserted. On clock edge 9 both IRDYN and TRDYN are asserted, indicating the target has accepted the second piece of data. Usr_Adr_Inc is asserted on the Interface Ports, indicating that the write to the PCI configuration register should take place.

Note that the first write occurred to offset 04h in the PCI configuration space on clock edge 6, and that the address for that register was provided at the beginning of the PCI transaction. For this second data transfer, it is implied that the write occurs to the next double-word in the PCI configuration space, offset 08h (the Class Code/Revision ID register). For this reason, the address provided at the beginning of the PCI access must be stored and automatically incremented (by 4) each time Usr_Adr_Inc is active. Note that offset 08h in the PCI configuration space is a read-only register, and that the write operation will not overwrite any data.

Clock edge 10 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction.
Target Read
On clock edge 1 the PCI bus is idle because FRAMEN and IRDYN are both deasserted. Clock edge 2 represents the beginning of a new PCI transaction because FRAMEN is asserted. The PCI master provides a PCI memory address on AD[31:0] and a read command on CBEN[3:0].

Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. As soon as Usr_Adr_Valid is active, the PCI address needs to be decoded and Usr_Select asserted if the address belongs to the QL5032's memory space. Also, Usr_RdDecode must be asserted for any target read command appearing on Usr_CBE[3:0]. Usr_RdDecode may be mapped from any of the PCI read commands, which include memory read, memory read line, and memory read multiple for memory base addresses, and I/O read for I/O base addresses.

On clock edge 4 the target has asserted DEVSELN to claim the PCI transaction, since Usr_Select and Usr_RdDecode are seen asserted before this clock edge. TRDYN is deasserted, indicating a wait state inserted by the target.
Clock edges 5 and 6 are also wait states, as TRDYN is deasserted on the PCI bus. On the Interface Ports, the read data is provided to Cfg_RdData[31:0] so that it can be transferred to the PCI bus. On clock edge 7 the target has provided data on AD[31:0], and has asserted TRDYN to indicate that the data is valid. Since the master intends to read more than one register location, FRAMEN is seen still asserted. On the Interface Ports, Usr_Adr_Inc has been asserted to indicate that a data transfer is occurring on the PCI bus.

Clock edges 8 through 10 are wait states since TRDYN is deasserted. Since FRAMEN is deasserted on the PCI bus, only one more data transfer will take place. Prior to clock edge 10, the next double-word of read data must be presented to Cfg_RdData[31:0]. Clock edge 11 represents the last data transfer in this transaction. FRAMEN is deasserted, and both IRDYN and TRDYN are asserted. On the Interface Ports, Usr_Adr_Inc is active.

Clock edge 12 is the turn-around cycle on the PCI bus, which must occur at the end of each PCI transaction. All PCI control signals are deasserted. On the Interface Ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in this transaction.
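The address and command decode described for clock edge 3 might be sketched as follows. This is a minimal illustration rather than the QL5032 reference logic: the module name, the fixed BASE_HI window, and the purely combinational timing are assumptions, and a real design would compare against the base address programmed in configuration space. The command encodings are the standard PCI bus commands for memory accesses.

```verilog
// Sketch only: module name, BASE_HI, and combinational timing are assumptions.
module tgt_decode_sketch (Usr_Adr_Valid, Usr_Addr_WrData, Usr_CBE,
                          Usr_Select, Usr_RdDecode, Usr_WrDecode);
    input         Usr_Adr_Valid;    // high while address and command are presented
    input  [31:0] Usr_Addr_WrData;  // PCI address during the address phase
    input  [3:0]  Usr_CBE;          // PCI bus command during the address phase
    output        Usr_Select, Usr_RdDecode, Usr_WrDecode;

    parameter BASE_HI = 20'h80000;  // hypothetical 4 KB memory window at 0x80000000

    // Standard PCI bus command encodings for memory accesses
    parameter MEM_READ      = 4'b0110;
    parameter MEM_WRITE     = 4'b0111;
    parameter MEM_READ_MULT = 4'b1100;
    parameter MEM_READ_LINE = 4'b1110;
    parameter MEM_WRITE_INV = 4'b1111;

    wire hit    = (Usr_Addr_WrData[31:12] == BASE_HI);
    wire rd_cmd = (Usr_CBE == MEM_READ) || (Usr_CBE == MEM_READ_LINE) ||
                  (Usr_CBE == MEM_READ_MULT);
    wire wr_cmd = (Usr_CBE == MEM_WRITE) || (Usr_CBE == MEM_WRITE_INV);

    assign Usr_Select   = Usr_Adr_Valid & hit;     // address belongs to this device
    assign Usr_RdDecode = Usr_Adr_Valid & rd_cmd;  // any memory read command
    assign Usr_WrDecode = Usr_Adr_Valid & wr_cmd;  // any memory write command
endmodule
```

Whether these outputs must be held beyond the address phase depends on the PCI core's timing requirements, which this sketch does not attempt to model.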
Target Write
On clock edge 1, the PCI bus is idle because FRAMEN and IRDYN are both inactive. Clock edge 2 represents the beginning of a new PCI transaction, because FRAMEN is active after being inactive on the previous clock. The first PCI address for this transaction is present on AD[31:0] and the memory write command is present on CBEN[3:0].

Clock edge 3 is a wait state on the PCI bus because the target has not yet responded by asserting DEVSELN. The master has kept FRAMEN asserted to indicate that it intends to perform more than one data transfer. On the Interface Ports, Usr_Adr_Valid has been asserted, indicating that the PCI address and command are available on Usr_Addr_WrData[31:0] and Usr_CBE[3:0], respectively. As soon as Usr_Adr_Valid is active, the PCI address needs to be decoded and Usr_Select asserted if the address belongs to the QL5032's memory space. Also, Usr_WrDecode must be asserted for any target write command appearing on Usr_CBE[3:0].

On clock edge 4 DEVSELN is asserted, indicating that the QL5032 has claimed the transaction. TRDYN is deasserted, indicating a wait state inserted by the target. The data value from AD[31:0] and the byte enables from CBEN[3:0] are presented to the user logic on Usr_Addr_WrData[31:0] and Usr_CBE[3:0]. Usr_WrReq has been asserted. This signal will stay asserted for the duration of the PCI transaction, and may be used to turn on output enables if the data will be transmitted off-chip.

Clock edge 5 represents another wait state on the PCI bus because TRDYN is still deasserted. Usr_Write on the internal interface of the PCI core is asserted, indicating that a write to the target memory space is about to happen. On clock edge 6 both TRDYN and IRDYN are asserted, indicating that the data on AD[31:0] has been accepted by the QL5032. This event is reflected on the Interface Ports by Usr_Adr_Inc being asserted. Since FRAMEN is still asserted, at least one more data transfer will occur.

Clock edges 7 and 8 are wait states inserted by the PCI target in the QL5032, since TRDYN is deasserted. On clock edge 9 both IRDYN and TRDYN are asserted, indicating the target has accepted the second piece of data. Usr_Adr_Inc is asserted on the Interface Ports, indicating that the write to the addressed memory location should take place. Note that the first write occurred to memory location 100h on clock edge 6, and that the address for that location was provided at the beginning of the PCI transaction. For this second data transfer, it is implied that the write occurs to the next double-word in the memory space, at 104h.
For this reason, the address provided at the beginning of the PCI access must be stored and automatically incremented (by 4) each time Usr_Adr_Inc is active. Clock edge 10 represents the turn-around cycle on the PCI bus, as all control signals are deasserted and then tristated. On the interface ports, Usr_Last_Cycle_D1 is asserted to indicate that no more data transfers will occur in the transaction. Usr_WrReq is deasserted after clock edge 10.
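The store-and-increment requirement above can be sketched as a small address tracker. This is an assumption-laden illustration, not part of the reference design: the module name and the cur_addr output are hypothetical, the reset is assumed to be active high and asynchronous as in the DMA controller listing, and the data word is assumed to be written at cur_addr in the same cycle that Usr_Adr_Inc is high, before the increment takes effect.

```verilog
// Sketch only: module name and cur_addr are hypothetical.
module tgt_addr_track_sketch (PCI_clk, PCI_reset, Usr_Adr_Valid, Usr_Adr_Inc,
                              Usr_Addr_WrData, cur_addr);
    input         PCI_clk, PCI_reset;
    input         Usr_Adr_Valid;    // address phase: capture the start address
    input         Usr_Adr_Inc;      // data phase completed: step to next DWORD
    input  [31:0] Usr_Addr_WrData;  // carries the PCI address during the address phase
    output [31:0] cur_addr;         // byte address of the current data phase
    reg    [31:0] cur_addr;

    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset)
            cur_addr <= 32'h0;
        else if (Usr_Adr_Valid)
            cur_addr <= {Usr_Addr_WrData[31:2], 2'b00};  // DWORD-aligned start
        else if (Usr_Adr_Inc)
            cur_addr <= cur_addr + 32'd4;                // next double-word
endmodule
```

For the transaction in the waveform, cur_addr would hold 100h for the first data phase and 104h for the second.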
Previous to clock edge 1 Mst_Burst_Req and Mst_RdMode have been asserted, and a PCI address provided to the Mst_RdAd[31:0] interface port. The QL5032 responds by asserting its REQN output to request use of the PCI bus. At clock edge 2 the arbiter has granted the PCI bus to the master by asserting its GNTN input. At clock edge 3 the master is seen driving the PCI address and command on AD[31:0] and CBEN[3:0]. It is performing address stepping, so it has not yet asserted its FRAMEN signal. Clock edge 4 represents the beginning of a PCI transaction. FRAMEN is seen asserted and AD[31:0] and CBEN[3:0] are valid. Clock edge 5 is a wait-state inserted by the target because it has not yet claimed the transaction by asserting DEVSELN. The first double-word of data is read at clock edge 6 as the data is driven on AD[31:0] and both IRDYN and TRDYN are active. At clock edge 7 the second double-word of data is transferred on the PCI bus. On the interface ports, Mst_Xfer_D1 is active, indicating that data was transferred on the PCI bus on the previous clock, and that the read address should be incremented by the DMA controller. This value is shown on Mst_RdAd[31:0]. It is not used except at the very beginning of the PCI transaction, but must be kept current in case the DMA operation is interrupted and must be split across multiple PCI transactions. Clock edges 8 and 9 represent double-word transfers on the PCI bus. At clock edge 9 on the interface ports, the read data from the first data transfer is now present on Mst_RdData[31:0]. Mst_RdData_Valid is active to indicate that the data is valid.
Clock edges 10 through 13 represent more data transfers in this PCI transaction. At clock edge 14 Mst_Two_Reads is seen active, indicating that the DMA controller needs to perform two or fewer transfers. This causes FRAMEN to be deasserted on the PCI bus after clock edge 14. On clock edge 15 the last data transfer on the PCI bus occurs because FRAMEN is inactive while IRDYN and TRDYN are active. Mst_One_Read is active to indicate that the DMA controller needs to perform only one more transfer. This last transfer occurred on clock edge 15 on the PCI bus. Mst_Last_Cycle is active to indicate that this PCI transaction is ending. Clock edge 16 represents the turn-around cycle on the PCI bus. The read pipeline continues to be cleared through clocks 17 and 18, as data is present on Mst_RdData[31:0] and Mst_RdData_Valid is active.
Previous to clock edge 1 Mst_Burst_Req and Mst_WrMode have been asserted, and a PCI address provided to the Mst_WrAd[31:0] interface port. The QL5032 responds by asserting its REQN output to request use of the PCI bus. At clock edge 2 the arbiter has granted the PCI bus to the master by asserting its GNTN input. At clock edge 3 the master is seen driving the PCI address and command on AD[31:0] and CBEN[3:0]. It is performing address stepping, so it has not yet asserted its FRAMEN signal. Clock edge 4 represents the beginning of a PCI transaction. FRAMEN is seen asserted and AD[31:0] and CBEN[3:0] are valid. Clock edge 5 is a wait-state inserted by the target because it has not yet claimed the transaction by asserting DEVSELN. The first double-word of write data is presented on AD[31:0]. On the interface ports, a double-word of data is placed into the write pipeline since data is valid on Mst_WrData[31:0] and Mst_WrData_Valid and Mst_WrData_Rdy are active. Note that in the waveforms shown above, this is the third double-word that was written to the write pipeline. The first two double-words were transferred at an earlier time. On clock edge 6 the first double-word of data is transferred to the target, as IRDYN and TRDYN are both active. This causes Mst_Xfer_D1 to go active after clock edge 6. At clock edge 7 the second double-word of data is transferred on the PCI bus. On the interface ports, Mst_Xfer_D1 is active, indicating that data was transferred on the PCI bus on the previous clock, and that the write address should be incremented by the DMA controller. This value is shown on Mst_WrAd[31:0]. It is not used except at the very beginning of the PCI transaction, but must be kept current in case the DMA operation is interrupted and must be split across multiple PCI transactions.
Clock edges 8 and 9 represent double-word transfers on the PCI bus. On the interface ports, write data is sent into the write pipeline. Mst_WrData_Rdy is generated to acknowledge that data has been accepted, and that the next double-word of data can be sent. If there are wait-states on the PCI bus such that the write pipeline is full and is not ready to accept new data, Mst_WrData_Rdy will be inactive, as is the case on clock edge 6.
Clock edges 10 through 13 represent more data transfers in this PCI transaction. At clock edge 14 the last double-word of write data is transferred to the write pipeline. On clock edges 15 and 16 the last two data transfers on the PCI bus occur. Mst_Last_Cycle is active on clock edge 16 to indicate that this PCI transaction is ending. Clock edge 17 represents the turn-around cycle on the PCI bus.
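The write pipeline handshake described above (a double-word is accepted when Mst_WrData_Valid and Mst_WrData_Rdy are both active) might be served by back-end logic along these lines. This is only a sketch: the module name and fifo_pop are hypothetical, an incrementing pattern stands in for real FIFO data, and the exact cycle alignment expected by the PCI core is not modeled.

```verilog
// Sketch only: module name, fifo_pop, and the data pattern are assumptions.
module wr_source_sketch (PCI_clk, PCI_reset, Mst_WrData_Valid, Mst_WrData_Rdy,
                         Mst_WrData, fifo_pop);
    input         PCI_clk, PCI_reset;
    input         Mst_WrData_Valid;  // DMA controller is in its active write state
    input         Mst_WrData_Rdy;    // write pipeline can accept a double-word
    output [31:0] Mst_WrData;        // double-word offered to the write pipeline
    output        fifo_pop;          // high when the offered word is accepted
    reg    [31:0] Mst_WrData;

    assign fifo_pop = Mst_WrData_Valid & Mst_WrData_Rdy;

    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset)
            Mst_WrData <= 32'h0;
        else if (fifo_pop)
            Mst_WrData <= Mst_WrData + 32'h1;  // stand-in for the next FIFO word
endmodule
```

When the pipeline is full and Mst_WrData_Rdy is inactive, fifo_pop stays low and the same word remains on Mst_WrData[31:0].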
Inputs
Name                Description
PCI_reset           Asynchronous active-high reset. Buffered version of PCI Reset
PCI_clk             PCI Clock Signal
Usr_CBE[3:0]        Byte Enables for DataIn[31:0]
Usr_Ad[8:2]         PCI Address for DMA Register Writes
Usr_WrData[31:0]    PCI Data for DMA Register Writes
Mst_Last_Cycle      Signals Last Cycle of a PCI Master Burst is in progress
Mst_Tabort_Det      Target Abort Detected on a Master Transaction (Error Condition)
Mst_Xfer_D1         Master has successfully transferred a DWORD to the Target
Mst_TTO_Det         Target did not respond to Master Transaction (Error Condition)
Usr_Write           Signals valid data on Usr_WrData for a DMA Register Write
BusMstEn            Bus Mastering has been enabled in the Configuration Space
RdRdy               Ready for a Read Burst (could be Read FIFO empty)
WrRdy               Ready for a Write Burst (could be Write FIFO full)
LastWr              The last DWORD has been passed for a Write (Burst must end)
Mst_WrBurst_Done    Indicates that the PCI Write transaction has completed
Mst_RdBurst_Done    Indicates that the PCI Read transaction has completed
Mst_WrData_Rdy      Indicates that new data is being loaded for PCI Write
Outputs
Name                Description
Mst_RdAd[31:0]      Current Address for DMA Reads
Mst_WrAd[31:0]      Current Address for DMA Writes
Usr_RdData[31:0]    DMA Register output for PCI Target Reads
Mst_RdCmd[1:0]      Specifies type of read command to use on PCI Master Read Cycles
Mst_WrData_Valid    Ready to Write new Data
DMA_WrEn            DMA Write Enable Register and Bus Master Enabled
DMA_RdEn            DMA Read Enable Register and Bus Master Enabled
LocalEn             DMA Register Bit for Local Enable (not used in the controller)
LatCntEn            Register used to drive the Latency counter enable on the PCI core
Mst_WrMode          DMA Write is in progress
Mst_RdMode          DMA Read is in progress
Mst_Burst_Req       DMA Controller Requests the PCI bus
Mst_One_Read        One Read Remains in the DMA Read Operation
Mst_Two_Reads       Two Reads Remain in the DMA Read Operation
Description of Registers
The DMA Reference Design contains several registers with different functions. To aid in understanding the DMA Controller, and to allow edits to be made to change its functionality, a detailed description of the registers and their purposes is given in Table 1.

Table 1. DMA Controller registers

Register Name: RdCnt[7:0]
Description: Burst Read DWORD Counter for PCI Bursts.
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: WrCnt[7:0]
Description: Burst Write DWORD Counter for PCI Bursts.
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: DMAWrCnt[15:0]
Description: Current DWORD count remaining in the DMA Write Transaction. Loaded by the driver with the initial count, and decremented when new Write Data is written to the PCI core (Mst_WrData_Valid is high).
Default PCI Target Memory Mapped Address: 0x100 bits 15 to 0

Register Name: DMARdCnt[15:0]
Description: Current DWORD count remaining in the DMA Read Transaction. Loaded by the driver with the initial count, and decremented when a Read transaction occurs on the PCI Bus.
Default PCI Target Memory Mapped Address: 0x100 bits 31 to 16

Register Name: DMAWrBase[31:18] + DMAWrAdr[17:2]
Description: Current Write Address for PCI Write Transactions. The Start Address is loaded as a DWORD address, but read as a Byte Address. The Start Address must be aligned to the size of the DMA transfer. Incremented by 1 DWORD when new write data is written to the PCI core (Mst_WrData_Valid).
Default PCI Target Memory Mapped Address: 0x104

Register Name: DMARdBase[31:18] + DMARdAdr[17:2]
Description: Current Read Address for PCI Read Transactions. The Start Address is loaded as a DWORD address, but read as a Byte Address. The Start Address must be aligned to the size of the DMA transfer. Incremented by 1 DWORD when each PCI read transaction occurs (Mst_Xfer_D1).
Default PCI Target Memory Mapped Address: 0x108

Register Name: RdBCnt[7:0] (combinatorial value, not a register)
Description: Holds the next value to be loaded into RdCnt. The lower of the maximum burst length (Burst_Length) and the remaining Reads in the DMA (DMARdCnt).
Default PCI Target Memory Mapped Address: Not Applicable

Register Name: WrBCnt[7:0] (combinatorial value, not a register)
Description: Holds the next value to be loaded into WrCnt. The lower of the maximum burst length (Burst_Length) and the remaining Writes in the DMA (DMAWrCnt).
Default PCI Target Memory Mapped Address: Not Applicable
This first section starts with a timescale directive for the Simulator, and contains Verilog comments for the description of this file, as well as version information.
    module dmacntrl (
        PCI_reset,          // PCI Reset, active high & asynchronous
        PCI_clk,            // PCI Clk (33 MHz)
        Usr_CBE,            // Registered PCI CBE signals [3:0]
        Usr_Ad,             // PCI Target Address [8:2]
        Usr_WrData,         // Registered PCI Data Signals [31:0]
        Usr_RdDataIn,       // Data for User Reads not decoded within this block [31:0]
        Mst_Last_Cycle,     // High only on the last data phase of a master transfer
        Mst_Tabort_Det,     // Target aborting transfer this cycle
        Mst_Xfer_D1,        // Delayed XFER Detected on PCI
        Mst_TTO_Det,        // Target did not assert DEVSEL in time
        Usr_Write,          // Write Data on PCI pins addressed to DMA Registers
        BusMstEn,           // PCI Config Command Bit 2 (bus mastering enabled)
        RdRdy,              // Read FIFO has room for data from PCI
        WrRdy,              // Write FIFO is ready to send data to PCI
        LastWr,             // End of Packet Signal from FIFO
        Mst_WrBurst_Done,   // Write Pipeline is clear (including output register)
        Mst_RdBurst_Done,   // Read Pipeline is clear
        Mst_WrData_Rdy,     // Write Pipeline is ready for new data
        Mst_RdAd,           // For Target Reads of DMA registers [31:0]
        Mst_WrAd,           // For Target Reads of DMA registers [31:0]
        Usr_RdData,         // For Target Reads of DMA registers [31:0]
        Mst_RdCmd,          // Specified the PCI Read Command to Use [1:0]
        Mst_WrData_Valid,   // In active write state (not flushing pipeline)
        DMAWrEn,            // Software controlled enable for the write FIFO
        DMARdEn,            // Software controlled enable for the read FIFO
        LocalEn,            // Software controlled enable for back-end target accesses
        Mst_WrMode,         // DMA Burst State Machine is in a Write (PCI Write) state
        Mst_RdMode,         // DMA Burst State Machine is in a Read (PCI Read) state
        Mst_Burst_Req,      // Tells master to assert REQN
        Mst_One_Read,       // one read remains in burst
        Mst_Two_Reads,      // two reads remain in burst
        MstRdAd_Sel,        // Address for Master Read Address has been selected
        MstWrAd_Sel,        // Address for Master Write Address has been selected
        Mst_LatCntEn        // Enables Latency Counter for Master Transactions
    );

    input         PCI_reset, PCI_clk, Mst_Xfer_D1, Mst_Last_Cycle;
    input         BusMstEn, Usr_Write;
    input  [8:2]  Usr_Ad;
    input  [3:0]  Usr_CBE;
    input  [31:0] Usr_WrData, Usr_RdDataIn;
    input         Mst_Tabort_Det, Mst_TTO_Det, RdRdy, WrRdy;
    input         LastWr, Mst_WrBurst_Done, Mst_RdBurst_Done, Mst_WrData_Rdy;
    output        Mst_WrMode, Mst_RdMode, Mst_Burst_Req, Mst_WrData_Valid;
    output [31:0] Mst_RdAd, Mst_WrAd, Usr_RdData;
    output        DMAWrEn, DMARdEn, LocalEn, Mst_LatCntEn;
    output        Mst_One_Read, Mst_Two_Reads, MstRdAd_Sel, MstWrAd_Sel;
    output [1:0]  Mst_RdCmd;
This section contains the standard Verilog module declaration with a list of ports, followed by the port directions. Notice the way that bus ports are declared. Also notice that the module name matches the filename. This is a recommended practice when using QuickWorks.
    parameter Burst_Length = 255;   // set to desired PCI burst length (0-255)

    // DMA Burst State Machine (one-hot)
    parameter idle      = 5'b00001;
    parameter dma_wr    = 5'b00010;
    parameter dma_wr_wt = 5'b00100;
    parameter dma_rd    = 5'b01000;
    parameter dma_rd_wt = 5'b10000;

    parameter idle_bt      = 0;
    parameter dma_wr_bt    = 1;
    parameter dma_wr_wt_bt = 2;
    parameter dma_rd_bt    = 3;
    parameter dma_rd_wt_bt = 4;

    reg [4:0] DMASm, NxtDMASm;
This next section of parameters sets up the DMA Burst State Machine, which is located and described later in this file. The Burst_Length parameter may be set from 0 to the maximum value of the Burst_Cnt counter (255). It controls how many zero-wait-state transfers make up a PCI burst while executing the DMA operation. The DMA Burst State Machine has five states. The parameters declared here provide five values used to set the next state, and five bit indices used to check which state the state machine is in. Two regs are declared: DMASm and NxtDMASm. DMASm is a hardware register (made from flip-flops), while NxtDMASm is only a reg in Verilog terms, used to prepare the next state of the state machine. The DMA state machine is set up as a one-hot state machine (one bit high per state), so in the idle state only bit 0 is high.
    wire         PCI_reset, PCI_clk;
    wire [8:2]   Usr_Ad;
    wire [3:0]   Usr_CBE;
    wire [31:0]  Usr_WrData, Mst_RdAd, Mst_WrAd, DMACntReg, DMACtrlStat;
    wire [15:0]  DMARdCnt, DMAWrCnt;
    wire [7:0]   Burst_Cnt, RdBCnt, WrBCnt, Init_BCnt;
    wire         BCnt_eq_1, BCnt_eq_2, BCnt_eq_3, Mst_One_Read, Mst_Two_Reads;
    wire         Usr_Write, PCIWrEn, PCIRdEn, Mst_RdMode;
    wire         RdRdy, WrRdy, LastWr, Ld_BCnt, LdWrCnt, LdRdCnt;
    wire         LdRdAdrCnt, LdWrAdrCnt, DecDMARdCnt, Dec_BCnt, IncWrAdr, IncRdAdr;

    reg  [31:0]  Usr_RdData;
    reg  [29:16] DMARdBase, DMAWrBase;
    reg  [1:0]   Mst_RdCmd;
    reg          Mst_WrData_Valid, MstRdAd_Sel, MstWrAd_Sel, WrCnt0, RdCnt0;
    reg          DMAWrEn, DMARdEn, DMAWrErr, DMARdErr, LocalEn, Mst_LatCntEn;
    reg          WrCtrl, WrRdAdr, WrWrAdr, WrDMACnt, Mst_WrMode, Mst_Burst_Req, DecDMAWrCnt;
This next section only consists of required wire and reg declarations according to the Verilog language syntax. All signals used in the file need to be declared as a reg or as a wire. See a Verilog reference book if you want to know more about these kinds of declarations.
    // DMA Register Read Addressing
    always @(Usr_Ad or DMACtrlStat or DMACntReg or Usr_RdDataIn)
        casex (Usr_Ad)
            7'b1xxx000: Usr_RdData <= DMACntReg;    // address 0x100
            7'b1xxx011: Usr_RdData <= DMACtrlStat;  // address 0x10C
            default:    Usr_RdData <= Usr_RdDataIn;
        endcase

    always @(Usr_Ad) begin
        MstWrAd_Sel <= 1'b0;
        MstRdAd_Sel <= 1'b0;
        casex (Usr_Ad)
            7'b1xxx001: MstWrAd_Sel <= 1'b1;
            7'b1xxx010: MstRdAd_Sel <= 1'b1;
        endcase
    end
The DMA Register Read Addressing section controls the outputs associated with PCI Target Reads of the DMA Registers. The first always block routes the correct register to the output (Usr_RdData) based on the current address (Usr_Ad[8:2]). If neither the Control/Status register nor the Read Count/Write Count register is addressed, the DMA block routes the Usr_RdDataIn bus to the output. The designer can therefore decode other addresses and provide data to the Usr_RdDataIn input port of the DMA block if additional Target Read addressing is required. A casex statement is used so that the x's in the specified address are treated as don't-cares, which minimizes the logic. Because each don't-care pattern matches several addresses, care should be taken if you wish to add additional addresses in the 0x100-0x1FF range (i.e., the x's should be changed to 0's).

The second always block in this section sets the two outputs called MstWrAd_Sel and MstRdAd_Sel, based on the PCI Target Read Address. These outputs tell the PCI core to respond to the Target Read request with the Master Write or Master Read Address, which are present on the PCI32 input ports called Mst_WrAd and Mst_RdAd, respectively. In this reference design, these outputs are mapped to the addresses 0x104 and 0x108. For more information, see the discussion of these ports in the PCI core technical description.
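As an illustration of the caution about don't-care bits, the sketch below fully decodes the existing registers so that an additional register can be read at 0x120. With the original 7'b1xxx000 pattern, a read of 0x120 would return DMACntReg instead. The module name, the ExtraReg input, and the choice of address 0x120 are hypothetical and are not part of the reference design.

```verilog
// Sketch only: ExtraReg and address 0x120 are hypothetical additions.
module rd_mux_sketch (Usr_Ad, DMACntReg, DMACtrlStat, ExtraReg,
                      Usr_RdDataIn, Usr_RdData);
    input  [8:2]  Usr_Ad;
    input  [31:0] DMACntReg, DMACtrlStat, ExtraReg, Usr_RdDataIn;
    output [31:0] Usr_RdData;
    reg    [31:0] Usr_RdData;

    always @(Usr_Ad or DMACntReg or DMACtrlStat or ExtraReg or Usr_RdDataIn)
        casex (Usr_Ad)
            7'b1000000: Usr_RdData = DMACntReg;     // address 0x100 only
            7'b1000011: Usr_RdData = DMACtrlStat;   // address 0x10C only
            7'b1001000: Usr_RdData = ExtraReg;      // address 0x120 (hypothetical)
            default:    Usr_RdData = Usr_RdDataIn;  // everything else
        endcase
endmodule
```

The cost of the full decode is a few more product terms per register, which is the logic the don't-care patterns were saving.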
    // DMA Register Write Addressing
    always @(posedge PCI_clk or posedge PCI_reset) begin
        if (PCI_reset) begin
            WrCtrl   <= 0;
            WrRdAdr  <= 0;
            WrWrAdr  <= 0;
            WrDMACnt <= 0;
        end else begin
            WrCtrl   <= 0;
            WrRdAdr  <= 0;
            WrWrAdr  <= 0;
            WrDMACnt <= 0;
            if (Usr_Write)
                casex (Usr_Ad)
                    7'b1xxx011: WrCtrl   <= 1;  // address 0x10C
                    7'b1xxx010: WrRdAdr  <= 1;  // address 0x108
                    7'b1xxx001: WrWrAdr  <= 1;  // address 0x104
                    7'b1xxx000: WrDMACnt <= 1;  // address 0x100
                endcase
        end
    end
This always block, titled DMA Register Write Addressing, creates the internal control signals that are used to write new data to the DMA Registers. WrCtrl is used to load the Control/Status Register, WrRdAdr is used to load a new Read Start Address, WrWrAdr is used to load a new Write Start Address, and WrDMACnt is used to load new DMA Read and Write Counts. A casex statement is used so that the x's in the specified address are treated as don't-cares, which minimizes the logic. The write/load control signals created by this block are registered, since they will have a higher fanout.
    // Load the upper (static) portion of the Write Address
    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset)
            DMAWrBase[29:16] <= 18'b0;
        else begin
            if (WrWrAdr && !Usr_CBE[3]) DMAWrBase[29:24] <= Usr_WrData[29:24];
            if (WrWrAdr && !Usr_CBE[2]) DMAWrBase[23:16] <= Usr_WrData[23:16];
        end

    // Load the Upper (Static) portion of the Read Address
    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset)
            DMARdBase[29:16] <= 18'b0;
        else begin
            if (WrRdAdr && !Usr_CBE[3]) DMARdBase[29:24] <= Usr_WrData[29:24];
            if (WrRdAdr && !Usr_CBE[2]) DMARdBase[23:16] <= Usr_WrData[23:16];
        end
This area of the DMA Controller is responsible for loading new values into the Start Address Registers (both Read and Write). Also, the lower portion of the Read and Write Address Registers are also counters, which count up by 4 each time data is transferred on the PCI bus (32-bits of data = 4 bytes). The first always block loads the upper bits of the Write Address, which will not change during the DMA Write operation. These are loaded into a Register called DMAWrBase, which is later mapped to the Mst_WrAd output port. The second always block does the exact same function, but for the Read Address. See how the correct byte enables (Usr_CBE) are checked before writing the base addresses.
    // Instantiate address incrementers
    assign IncRdAdr   = Mst_Xfer_D1 & Mst_RdMode;
    assign IncWrAdr   = Mst_Xfer_D1 & Mst_WrMode;
    assign LdRdAdrCnt = (WrRdAdr & !Usr_CBE[0] & !Usr_CBE[1]);
    assign LdWrAdrCnt = (WrWrAdr & !Usr_CBE[0] & !Usr_CBE[1]);

    ucount16 RdAdrReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(IncRdAdr),
                       .LOAD(LdRdAdrCnt), .D(Usr_WrData[15:0]), .Q(Mst_RdAd[17:2]));
    ucount16 WrAdrReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(IncWrAdr),
                       .LOAD(LdWrAdrCnt), .D(Usr_WrData[15:0]), .Q(Mst_WrAd[17:2]));

    assign Mst_RdAd[31:18] = DMARdBase[29:16];
    assign Mst_RdAd[1:0]   = 2'h0;
The first four assign statements set up the control signals for loading and incrementing the Read and Write Address Counters. The IncRdAdr signal enables the Read Address Counter to increment. This signal goes active when data transfers on the PCI bus during Master Mode while the DMA Controller is in a read state. The IncWrAdr signal performs the same function for the Write Address Counter. The LdRdAdrCnt and LdWrAdrCnt signals are used to load the Read and Write Address Counters. They are generated from the WrRdAdr and WrWrAdr signals explained earlier, qualified with the appropriate byte enables.

The next two ucount16 instantiations are the read and write address counters. The PCI data bits 15 to 0 (Usr_WrData[15:0]) are mapped to the address bits 17 to 2 (Mst_WrAd[17:2] or Mst_RdAd[17:2]). This is because the Read and Write Addresses are loaded as DWORD addresses, but read as byte addresses. Since the lowest bit of the counters is bit 2 of the output addresses, the addresses increment by 4 on each count.

The final assign statements in this section map the base addresses (DMARdBase and DMAWrBase) to the upper bits of the read and write address busses. Also, the lower two bits of the Read and Write Addresses are assigned to be always 0, since all the PCI transfers occur on DWORD address boundaries, although the address is represented as a byte address.
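The behavior the text relies on from the address counters can be approximated with the behavioral model below. It is a guess at the function of ucount16, not its actual implementation: the asynchronous clear and the priority of LOAD over EN are assumptions, and the module is deliberately given a different name.

```verilog
// Sketch only: a behavioral stand-in for ucount16; clear style and
// LOAD-over-EN priority are assumptions.
module ucount16_sketch (CLR, CLK, EN, LOAD, D, Q);
    input         CLR, CLK, EN, LOAD;
    input  [15:0] D;   // DWORD address written by software (Usr_WrData[15:0])
    output [15:0] Q;   // drives address bits [17:2]
    reg    [15:0] Q;

    always @(posedge CLK or posedge CLR)
        if (CLR)
            Q <= 16'h0;
        else if (LOAD)
            Q <= D;          // load the start address
        else if (EN)
            Q <= Q + 16'h1;  // one DWORD, i.e. 4 bytes on the address bus
endmodule
```

With this model, loading D = 16'h0040 presents byte address 0x100 on address bits [17:0] (0x0040 shifted left by two), and each enabled count advances the byte address by 4.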
    // Instantiate Down Counters for DMA Read and Write Count
    assign LdWrCnt     = WrDMACnt & !Usr_CBE[0] & !Usr_CBE[1];
    assign LdRdCnt     = WrDMACnt & !Usr_CBE[2] & !Usr_CBE[3];
    assign DecDMARdCnt = IncRdAdr;

    always @(posedge PCI_clk)
        DecDMAWrCnt <= Mst_WrData_Rdy;

    dcount16 RdCntReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(DecDMARdCnt),
                       .LOAD(LdRdCnt), .D(Usr_WrData[31:16]), .Q(DMARdCnt[15:0]));
    dcount16 WrCntReg (.CLR(PCI_reset), .CLK(PCI_clk), .EN(DecDMAWrCnt),
                       .LOAD(LdWrCnt), .D(Usr_WrData[15:0]), .Q(DMAWrCnt[15:0]));

    assign DMACntReg = {DMARdCnt, DMAWrCnt};
This section deals with the loading and decrementing of the DMA Count Registers. They are loaded with Target Writes to address 0x100. The upper 16 bits are loaded into the Read Counter and the lower 16 bits are loaded into the Write Counter. LdRdCnt and LdWrCnt handle the loading of these two counters. The appropriate byte enables must be active when writing to these registers, as you can see in the first two assign statements.

The third assign statement and the following always block create the decrement signals for the two counters. The Read Counter is decremented with the same signal that increments the Read Address Counter, IncRdAdr. The Write Counter, however, is decremented differently. The Write Address Counter is incremented only when transactions occur on the PCI bus, because the address must always be kept current with the PCI bus transaction. Write data, however, is considered committed once it is sent to the PCI core from the back end. Therefore, the decrement for the Write Counter uses the signal from the PCI core that indicates it requires new valid data on the current clock cycle, Mst_WrData_Rdy, registered for one clock as DecDMAWrCnt. See the description of the PCI32 module for more information on this signal. Two 16-bit down counters are instantiated for the Read Counter and Write Counter, respectively.
    // DMA Control Status Register Write
    always @(posedge PCI_clk or posedge PCI_reset) begin
        if (PCI_reset) begin
            DMAWrEn      <= 0;
            DMARdEn      <= 0;
            DMAWrErr     <= 0;
            DMARdErr     <= 0;
            LocalEn      <= 0;
            Mst_LatCntEn <= 0;
            Mst_RdCmd    <= 2'b01;  // 01 = Memory Read
        end else begin
            // default values
            DMAWrErr     <= DMAWrErr;
            DMARdErr     <= DMARdErr;
            DMAWrEn      <= DMAWrEn;
            DMARdEn      <= DMARdEn;
            LocalEn      <= LocalEn;
            Mst_LatCntEn <= Mst_LatCntEn;
            Mst_RdCmd    <= Mst_RdCmd;
            if (WrCtrl && !Usr_CBE[3]) begin
                Mst_LatCntEn <= Usr_WrData[31];
                LocalEn      <= Usr_WrData[30];
                Mst_RdCmd    <= Usr_WrData[27:26];
                DMARdErr     <= Usr_WrData[25];
                DMARdEn      <= Usr_WrData[24];
            end
            if (WrCtrl && !Usr_CBE[1]) begin
                DMAWrErr <= Usr_WrData[9];
                DMAWrEn  <= Usr_WrData[8];
            end
            if ((Mst_Tabort_Det || Mst_TTO_Det) && Mst_RdMode) begin
                DMARdEn  <= 0;
                DMARdErr <= 1;
            end
            if (RdCnt0 && DMARdEn == 1)
                DMARdEn <= 0;
            if ((Mst_Tabort_Det || Mst_TTO_Det) && Mst_WrMode) begin
                DMAWrEn  <= 0;
                DMAWrErr <= 1;
            end
            if (WrCnt0 && DMAWrEn == 1 && Mst_WrBurst_Done)
                DMAWrEn <= 0;
        end
    end

    assign DMACtrlStat = {Mst_LatCntEn, LocalEn, 2'h0,
                          Mst_RdCmd[1:0], DMARdErr, DMARdEn,
                          8'h00, 4'h0,
                          2'h0, DMAWrErr, DMAWrEn,
                          8'h00};
This always block sets and initializes each bit in the Control/Status DMA Register. The bits included in this register are:

Register          Bits     Function
LocalEn           30       Unused in DMA Controller. Can be used as a local chip enable.
Mst_LatCntEn      31       Used to enable or disable the Master Latency Counter. PCI
                           compliance requires this bit to be enabled, but embedded
                           systems may clear this bit for better PCI performance. See
                           the PCI32 block description for more information.
Mst_RdCmd[1:0]    27:26    Used to select which PCI Read command is used in a Read DMA:
                           0x (00 or 01) = Memory Read
                           10 = Memory Read Line
                           11 = Memory Read Multiple
                           See the PCI32 block description for more information.
DMARdErr          25       Set when a Read DMA is interrupted by a Target Abort or
                           Target Time Out. Cleared by writing a 0 to bit 25.
DMARdEn           24       Write a 1 to this bit to begin a Read DMA. It is reset when
                           the DMA completes or an error occurs.
DMAWrErr          9        Set when a Write DMA is interrupted by a Target Abort or
                           Target Time Out. Cleared by writing a 0 to bit 9.
DMAWrEn           8        Write a 1 to this bit to begin a Write DMA. It is reset when
                           the DMA completes or an error occurs.
The final assign statement merges these bits into the DMA Control/Status Register.
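To make the bit layout concrete, the following simulation-only sketch composes the 32-bit value that a PCI master would write to the Control/Status register at 0x10C. The module and function names are hypothetical; the field positions follow the DMACtrlStat concatenation in the listing, and both error bits are written as 0, which also clears any earlier error.

```verilog
// Sketch only: module and function names are illustrative assumptions.
module dma_ctrl_word_demo;
    // Builds the value written to the Control/Status register (0x10C).
    function [31:0] ctrl_word;
        input       lat_en;    // bit 31, Mst_LatCntEn
        input       local_en;  // bit 30, LocalEn
        input [1:0] rd_cmd;    // bits 27:26, Mst_RdCmd
        input       rd_en;     // bit 24, DMARdEn
        input       wr_en;     // bit 8, DMAWrEn
        begin
            ctrl_word = {lat_en, local_en, 2'b00, rd_cmd, 1'b0, rd_en,
                         8'h00, 4'h0, 2'b00, 1'b0, wr_en, 8'h00};
        end
    endfunction

    initial begin
        // Read DMA using Memory Read Line, latency counter enabled
        $display("%h", ctrl_word(1'b1, 1'b0, 2'b10, 1'b1, 1'b0));  // prints 89000000
        // Write DMA, Memory Read command left at 01, latency counter enabled
        $display("%h", ctrl_word(1'b1, 1'b0, 2'b01, 1'b0, 1'b1));  // prints 84000100
    end
endmodule
```

Per the always block above, bits 31:24 are only updated when Usr_CBE[3] is low and bits 9:8 only when Usr_CBE[1] is low, so a full 32-bit write with all byte enables active updates every field at once.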
    // Create read/write enable control for state machine
    assign PCIWrEn = DMAWrEn && BusMstEn;
    assign PCIRdEn = DMARdEn && BusMstEn;

    // Current Burst Count Decodes
    assign BCnt_eq_3 = (Burst_Cnt == 3);
    assign BCnt_eq_2 = (Burst_Cnt == 2);
    assign BCnt_eq_1 = (Burst_Cnt == 1);
This section of the DMA Controller marks the boundary between the DMA register setup and the DMA Burst State Machine. The first two assignments set up the main enable signals, PCIWrEn and PCIRdEn, which are used in the DMA Burst State Machine. These equations AND the DMA Control Register bits that enable the DMA transfers, DMAWrEn and DMARdEn, with the PCI Configuration Space bit that enables Master Transfers, BusMstEn. The next three assign statements are decodes from the Burst Counter. These decodes are used within the DMA Burst State Machine and for creating outputs. The Burst Counter keeps track of how many transfers remain in the current DMA Burst, which is limited in length by the parameter Burst_Length.
// Check for end of DMA
assign RdBCnt = (DMARdCnt > Burst_Length) ? Burst_Length : DMARdCnt[7:0];
assign WrBCnt = (DMAWrCnt > Burst_Length) ? Burst_Length : DMAWrCnt[7:0];

always @(posedge PCI_clk)
begin
  if (RdBCnt == 0) RdCnt0 <= 1;
  else RdCnt0 <= 0;
  if (WrBCnt == 0) WrCnt0 <= 1;
  else WrCnt0 <= 0;
end
The first two assign statements create the next value to be loaded into the Burst Counter. It is separated into a Read Burst Count (RdBCnt) and a Write Burst Count (WrBCnt). If the current DMA Counter is larger than the Burst_Length parameter, then the Burst_Length parameter is used. Otherwise, the Burst Count is loaded from the remaining value in the DMA Counter. The always block creates two signals, RdCnt0 and WrCnt0, which go active to indicate that the Read and Write DMA transfers are complete.
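The clamping described above can be tried in isolation with a small stand-alone module. This is only a sketch of the read side: the module name burst_clamp_sketch, the CNT_WIDTH parameter, and the default Burst_Length of 64 are assumptions made for this example, and only the signal names and the two expressions are taken from the listing.

```verilog
// Minimal sketch, not the shipped DMA controller code.
// Assumptions: 16-bit DMA counter, Burst_Length of 64 transfers.
module burst_clamp_sketch #(
    parameter       CNT_WIDTH    = 16,     // assumed width of the DMA counter
    parameter [7:0] Burst_Length = 8'd64   // assumed maximum transfers per burst
) (
    input  wire                 PCI_clk,
    input  wire [CNT_WIDTH-1:0] DMARdCnt,  // transfers remaining in the Read DMA
    output wire [7:0]           RdBCnt,    // next value to load into the Burst Counter
    output reg                  RdCnt0     // high when no read transfers remain
);
    // Use Burst_Length while more than one burst's worth remains,
    // otherwise use whatever is left in the DMA counter.
    assign RdBCnt = (DMARdCnt > Burst_Length) ? Burst_Length : DMARdCnt[7:0];

    // Registered "count is zero" flag, updated on each PCI clock edge.
    always @(posedge PCI_clk)
        RdCnt0 <= (RdBCnt == 8'd0);
endmodule
```

With these assumed values, a DMARdCnt of 100 gives RdBCnt = 64, a DMARdCnt of 20 gives RdBCnt = 20, and a DMARdCnt of 0 raises RdCnt0 on the following PCI clock edge.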
// DMA State Machine
always @ (DMASm or WrRdy or PCIWrEn or BCnt_eq_1 or BCnt_eq_2 or LastWr or
          RdRdy or PCIRdEn or WrCnt0 or RdCnt0 or Mst_RdBurst_Done or
          Mst_Last_Cycle or DMAWrErr or DMARdErr or Mst_WrBurst_Done or Mst_Xfer_D1)
begin : StateEqns
  // default values to prevent loops
  NxtDMASm = idle;
  Mst_Burst_Req = 0;
  Mst_WrData_Valid = 0;

  if (DMASm[idle_bt]) begin
    // Back end ready, software enabled, and >0 transfers remain
    if (WrRdy && PCIWrEn && !WrCnt0)
      NxtDMASm = dma_wr;
    else if (RdRdy && PCIRdEn && !RdCnt0)
      NxtDMASm = dma_rd;
    else
      NxtDMASm = idle;
  end

  if (DMASm[dma_rd_bt]) begin
    if ((BCnt_eq_1 || (BCnt_eq_2 && Mst_Xfer_D1)) && Mst_Last_Cycle)
      // if 1 left, and we are in the last transfer cycle
      NxtDMASm = dma_rd_wt; // go to tx wait state to wait for last transfer
    else if (DMARdErr)
      NxtDMASm = idle;
    else
      NxtDMASm = dma_rd;
    Mst_Burst_Req = 1'b1;
  end

  if (DMASm[dma_rd_wt_bt]) begin
    // wait for the read pipeline to clear
    if (Mst_RdBurst_Done || DMARdErr)
      NxtDMASm = idle;
    else
      NxtDMASm = dma_rd_wt;
  end

  if (DMASm[dma_wr_bt]) begin
    if (BCnt_eq_1 || LastWr)
      NxtDMASm = dma_wr_wt;
    else if (DMAWrErr)
      NxtDMASm = idle;
    else
      NxtDMASm = dma_wr;
    Mst_Burst_Req = 1'b1;
    Mst_WrData_Valid = 1'b1;
  end

  if (DMASm[dma_wr_wt_bt]) begin
    if (!DMAWrErr && !Mst_WrBurst_Done) begin
      NxtDMASm = dma_wr_wt;
      Mst_Burst_Req = 1;
    end
    else begin
      NxtDMASm = idle;
      Mst_Burst_Req = 0;
    end
  end
end

// State registers
always @(posedge PCI_clk or posedge PCI_reset)
  if (PCI_reset) DMASm <= idle;
  else DMASm <= NxtDMASm;
This is the DMA Burst State Machine. The first always block determines the value of the next state (NxtDMASm), based on the value of the current state (DMASm). The second always block transfers the next state value to the current state on the rising edge of the PCI clock. The DMA Burst State Machine consists of 5 states. The simplified state diagram is shown below.
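[Figure: simplified state diagram of the DMA Burst State Machine, with states idle, rd, rd_wt, wr, and wr_wt. Transitions: idle to wr on WrRdy and PCIWrEn and not(WrCnt0); idle to rd on RdRdy and PCIRdEn and not(RdCnt0); wr to wr_wt on BCnt_eq_1 or LastWr; rd to rd_wt on (BCnt_eq_1 or (BCnt_eq_2 and Mst_Xfer_D1)) and Mst_Last_Cycle; wr to idle on DMAWrErr; rd to idle on DMARdErr; wr_wt to idle on DMAWrErr or Mst_WrBurst_Done; rd_wt to idle on DMARdErr or Mst_RdBurst_Done.]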
IDLE state: This is the default state. When PCI_reset is asserted, this is the state to which the state machine initializes. If the PCI driver has enabled Master DMA Write transfers (PCIWrEn), the local design is ready to supply write data (WrRdy), and the DMA Write Counter is not zero (!WrCnt0), then the state machine moves to the WRITE state. If these conditions are false, but the equivalent conditions hold for a Read DMA transfer (PCIRdEn, RdRdy, and !RdCnt0), then the state machine moves to the READ state. As coded in the listing above, Write DMA therefore has priority. In some applications it may be preferable for Read DMA to have priority. Changing the priority simply involves reversing the order of the if statements within this state.

READ state: In this state, the Mst_Burst_Req signal is asserted, which indicates to the PCI core that a PCI burst is requested (this causes REQ# to be asserted on the PCI bus). If a DMA Read Error (DMARdErr) is detected, then the state machine moves into the IDLE state. If there is one read transfer left and the PCI core signals that the last transfer cycle has started (Mst_Last_Cycle), then the state machine moves into the READ_WAIT state. To detect that one read remains, the state machine looks at BCnt_eq_1 (a decode of the Burst Counter which indicates that one transfer remains), or BCnt_eq_2 AND Mst_Xfer_D1. The second term (BCnt_eq_2 && Mst_Xfer_D1) is needed because, when the PCI Core is bursting read transfers at zero wait states, the Burst Counter is always one cycle behind the PCI bus.

READ_WAIT state: In this state, the state machine waits for one of two events to occur. If a DMA Read Error (DMARdErr) occurs, then the state machine moves back to the IDLE state. Otherwise, the state machine waits until the Mst_RdBurst_Done signal goes active, which indicates that the last read data for the PCI Burst has been read from the PCI bus and has been passed to the back end. This frees up the PCI core for a new DMA operation, so the state machine moves into the IDLE state.

WRITE state: In this state, the Mst_Burst_Req signal is asserted, which indicates to the PCI core that a PCI burst is requested (this causes REQ# to be asserted on the PCI bus). The Mst_WrData_Valid signal is also asserted while in this state. This tells the PCI core that valid write data is now ready to be transferred to the PCI Core. If a DMA Write Error (DMAWrErr) is detected, then the state machine moves into the IDLE state. If there is one write transfer left in the Burst Counter (BCnt_eq_1), or the back end indicates that the Last Write (LastWr) data is now being sent to the PCI core, then the state machine moves to the WRITE_WAIT state.

WRITE_WAIT state: If a DMA Write Error (DMAWrErr) occurs, or the PCI core indicates that the last write data has been written to the PCI bus (Mst_WrBurst_Done), then the state machine moves from this state into the IDLE state. Otherwise, the state machine waits here for the final data elements in the PCI core write pipeline to be written to the PCI bus. This allows the PCI Core to free up its internal datapath before moving to a new PCI burst transfer. One important point in this state involves the Mst_Burst_Req signal, which must be kept active until the Mst_WrBurst_Done signal is detected. The reason is that the PCI Target may ask for a transaction to be retried, so the DMA Controller must continue to request burst transfers on the PCI bus until the last data has transferred. Also, once Mst_WrBurst_Done has been detected, the Mst_Burst_Req signal must be set to 0 on the same clock cycle. It is important not to request a new PCI burst transaction at that point, because all DMA operations may be complete.
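The state machine listing tests individual bits of DMASm (for example DMASm[idle_bt]) and assigns whole-vector constants such as idle, which suggests a one-hot encoding. The declarations themselves are not shown in this section, so the following is only one plausible arrangement: the bit positions, the 5-bit width, and the module name dma_state_encoding_sketch are assumptions, and only the identifier names come from the listing.

```verilog
// Hypothetical one-hot state encoding for the DMA Burst State Machine.
// Bit positions and vector width are assumptions for illustration only.
module dma_state_encoding_sketch;
    // Bit index of each state within the state vector
    parameter idle_bt      = 0,
              dma_rd_bt    = 1,
              dma_rd_wt_bt = 2,
              dma_wr_bt    = 3,
              dma_wr_wt_bt = 4;

    // One-hot state values: exactly one bit set per state
    parameter [4:0] idle      = 5'b00001,
                    dma_rd    = 5'b00010,
                    dma_rd_wt = 5'b00100,
                    dma_wr    = 5'b01000,
                    dma_wr_wt = 5'b10000;

    reg [4:0] DMASm;     // current state
    reg [4:0] NxtDMASm;  // next state
endmodule
```

With an encoding of this kind, each state test reduces to a single-bit check, which is consistent with the way the listing uses one if statement per state bit rather than a case statement.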
// DMA Burst Counter
assign Init_BCnt = (NxtDMASm[dma_wr_bt]) ? WrBCnt : RdBCnt;
assign Ld_BCnt = DMASm[idle_bt];
assign Dec_BCnt = (DMASm[dma_rd_bt] & Mst_Xfer_D1) || Mst_WrData_Rdy;
This area of the DMACNTL.V file describes the interface with the Burst Counter, which uses the Burst_Cnt bus as its output. The loading of new Burst Counts (controlled by Ld_BCnt) happens while the DMA Controller is in the IDLE state (DMASm[idle_bt]). The Init_BCnt bus holds the data to be loaded into the Burst Counter. It is chosen between the read (RdBCnt) and write (WrBCnt) counts by looking at the next state of the DMA Burst State Machine (NxtDMASm[dma_wr_bt]). The Burst Counter is decremented whenever a PCI transfer occurs (Mst_Xfer_D1) while in the READ state (DMASm[dma_rd_bt]), or whenever Mst_WrData_Rdy is asserted by the PCI Core (which happens when it is ready for new write data while the DMA Controller is in a WRITE state). Because the Burst Counter is decremented in read mode by the Mst_Xfer_D1 signal (which indicates that a PCI transfer occurred on the previous clock), the Burst Counter will be one count behind on reads while a zero-wait-state burst is occurring on the bus.
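The Burst Counter itself is not listed in this section. As a rough illustration of how the three control signals above could drive it, the following sketch implements a loadable down-counter. The module name burst_counter_sketch, the 8-bit width, the priority of load over decrement, and the choice to hold at zero are all assumptions made for this example, not details confirmed by the QL5032 source.

```verilog
// Hypothetical loadable down-counter driven by Ld_BCnt, Init_BCnt and Dec_BCnt.
// Width, priority and hold-at-zero behavior are assumptions.
module burst_counter_sketch (
    input  wire       PCI_clk,
    input  wire       PCI_reset,
    input  wire       Ld_BCnt,     // load enable (active in the IDLE state)
    input  wire       Dec_BCnt,    // decrement enable (one transfer completed)
    input  wire [7:0] Init_BCnt,   // value to load: RdBCnt or WrBCnt
    output reg  [7:0] Burst_Cnt    // transfers remaining in the current burst
);
    always @(posedge PCI_clk or posedge PCI_reset)
        if (PCI_reset)
            Burst_Cnt <= 8'd0;
        else if (Ld_BCnt)
            Burst_Cnt <= Init_BCnt;            // load takes priority
        else if (Dec_BCnt && (Burst_Cnt != 8'd0))
            Burst_Cnt <= Burst_Cnt - 8'd1;     // count down, never below zero
endmodule
```

Because Ld_BCnt is tied to the IDLE state, a counter built this way is reloaded on every clock spent in IDLE and only starts counting once the state machine has left that state.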
// DMA Burst State Machine Outputs
assign Mst_One_Read = (BCnt_eq_1 | (BCnt_eq_2 && Mst_Xfer_D1)) && DMASm[dma_rd_bt];
assign Mst_Two_Reads = (BCnt_eq_2 | (BCnt_eq_3 && Mst_Xfer_D1)) && DMASm[dma_rd_bt];
assign Mst_RdMode = DMASm[dma_rd_bt] | DMASm[dma_rd_wt_bt];

// Pre-decoded Mst_WrMode
always @(posedge PCI_clk or posedge PCI_reset)
  if (PCI_reset) Mst_WrMode <= 1'b0;
  else Mst_WrMode <= NxtDMASm[dma_wr_bt] | NxtDMASm[dma_wr_wt_bt];

endmodule
The last section of the DMACNTL module makes up the outputs of the DMA Controller, which are generated from the DMA State Machine and the Burst Counter. Mst_One_Read indicates that one read remains in the DMA Burst Read operation. Mst_Two_Reads indicates that two reads remain in the DMA Burst Read operation. The PCI Core needs these signals to properly end zero-wait-state burst reads. The Mst_RdMode and Mst_WrMode signals indicate to the PCI core whether the Burst operation is a read or a write. The Mst_WrMode signal is pre-decoded and registered in order to improve cycle time.
These port and signal declarations define the input and output ports of CFGTADDR that relate to the PCI configuration space.

CBE[3:0]          Byte enables, used during PCI configuration writes.
WrData[31:0]      Write data, used during PCI configuration writes.
Cfg_Write         Configuration write-enable, active during PCI configuration write transactions.
PCI_clock         PCI clock.
MstPERR_Det       Parity error detected by the QL5032 bus master. When this signal is active, bit 8 of the Status register must be set.
PERR_Det          Parity error detected. When this signal is active, bit 15 of the Status register must be set.
TTO_Det           Received Master Abort. When this signal is active, bit 13 of the Status register must be set.
PCI_reset         PCI system reset.
SERR_Sig          Signalled system error. When this signal is active, bit 14 of the Status register must be set.
Tabort_Det        Received target abort. When this signal is active, bit 12 of the Status register must be set.
CfgData[31:0]     Configuration data output, multiplexed based on the current PCI address (stored in the address register/counter, which is described in the following section). Used during configuration read transactions.
CmdReg[15:0]      Copy of the Command register in the configuration space.
LatTimerReg[7:0]  Copy of the Latency Timer in the configuration space.
The following lines of Verilog declare the internal 32-bit busses that are generated from the individual PCI registers. These are multiplexed to drive the CfgData[31:0] output port, which is used during PCI configuration read transactions.
// *** Full 32-bit wide PCI registers        offset
wire [31:0] Dev_Vend;                     // 00h
wire [31:0] Stat_Cmd;                     // 04h
wire [31:0] Class_RevID;                  // 08h
wire [31:0] BIST_Hdr_Lat_Cache;           // 0Ch
wire [31:0] BAR0;                         // 10h
wire [31:0] SubsysID_SubsysVendID;        // 2Ch
wire [31:0] Lat_Gnt_IntPin_IntLine;       // 3Ch
The next section of Verilog code represents values most likely to be modified by users. These represent numbers and values that will uniquely identify the device, along with the properties and capabilities of the device. Please refer to the PCI specification for detailed information about these registers.
// *********** beginning of user-modifiable parameters ************
// PCI registers                             offset into config space
wire [15:0] DeviceID     = 16'h0001;      // 00h
wire [15:0] VendorID     = 16'h11E3;      // 00h
wire [23:0] ClassCode    = 24'h020000;    // 08h
wire [7:0]  RevisionID   = 8'h01;         // 08h
wire [15:0] SubsysID     = 16'h0001;      // 2Ch
wire [15:0] SubsysVendID = 16'h11E3;      // 2Ch
wire [7:0]  MaxLat       = 8'h05;         // 3Ch
wire [7:0]  MinGnt       = 8'h02;         // 3Ch
wire [7:0]  IntPin       = 8'h01;         // 3Ch
parameter BAR0_size = 24; // Sets the size of the requested memory space.
                          // Default value is 24, corresponding to 16MB.
                          // (# of bits to tie off in the BAR)
// *********** end of user-modifiable parameters ************
The remainder of the Verilog code that describes the PCI configuration space will not need to be modified by the user in most cases. However, there are a few exceptions.
to:
reg IOEnable;
only BAR0 (offset 10h) is used. BAR1 through BAR5 may be added, provided the appropriate changes are made to the Verilog source. The example below will show the user how to add an additional base-address, BAR1. Add a wire declaration for the new base-address:
wire [31:0] BAR0;
wire [31:0] BAR1; //added
Declare a new register for BAR 1 and map it to the BAR1 wire declared earlier:
reg [31:0] BAR0_reg;
reg [31:0] BAR1_reg; //added
assign #1 BAR0 = BAR0_reg;
assign #1 BAR1 = BAR1_reg; //added
Create the write-enable for BAR 1. Be sure to properly decode the address:
wire BAR0WE = (Cfg_Write & (!CBE[3]) &
               (UsrAddr[4] & !UsrAddr[2] & !UsrAddr[3] & !UsrAddr[5] &
                !UsrAddr[6] & !UsrAddr[7] & !UsrAddr[8]));
wire BAR1WE = (Cfg_Write & (!CBE[3]) &
               (UsrAddr[4] & UsrAddr[2] & !UsrAddr[3] & !UsrAddr[5] &
                !UsrAddr[6] & !UsrAddr[7] & !UsrAddr[8])); //added
Insert the following lines that describe how BAR 1 will be written to during PCI configuration write transactions:
always @(posedge PCI_clock or posedge PCI_reset)
begin
  if (PCI_reset)
    BAR1_reg <= 0;
  else if (!BAR1WE) begin
    BAR1_reg[31:BAR1_size] <= BAR1_reg[31:BAR1_size];
    BAR1_reg[BAR1_size-1:0] <= 0;
  end
  else begin
    BAR1_reg[31:BAR1_size] <= WrData[31:BAR1_size];
    BAR1_reg[BAR1_size-1:0] <= 0;
  end
end
Next, make sure the new base-address is properly muxed into CfgData[31:0]:
always @(posedge PCI_clock)
begin
  case (selcfg)
    7'b0000000: CfgData <= Dev_Vend;
    7'b0000001: CfgData <= Stat_Cmd;
    7'b0000010: CfgData <= Class_RevID;
    7'b0000011: CfgData <= BIST_Hdr_Lat_Cache;
    7'b0000100: CfgData <= BAR0;
    7'b0000101: CfgData <= BAR1; //added
    7'b0001100: CfgData <= 32'h0;
    7'b0001111: CfgData <= Lat_Gnt_IntPin_IntLine;
    default:    CfgData <= 32'h0;
  endcase
end
The last step is to make sure that an address hit is properly determined, and now accounts for the new base-address register. The following line must be changed from:
assign Addr_Hit = (MemEnable && (WrData[31:24] == BAR0[31:24]));
to:
assign Addr_Hit = (MemEnable && ((WrData[31:BAR0_size] == BAR0[31:BAR0_size]) ||
                                 (WrData[31:BAR1_size] == BAR1[31:BAR1_size])));
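The BAR1 steps above use BAR1_size, which none of the listed steps declare. One plausible approach, by analogy with BAR0_size, is to add a second parameter next to it. The sketch below wraps that idea in a stand-alone module so it can be simulated on its own. The module name bar1_reg_sketch, the default value of 20 (a 1 MB request, following the same rule by which 24 corresponds to 16 MB), and the choice to store only the writable upper bits are assumptions for illustration.

```verilog
// Hypothetical stand-alone BAR1 register, sized by a BAR1_size parameter.
// Assumption: BAR1_size = 20 ties off 20 low bits, requesting 2^20 bytes (1 MB).
module bar1_reg_sketch #(
    parameter BAR1_size = 20
) (
    input  wire        PCI_clock,
    input  wire        PCI_reset,
    input  wire        BAR1WE,     // write-enable decoded for offset 14h
    input  wire [31:0] WrData,     // configuration write data
    output wire [31:0] BAR1        // value returned on configuration reads
);
    // Only the upper, writable bits need a register.
    reg [31:BAR1_size] BAR1_hi;

    always @(posedge PCI_clock or posedge PCI_reset)
        if (PCI_reset)
            BAR1_hi <= 0;
        else if (BAR1WE)
            BAR1_hi <= WrData[31:BAR1_size];

    // The low BAR1_size bits always read back as zero.
    assign BAR1 = {BAR1_hi, {BAR1_size{1'b0}}};
endmodule
```

This form behaves the same as the BAR1_reg always block shown earlier, since the low bits are forced to zero in both cases. It differs only in not storing bits that can never change.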
Address Register/Counter
The address register/counter is simply a loadable counter. It is capable of latching the PCI address and holding it, and it is also capable of incrementing the address by 4 at the completion of a PCI data transfer.
always @(posedge PCI_clock or posedge PCI_reset)
  if (PCI_reset)
    UsrAddr[23:0] <= 0;
  else if (LoadAddr)
    UsrAddr[23:0] <= WrData[23:0];
  else if (IncrAddr) begin
    UsrAddr[23:10] <= UsrAddr[23:10];
    UsrAddr[9:2] <= UsrAddr[9:2] + 1;
    UsrAddr[1:0] <= UsrAddr[1:0];
  end
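To see the increment-by-4 behavior in simulation, the counter can be dropped into a small testbench. The following is a sketch only: the module name addr_counter_tb, the clock period, and the starting address 1233F8h are arbitrary choices for this example, and the counter body is a condensed copy of the listing above (bits that are simply held are not reassigned).

```verilog
// Hypothetical testbench for the address register/counter.
// Clock period and start address are arbitrary assumptions.
module addr_counter_tb;
    reg        PCI_clock, PCI_reset, LoadAddr, IncrAddr;
    reg [31:0] WrData;
    reg [23:0] UsrAddr;

    // Condensed copy of the counter: only UsrAddr[9:2] changes on an increment.
    always @(posedge PCI_clock or posedge PCI_reset)
        if (PCI_reset)     UsrAddr <= 24'd0;
        else if (LoadAddr) UsrAddr <= WrData[23:0];
        else if (IncrAddr) UsrAddr[9:2] <= UsrAddr[9:2] + 8'd1;

    always #5 PCI_clock = ~PCI_clock;   // free-running clock

    initial begin
        PCI_clock = 0; PCI_reset = 1; LoadAddr = 0; IncrAddr = 0; WrData = 32'd0;
        #12 PCI_reset = 0;
        WrData = 32'h001233F8; LoadAddr = 1;           // latch the start address
        @(posedge PCI_clock); #1 LoadAddr = 0; IncrAddr = 1;
        repeat (3) begin                               // three data transfers
            @(posedge PCI_clock); #1 $display("UsrAddr = %h", UsrAddr);
        end
        $finish;
    end
endmodule
```

Starting from 1233F8h, the three displayed values should be 1233fc, 123000, and 123004. Because only the 8-bit field UsrAddr[9:2] is incremented, the address wraps within a 1 KB window rather than carrying into UsrAddr[23:10].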
Additionally, this section of Verilog code determines when the QL5032 should claim a PCI target transaction. If the address sent at the beginning of a PCI transaction belongs to one of the base-addresses implemented in the device, the Addr_Hit signal must go active. The following equation performs this function:
assign Addr_Hit = (MemEnable && (WrData[31:BAR0_size] == BAR0[31:BAR0_size]));
Online Resources
Help with QuickLogic Software and Devices    www.quicklogic.com/support
Help with PCI Specification and Protocol     www.pcisig.com
Online FAQ for the QL5032 device             www.quicklogic.com/support/ql5032
Telephone Support
QuickLogic Customer Engineering Hotline (408) 990-4100
Answer: Yes. The DMA controller can set the type of read operation on the PCI bus with the Mst_Rd_Sel[1:0] pins. See the Appendix titled Functional Description of the QL5032 PCI Controller in the QL5032 User's Guide for how to use these pins.

Question: What is the maximum burst speed for the QL5032?
Answer: The QL5032 will send and receive data with zero wait states, assuming the target it is talking to can also achieve that speed. The QL5032 fully supports Memory Read, Memory Read Line, and Memory Read Multiple commands, so optimal read performance can be achieved on a variety of motherboards. In Target mode, the QL5032 is only used for DMA configuration and PCI configuration, so full burst (zero-wait-state) operation is not needed in Target mode. All high-performance reads and writes on the PCI bus should therefore be executed in PCI bus Master mode. Target transactions will operate with two to four wait states inserted by the QL5032 device.

Question: What is the Target Latency of the QL5032?
Answer: The table below shows the Target latency in terms of initial latency (from assertion of FRAMEN by a master until TRDYN is first asserted by the QL5032 Target) and subsequent latency (wait states between subsequent assertions of TRDYN by the QL5032 Target). In high-performance applications, the target interface of the QL5032 is primarily intended for PCI and DMA configuration.

Transaction Type       Target Initial Latency (FRAME to TRDY)    Target Wait States (Target Subsequent Latency)
Configuration Read     6                                         4
Configuration Write    4                                         2
Memory/IO Read         5                                         3
Memory/IO Write        4                                         2
Question: Does the QL5032 generate and/or receive Type 1 Configuration commands?
Answer: The QL5032 can be configured to recognize Type 1 Configuration commands (by default, however, it will ignore them), but it cannot generate Type 1 Configuration command cycles in Master mode. These commands are only required of applications which are a PCI bus host or a PCI to PCI bridge.

Question: Can I control the Byte Enables on the PCI bus during Master transactions?
Answer: The 4 CBE signals are always active (set to 0) during PCI Master transactions, so you must transfer data on an aligned DWORD boundary. This was done to optimize the size and speed of the QL5032 device.

Question: Can the QL5032 act as a PCI host?
Answer: Since this would require the Master interface to generate PCI Configuration Read and Write commands (Type 0 and Type 1), the QL5032 does not have this capability. The Master Controller can only generate Mem Read, Mem Read Line, Mem Read Multiple, and Mem Write commands.

Question: What are the limitations of the local clock on the PCI add-in card when I use the QL5032 as the PCI bus interface?
Answer: The QL5032 allows the user to generate the local clock by passing the PCI clock through to an output pin, or the user may create a local clock on the board. The synchronization between these clocks is best accomplished by asynchronous FIFOs created in the programmable logic region of the device. For a definition of asynchronous FIFOs in this context, see the FIFO chapter of the QL5032 User's Guide. Any local clock speed from 0 to 175 MHz can be used, depending on the complexity of the logic that the local clock drives in the QL5032 device. The local clock does not need to have any frequency or phase relationship to the PCI clock. The designer may choose to use the PCI clock as the local clock, or use an independent local clock.
Question: Does the QL5032 have any problems with Special Cycles or Master-Aborts on the PCI bus?
Answer: No, the QL5032 responds correctly to all Special Cycles and Master Aborts which occur on the PCI bus.

Question: With which motherboards has the QL5032 been tested?
Answer: The QL5032 device should work with all PCI-compliant motherboards. It has currently been tested (not exhaustively) with a Pentium Pro 200 system running Windows NT 4.0, a Dell GXa (Pentium II), a CTX (Celeron), and a Compaq Server 1600 (500 MHz Pentium). For a more complete list of the motherboards, chipsets, and BIOS versions tested with the QL5032, see the QL5032 support page (www.quicklogic.com/support/QL5032).

Question: How was the QL5032 tested?
Answer: A comprehensive simulation was performed both functionally and with timing. This simulation modeled other agents on the bus and watched for bus protocol violations. A reference board was created and a thorough set of tests was performed with the HP E2925B PCI Analyzer. All tests in the PCI Compliance checklist were performed to verify PCI compliance. Also, the QL5032 was taken to a PCI Special Interest Group PCI Compliance Workshop, where it was tested with several other manufacturers' motherboards. [These comments will be valid upon completion of the internal verification plan, scheduled for completion by 4/9/99].

Question: Does the QL5032 support unlimited Master bursts?
Answer: Yes, in two ways. First, the PCI Master can continue bursts for as long as its latency timer does not time out and the bus is still granted to it. Second, the latency timer can be disabled (which would violate the PCI specification but may be acceptable for an embedded system), so that the burst may continue as long as the back end is ready to send or receive data. The DMA Controller reference design can be customized for any burst length and maximum DMA length, to optimize the bandwidth requirements of an application.

Question: What if my back end design is slow during Master transactions? Will the Master insert wait states to compensate for the data being slow?
Answer: No. The Master will end the current burst as soon as the back end is no longer ready to send or receive data at the proper rate. With slow back ends, it is therefore recommended to fill a FIFO with data before initiating a Master Write, and to provide an empty FIFO before initiating Master Reads, in order to provide the best performance on the PCI bus. 64 deep/32 wide FIFOs are relatively inexpensive to implement within the QL5032, so this should not be a limitation for any application.

Question: What values should I use for Device ID, Vendor ID, Subsystem ID, and Subsystem Vendor ID?
Answer: The Vendor ID and Subsystem Vendor ID are supplied by the PCI Special Interest Group (PCI SIG, www.pcisig.com) when you sign up to be a member. This is a necessary step for PCI device and board designers. The Vendor ID should be the Vendor ID assigned to the designer of the device. In the case of the QL5032, the Vendor ID should belong to the company which is designing and customizing the QL5032 device. The Subsystem Vendor ID should belong to the company which builds the PCI add-in board which uses the QL5032. In some cases, this may be the same company. In that case, the same Vendor ID may be used in both fields. Device ID and Subsystem ID are numbers which should be unique to each design, but which are managed by the company doing the design, not by the PCI SIG. In the PCI 2.1 Specification, the Subsystem ID and Subsystem Vendor ID were optional. In the PCI 2.2 Specification, these values in the configuration space are mandatory.

Question: Can the QL5032 be used as a PCI to PCI Bridge?
Answer: No. Since the QL5032 will not respond to Type 1 Configuration Accesses or generate Type 1 or Type 2 Configuration Accesses, it cannot be used as a transparent PCI to PCI bridge. If your application just needs to send data from one PCI bus to another, then a limited PCI to PCI bridge function may be possible. The solution would require custom drivers, with each PCI bus having independent configurations. Two QL5032s would be needed, one for each bus, and each would have memory space assigned to it on its bus by the configuration master.

Question: How does the speed of the FPGA portion of the QL5032 compare with QuickRAM devices from QuickLogic?
Answer: The QL5032 has two speed grades: A and B. The A grade is similar in speed to a -3 device in the QuickRAM family. The B grade is similar in speed to a -4 device in the QuickRAM family.

Question: How many BARs (Base Address Registers) can be used in the QL5032?
Answer: Up to six 32-bit BARs may be used in the QL5032. The BAR configuration is handled within the Target Configuration and Addressing module, which is described in the Appendix of the QL5032 User's Guide. The default implementation is one BAR, with a 16 MByte address range.

Question: Does the QL5032 support mailbox queues and I2O?
Answer: Mailboxes are the PCI-interface level requirement to support I2O. Mailboxes can be used for a variety of other purposes as well. One way of describing mailboxes is as a set of registers that allow two-way communication between a back-end processor and the system software on the PCI bus. Mailbox registers and doorbells can be implemented in the user side of the FPGA using a state machine, logic cell registers, and RAM cells. For full I2O compliance, a processor is needed on the same card as the QL5032, with a significant amount of RAM. See the I2O specification for more information. You can obtain an I2O specification from the I2O SIG (Special Interest Group). Go to www.i2osig.org for more information.

Question: Does the QL5032 support the new PCI-X standard? What is PCI-X?
Answer: PCI-X, initially proposed by Compaq, Hewlett-Packard and IBM, is a proposed extension to the PCI local bus specification to potentially deliver increased bandwidth and bus performance, running at speeds up to 133 MHz. This proposal is aimed at addressing I/O bandwidth for servers running enterprise applications such as Gigabit Ethernet, Fibre Channel and Ultra3 SCSI. QuickLogic is very interested in PCI-X and is continually monitoring the progress and information being released on PCI-X. Since there is limited information on PCI-X and no preliminary specification has been released to the public, we are unable to speak directly about PCI-X at this time. However, QuickLogic is committed to being the leading supplier of ESP products and strives to provide solutions to leading-edge system requirements.

Question: Does the QL5032 support Compact PCI or Hot Swapping?
Answer: The QL5032 can support Compact PCI. The main differences between PCI and CompactPCI are mechanical, and the fact that CompactPCI supports Hot Swapping. Contact QuickLogic for the availability of an application note and reference design.
Glossary
BAR
Base Address Register. Refers to the 32-bit registers in the PCI Configuration Space at offsets 0x10 through 0x27. Up to six Base Address Registers may be defined for a given PCI device. The Base Address Register is a writeable register. The PCI system software (BIOS) first determines how much memory space a base address register needs, and then assigns a memory address to each base address register. The PCI device must respond to addresses that fall within its base address range. The QL5032 may have from one to six BARs. The default is one BAR, with a 16 MB address range (in the Target Configuration and Addressing reference module).
BGA
Ball Grid Array. This refers to the Plastic Ball Grid Array (PBGA) packages which QuickLogic offers in many of its device families.
DMA
Direct Memory Access. DMA describes a hardware system, independent of the primary processor, that can carry out memory accesses. Usually, a DMA state machine will be configured with a source address, destination address, and size, and will move data from one location to another. In the case of the QL5032, it refers to the DMA Controller Reference module, which can be configured through the PCI Interface with a PCI Address, Size of Transfer, and Type of Transfer (read or write). The DMA controller will independently control the PCI Interface to complete the transfer of data to or from the PCI bus.
DWORD
Double-Word. Refers to a 32-bit piece of data.
FIFO
First-in, First-out. Refers to a memory that has no direct addressing. The user pushes data elements into the memory and pops data elements out of the memory. The internal addressing structure is designed so that the first element of data that is pushed into the memory is the first piece of data that is retrieved when a pop is performed (and so on).
FPGA
Field-programmable gate array. This is a device which consists of an array of general-purpose digital logic cells or modules interconnected by wires. The wires are designed with interconnections that can be programmed by a designer in the field. This is as opposed to ASICs, which must be programmed or masked within a factory. QuickLogic provides many FPGA families, and the programmable region of the QuickPCI family of devices is sometimes referred to as the FPGA region or FPGA side.
GCLK
Global clock. A GCLK buffer drives a pre-buffered network on QuickLogic devices, which offers very high fanout nets with very low latency and skew. Therefore, GCLKs are ideal for clock distribution. However, GCLKs also can be routed to I/O buffer output enables, the asynchronous sets and resets of all flip-flops, the F1 input of each logic cell, and both the read and write clocks of RAM modules. This makes them very flexible. QuickLogic devices offer between 4 and 8 GCLK networks. Often, a subset (usually 2) can be accessed directly from pins on the device (labeled GCLK/I pins).
HDL
Hardware Description Language. This term is most often used in this User's Guide to represent both Verilog and VHDL with one term.
I/O
Input/Output. This refers to a pin on the device.
LFSR
Linear Feedback Shift Register. This is a type of counter which is designed as a shift register where certain bits (called taps) are fed back into an XOR gate, often into the first bit of the shift register chain. If the correct taps are used, the LFSR will go through a long series of non-repeated states, and start again at the first state. Because of the simple design, LFSRs can be used as high-speed counters. However, because the states do not progress in an intuitive pattern (like a binary counter), the decode logic can be more complex. QuickLogic uses LFSRs for its high-speed FIFOs.
pASIC
Programmable ASIC. QuickLogic uses this term for its programmable devices. Since the Via-Link interconnect technology makes the QuickLogic devices look more like ASICs than traditional FPGAs, this acronym was created to describe QuickLogic programmable devices.
PBGA
Plastic Ball Grid Array. See BGA or Ball Grid Array.
PCI
Peripheral Component Interconnect.
PCI SIG
PCI Special Interest Group. See their Web site at www.pcisig.com.
PQFP
Plastic Quad Flat Pack. A device package offered by QuickLogic which is surface mounted on the PCB (printed circuit board).
RAM
Random Access Memory. QuickLogic's QuickRAM and QuickPCI families have modules of programmable RAM, which consist of up to 1152 bits of RAM per module. See the appropriate device family datasheet for details.
ROM
Read-only Memory. QuickLogic's RAM modules can be configured as ROMs by attaching a Serial EEPROM to the QuickLogic device. At power-up, the data from the Serial EEPROM will be loaded into the QuickLogic RAM modules, allowing them to be used in a ROM fashion by the design. See QuickNote 65 on the QuickLogic Web site for more information.
SpDE
Seamless pASIC Design Environment. This refers to the application which controls the design flow in the QuickWorks, QuickChip, QuickTools, and QuickTools Plus packages from QuickLogic.