
Building Multi-Processor FPGA Systems

Hands-on Tutorial to Using FPGAs and Linux


Chris Martin <cmartin@altera.com>
Member Technical Staff Embedded Applications

Agenda
Introduction
Problem: How to Integrate Multi-Processor Subsystems
Why
Why would you do this?
Why use FPGAs?
Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS
Lab 2: Inter-Processor Communication and Shared Peripherals
Lab 3: Locking and Tetris
Building Hardware: FPGA Hardware Tools & Build Flow
Building/Debugging Software: Software Tools & Build Flow
References
Q&A throughout.

The Problem: Integrating Multi-Processor Subsystems

Given a system with multiple processor subsystems, these architecture decisions must be considered:
Inter-processor communication
Partitioning/sharing peripherals (locking required)
Bandwidth and latency requirements

[Figure: Processor Subsystem 1 and Processor Subsystem 2, each with Periph 1, Periph 2 and Periph 3; the link between the two subsystems is marked "???"]
Why Do We Need to Integrate Multi-Processor Subsystems?

May have inherited a processor subsystem from another development team or a 3rd party
Risk mitigation by reducing change

Fulfill latency and bandwidth requirements
Real-time considerations
If the main processor is not real-time enabled, a real-time processor subsystem can be added

Design partition / sandboxing
Break the system into smaller subsystems to service tasks
Smaller tasks can be designed more easily

Leverage software resources
Sometimes a problem is solved in less time by processor/software than by hardware design
Sequencers, state machines

Why do we want to integrate with FPGA? (or rather, HOW can FPGAs help?)

Bandwidth & latency can be tailored
Addresses real-time aspects of the system solution
FPGA logic has flexible interconnect
Trade off data width, clock frequency and latency

Experimentation
Many processor subsystems can be implemented
Allows you to experiment with changing microprocessor subsystem hardware designs
Altera FPGA under the hood
However: generic Linux interfaces are used and can be applied in any Linux system.

Simple Multiprocessor System

[Figure: ARM (A) and NIOS (N), each with its own peripheral, connected through a mailbox and a shared peripheral]

Why is Altera Involved with Embedded Linux?

[Figure: design starts with vs. without an embedded processor, with a 50% marker. Source: Gartner, September 2010]

More than 50% of FPGA designs include an embedded processor, and the share is growing.
Many embedded designs use Linux
Open-source re-use.
Altera Linux Development Team actively contributes to the Linux kernel

SoCKit Board Architecture Overview

Lab focus: UART, DDR3, LEDs, Buttons

SoC/FPGA Hardware Architecture Overview

[Figure: SoC block diagram. Blocks shown: two Cortex-A9 cores, each with I$ and D$, L2, DDR, EMIF, DMA, ROM, RAM, UART, SD/MMC, and the FPGA fabric (soft logic) with NIOS, RAM, SYS ID and GPIO (32)]

ARM-to-FPGA bridges (data width configurable):
AXI Bridge HPS2FPGA: 32/64/128
AXI Bridge LWHPS2FPGA: 32
AXI Bridge FPGA2HPS: 32/64/128

FPGA: 42K logic macros, using no more than 14%

Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS

Topics Covered:
Configuring FPGA from SD/MMC and U-Boot
Booting Linux on ARM Cortex-A9
Configuring Device Tree
Resetting and Booting NIOS Processor
Building and compiling a simple Linux application

Key Example Code Provided:
C code for downloading NIOS code and resetting NIOS from ARM
Using U-Boot to set ARM peripheral security bits

Full step-by-step instructions are included in the lab manual.

Lab 1: Hardware Design Overview

NIOS Subsystem (Subsystem 2)
1 NIOS Gen 2 processor
64k combined instruction/data RAM (On-Chip RAM)
GPIO peripheral

ARM Subsystem (Subsystem 1)
2 Cortex-A9 (only using 1)
DDR3 External Memory
SD/MMC Peripheral
UART Peripheral

[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART) and Subsystem 2 (NIOS 0 with RAM, GPIO); peripherals are marked as shared or dedicated]

Lab 1: Programmer View - Processor Address Maps

NIOS                             ARM Cortex-A9
Address Base   Peripheral        Address Base   Peripheral
0xFFC0_2000    ARM UART          0xFFC0_2000    UART
0x0003_0000    GPIO (LEDs)       0xC003_0000    GPIO (LEDs)
0x0002_0000    System ID         0xC002_0000    System ID
0x0000_0000    On-chip RAM       0xC000_0000    On-chip RAM
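
In the address map, each FPGA-side peripheral appears to the ARM at its NIOS address plus 0xC000_0000. A minimal sketch of that translation, assuming the offset is constant across the whole FPGA window. The names FPGA_WINDOW_BASE, nios_to_arm and arm_to_nios are illustrative and not taken from the lab code.

```c
#include <stdint.h>

/* Assumed: base at which the ARM sees the FPGA address space in Lab 1
 * (read off the table: 0x0003_0000 on NIOS is 0xC003_0000 on ARM). */
#define FPGA_WINDOW_BASE 0xC0000000u

/* Convert a NIOS-local address to the address the ARM would use. */
static uint32_t nios_to_arm(uint32_t nios_addr)
{
    return FPGA_WINDOW_BASE + nios_addr;
}

/* Convert an ARM address inside the FPGA window back to a NIOS-local one. */
static uint32_t arm_to_nios(uint32_t arm_addr)
{
    return arm_addr - FPGA_WINDOW_BASE;
}
```

The UART row is the exception: it is a hard peripheral at 0xFFC0_2000 on both sides, so it should not be passed through these helpers.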

Lab 1: Peripheral Registers

Peripheral  Address Offset  Access  Bit Definitions
Sys ID      0x0             RO      [31:0] System ID. Lab default = 0x00001ab1
GPIO        0x0             R/W     [31:0] Drive GPIO output. The lab uses it for LED control,
                                    push button status and NIOS processor resets (from ARM).
                                    [3:0] LED 0-3 control. 0 = LED off, 1 = LED on
                                    [4] NIOS 0 reset
                                    [5] NIOS 1 reset
                                    [1:0] Push button status
UART        0x14            RO      Line Status Register
                                    [5] TX FIFO empty
                                    [0] Data ready (RX FIFO not empty)
UART        0x30            R/W     Shadow Receive Buffer Register
                                    [7:0] RX character from serial input
UART        0x34            R/W     Shadow Transmit Register
                                    [7:0] TX character to serial output
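
The GPIO bit fields above can be handled with a few masks. The sketch below only computes new register values; it does not touch hardware. The macro and function names are made up for illustration, and the reset polarity (whether 1 holds the NIOS in reset or releases it) is not stated on the slide, so the helper just sets or clears the bit.

```c
#include <stdint.h>

#define GPIO_LED_MASK     0x0Fu      /* [3:0] LED 0-3, 1 = LED on */
#define GPIO_NIOS0_RESET  (1u << 4)  /* [4] NIOS 0 reset */
#define GPIO_NIOS1_RESET  (1u << 5)  /* [5] NIOS 1 reset */

/* Return reg with LED `led` (0-3) switched on or off; other bits unchanged.
 * An out-of-range LED number is masked away and leaves reg as it was. */
static uint32_t gpio_set_led(uint32_t reg, unsigned led, int on)
{
    uint32_t bit = (led < 32u) ? ((1u << led) & GPIO_LED_MASK) : 0u;
    return on ? (reg | bit) : (reg & ~bit);
}

/* Return reg with the reset bit of NIOS 0 or NIOS 1 set or cleared. */
static uint32_t gpio_set_nios_reset(uint32_t reg, unsigned nios, int set)
{
    uint32_t bit = (nios == 0u) ? GPIO_NIOS0_RESET : GPIO_NIOS1_RESET;
    return set ? (reg | bit) : (reg & ~bit);
}
```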

Lab 1: Processor Resets Via Standard Linux GPIO Interface

NIOS resets are connected to GPIO
GPIO driver uses the /sys/class/gpio interface

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_BUF 64  /* assumed size; the slide does not show the definition */

int main(int argc, char** argv)
{
    int fd, gpio=168;
    char buf[MAX_BUF];

    /* Export: echo ### > /sys/class/gpio/export */
    fd = open("/sys/class/gpio/export", O_WRONLY);
    sprintf(buf, "%d", gpio);
    write(fd, buf, strlen(buf));
    close(fd);

    /* Set direction to out: */
    /* echo "out" > /sys/class/gpio/gpio###/direction */
    sprintf(buf, "/sys/class/gpio/gpio%d/direction", gpio);
    fd = open(buf, O_WRONLY);
    write(fd, "out", 3); /* write(fd, "in", 2); */
    close(fd);

    /* Set GPIO output high or low */
    /* echo 1 > /sys/class/gpio/gpio###/value */
    sprintf(buf, "/sys/class/gpio/gpio%d/value", gpio);
    fd = open(buf, O_WRONLY);
    write(fd, "1", 1); /* write(fd, "0", 1); */
    close(fd);

    /* Unexport: echo ### > /sys/class/gpio/unexport */
    fd = open("/sys/class/gpio/unexport", O_WRONLY);
    sprintf(buf, "%d", gpio);
    write(fd, buf, strlen(buf));
    close(fd);

    return 0;
}
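
The slide code ignores the return values of open() and write(). A hedged sketch of the same export/direction/value pattern with basic error checking follows. The helpers write_str and gpio_attr_path are hypothetical names introduced here, not part of the lab sources.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a short string to a file; returns 0 on success, -1 on any failure. */
static int write_str(const char *path, const char *value)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    size_t len = strlen(value);
    ssize_t n = write(fd, value, len);
    int rc = close(fd);
    return (n == (ssize_t)len && rc == 0) ? 0 : -1;
}

/* Build "/sys/class/gpio/gpio<N>/<attr>" in buf; returns 0 if it fits. */
static int gpio_attr_path(char *buf, size_t size, int gpio, const char *attr)
{
    int n = snprintf(buf, size, "/sys/class/gpio/gpio%d/%s", gpio, attr);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

With these, exporting the pin becomes write_str("/sys/class/gpio/export", "168"), and the direction and value writes use a path built by gpio_attr_path(). Each step can then stop on the first failure, for example when the pin is already exported or the process lacks permission.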


Lab 1: Loading External Processor Code Via Standard Linux Shared Memory (mmap)

NIOS RAM address accessed via mmap()
Can be shared with other processes
R/W during load
Read-only protection after load

/* Map physical address of NIOS RAM to a virtual address segment
   with Read/Write access */
fd = open("/dev/mem", O_RDWR);
load_address = mmap(NULL, 0x10000,
                    PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xc0000000);
/* Set size of code to load (number of elements in nios_code[]) */
load_size = sizeof(nios_code)/sizeof(nios_code[0]);
/* Load NIOS code */
for(i=0; i < load_size; i++)
{
    *(load_address+i) = nios_code[i];
}
/* Set load address segment to read-only */
mprotect(load_address, 0x10000, PROT_READ);
/* Un-map load address segment */
munmap(load_address, 0x10000);
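
The load sequence (map, copy, write-protect, unmap) can be tried without hardware by substituting an anonymous mapping for /dev/mem. The sketch below assumes the NIOS image is an array of 32-bit words; load_words and map_test_region are illustrative names, and the anonymous mapping is a stand-in only, not how the lab reaches the NIOS RAM.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Copy `count` 32-bit words into a mapped region, then make it read-only.
 * Returns 0 on success, -1 if the image does not fit or mprotect() fails. */
static int load_words(volatile uint32_t *dst, size_t region_bytes,
                      const uint32_t *src, size_t count)
{
    if (count > region_bytes / sizeof(uint32_t))
        return -1;                      /* image larger than the region */
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
    return mprotect((void *)dst, region_bytes, PROT_READ);
}

/* Stand-in for the /dev/mem mapping so the sketch runs on any Linux host:
 * an anonymous, page-aligned region of the requested size. */
static volatile uint32_t *map_test_region(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}
```

The size check matters on the real target: the mapped window is 0x10000 bytes, so an image with more than 0x4000 words would write past the mapping.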


Lab 2: Mailboxes - NIOS/ARM Communication

Topics Covered:
Altera Mailbox Hardware IP

Key Example Code Provided:
C code for sending/receiving messages via hardware Mailbox IP
NIOS & ARM C code
Simple message protocol
Simple command parser

Full step-by-step instructions are included in the lab manual.
User to add second NIOS processor mailbox control.

Lab 2: Hardware Design Overview

NIOS 0 & 1 Subsystems (Subsystems 2 and 3)
NIOS Gen 2 processor
64k combined instruction/data RAM
GPIO (4 out, LED)
GPIO (2 in, Buttons)
Mailbox

ARM Subsystem (Subsystem 1)
2 Cortex-A9 (only using 1)
DDR3 External Memory
SD/MMC Peripheral
UART Peripheral

[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART, GPIO), Subsystem 2 (NIOS 0 with MBox, RAM, GPIO) and Subsystem 3 (NIOS 1 with MBox, RAM, GPIO); peripherals are marked as shared or dedicated]

Lab 2: Programmer View - Processor Address Maps

NIOS 0 & 1                              ARM Cortex-A9
Address Base   Peripheral               Address Base   Peripheral
0xFFC0_2000    ARM UART                 0xFFC0_2000    UART
0x0007_8000    Mailbox (from ARM)       0x0007_8000    Mailbox (to NIOS 1)
0x0007_0000    Mailbox (to ARM)         0x0007_0000    Mailbox (from NIOS 1)
0x0005_0000    GPIO (In Buttons)        0x0006_8000    Mailbox (to NIOS 0)
0x0003_0000    GPIO (Out LEDs)          0x0006_0000    Mailbox (from NIOS 0)
0x0002_0000    System ID                0xC003_0000    GPIO (LEDs)
0x0000_0000    On-chip RAM              0xC002_0000    System ID
                                        0xC001_0000    NIOS 1 RAM
                                        0xC000_0000    NIOS 0 RAM

Lab 2: Additional Peripheral (Mailbox) Registers

Peripheral  Address Offset  Access  Bit Definitions
Mailbox     0x0             R/W     [31:0] RX/TX data
Mailbox     0x8             R/W     [1] RX message queue has data
                                    [0] TX message queue empty
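
A small sketch of how the two mailbox registers might be wrapped for a non-blocking receive. It takes the register block as a pointer to 32-bit words, so byte offset 0x8 becomes word index 2. All names here are illustrative, and the bit meanings are taken from the table above rather than from the Mailbox IP documentation.

```c
#include <stdint.h>

#define MBOX_DATA_WORD    0          /* byte offset 0x0 */
#define MBOX_STATUS_WORD  2          /* byte offset 0x8, in 32-bit words */
#define MBOX_TX_EMPTY     (1u << 0)  /* [0] as named in the table */
#define MBOX_RX_HAS_DATA  (1u << 1)  /* [1] as named in the table */

/* Nonzero if the status register reports a pending received message. */
static int mbox_rx_has_data(const volatile uint32_t *mbox)
{
    return (mbox[MBOX_STATUS_WORD] & MBOX_RX_HAS_DATA) != 0;
}

/* Non-blocking receive: stores the word in *out and returns 1 if a
 * message was pending, otherwise returns 0 and leaves *out untouched. */
static int mbox_try_read(const volatile uint32_t *mbox, uint32_t *out)
{
    if (!mbox_rx_has_data(mbox))
        return 0;
    *out = mbox[MBOX_DATA_WORD];
    return 1;
}
```

Because the helpers take a plain pointer, they can be exercised against an ordinary array in a host test and against the mmap()'d mailbox on the target.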


Lab 2: Designing a Simple Message Protocol

Design Decisions:
Short length: a single 32-bit word
Human readable
Message transactions are closed-loop. Includes ACK/NACK

Format:
Message length: four bytes
First byte is an ASCII character denoting message type.
Second byte is an ASCII char from 0-9 denoting processor number.
Third byte is an ASCII char from 0-9 denoting message data, except for ACK/NACK.
Fourth byte is always the null character \0 to terminate the string (human readable).

[Figure: message word layout: Byte 0, Byte 1, Byte 2, Byte 3 (\0)]

Message Types:
G00: Give Access to UART (Push)
A1A: ACK
N1A: NACK

Can be Extended:
L00: LED Set/Ready
B00: Button Pressed
R00: Request UART Access (Pull)

[Figure: example exchange between Cortex-A9 and NIOS 0 with messages G00, A0A and N0N]
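
The four-byte format maps directly onto one 32-bit mailbox word. A minimal sketch of packing and unpacking such a message follows; msg_pack and msg_unpack are illustrative names. It uses memcpy() so the byte order in memory is the same on sender and receiver, on the assumption that both processors use the same endianness.

```c
#include <stdint.h>
#include <string.h>

/* Build a 4-byte message: type, processor digit, data byte, '\0'. */
static uint32_t msg_pack(char type, unsigned proc, char data)
{
    char bytes[4] = { type, (char)('0' + proc % 10u), data, '\0' };
    uint32_t word;
    memcpy(&word, bytes, sizeof word);  /* avoids a pointer-aliasing cast */
    return word;
}

/* Unpack a word into a printable string; out must hold 4 bytes. */
static void msg_unpack(uint32_t word, char out[4])
{
    memcpy(out, &word, 4);
    out[3] = '\0';  /* enforce termination even if the sender did not */
}
```

For example, msg_pack('G', 0, '0') produces the word for "G00", and msg_pack('A', 1, 'A') the ACK "A1A" for processor 1.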

Lab 2: Inter-Processor Communication with Mailbox HW Via Standard Linux Shared Memory (mmap)

Wait for Mailbox hardware message empty flag
Send message (4 bytes)
Disable ARM/Linux access to UART
Wait for RX message received flag
Re-enable ARM/Linux UART access

/* Map physical address of Mailbox to a virtual address segment
   with Read/Write access */
fd = open("/dev/mem", O_RDWR);
mbox0_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE,
                     MAP_SHARED, fd, 0xff260000);
<snip>
/* Wait for message queue to empty */
while((*(volatile int*)(mbox0_address+0x2000+2) & 1) != 0) {}
/* Send Granted/Go message to NIOS */
send_message = "G00";
*(mbox0_address+0x2000) = *(int *)send_message;
/* Disable ARM/Linux access to UART (be careful here) */
config.c_cflag &= ~CREAD;
if(tcsetattr(fd, TCSAFLUSH, &config) < 0) { }
/* Wait for received message */
while((*(volatile int*)(mbox0_address+2) & 2) == 0) {}
/* Re-enable UART access */
config.c_cflag |= CREAD;
tcsetattr(fd, TCSAFLUSH, &config);
/* Read received message */
printf(" - Message Received. DATA = '%s'.\n", (char*)(mbox0_address));
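
The constants 0x2000 and 2 in the mailbox code are word indices, not byte offsets. This reading assumes mbox0_address is declared as a pointer to a 32-bit type (the declaration is snipped on the slide) and that the mapping starts at the "from NIOS 0" mailbox at 0x0006_0000. The sketch below just checks that arithmetic; the macro names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define MBOX_FROM_NIOS0   0x00060000u  /* start of the mapped window (assumed) */
#define MBOX_TO_NIOS0     0x00068000u  /* from the Lab 2 address map */
#define MBOX_STATUS_BYTES 0x8u         /* status register byte offset */

/* Convert a byte offset inside the mapping to an index usable with a
 * pointer to 32-bit words. */
static size_t word_index(uint32_t byte_offset)
{
    return byte_offset / sizeof(uint32_t);
}
```

So mbox0_address+0x2000 addresses the data register of the "to NIOS 0" mailbox, and adding 2 more words reaches its status register at byte offset 0x8.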

Lab 3: Putting It All Together - Tetris!

Combining Locking and Communication

Topics Covered:
Linux Mutex

Key Example Code Provided:
C code showcasing the use of mutexes for locking shared peripheral access
C code for multiple processor subsystem bringup and shutdown

Full step-by-step instructions are included in the lab manual.
User to add code for second NIOS processor bringup, shutdown and locking/control.

Lab 3: Hardware Design Overview (Same As Lab 2)

NIOS 0 & 1 Subsystems (Subsystems 2 and 3)
NIOS Gen 2 processor
64k combined instruction/data RAM
GPIO (4 out, LED)
GPIO (2 in, Buttons)
Mailbox

ARM Subsystem (Subsystem 1)
2 Cortex-A9 (only using 1)
DDR3 External Memory
SD/MMC Peripheral
UART Peripheral

[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART, GPIO), Subsystem 2 (NIOS 0 with MBox, RAM, GPIO) and Subsystem 3 (NIOS 1 with MBox, RAM, GPIO); peripherals are marked as shared or dedicated]

Lab 3: Programmer View - Processor Address Maps

NIOS 0 & 1                              ARM Cortex-A9
Address Base   Peripheral               Address Base   Peripheral
0xFFC0_2000    ARM UART                 0xFFC0_2000    UART
0x0007_8000    Mailbox (from ARM)       0x0007_8000    Mailbox (to NIOS 1)
0x0007_0000    Mailbox (to ARM)         0x0007_0000    Mailbox (from NIOS 1)
0x0005_0000    GPIO (In Buttons)        0x0006_8000    Mailbox (to NIOS 0)
0x0003_0000    GPIO (Out LEDs)          0x0006_0000    Mailbox (from NIOS 0)
0x0002_0000    System ID                0xC003_0000    GPIO (LEDs)
0x0000_0000    On-chip RAM              0xC002_0000    System ID
                                        0xC001_0000    NIOS 1 RAM
                                        0xC000_0000    NIOS 0 RAM

Available Linux Locking/Synchronization Mechanisms

Need to share peripherals
Choose a locking mechanism

Available in Linux:
Mutex <- Chosen for this Lab
Completions
Spinlocks
Semaphores
Read-copy-update (decent for multiple readers, single writer)
Seqlocks (decent for multiple readers, single writer)

Available for Linux:
MCAPI - openmcapi.org

Tetris Message Protocol Extended from Lab 2

NIOS Control Flow:
Wait for button press
Send button press message
Wait for ACK (free to write to LED GPIO)
Write to LED GPIO
Send LED ready msg
Wait for ACK

ARM Control Flow:
Wait for button press message
Lock LED GPIO peripheral
Send ACK (free to write to LED GPIO)
Wait for LED ready msg
Send ACK
Read LED value
Release lock/mutex

[Figure: message exchange with the Cortex-A9. NIOS 0 sends B00 and L00, each answered with A0A; NIOS 1 sends B10 and L10, each answered with A1A]
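
The ARM control flow can be read as a two-state machine per NIOS. The sketch below models only the message handling: a flag stands in for the mutex, and the LED read is left as a comment. The type and function names (arm_ctx, arm_step) are hypothetical and do not come from the lab sources.

```c
#include <string.h>

enum arm_state { WAIT_BUTTON, WAIT_LED_READY };

struct arm_ctx {
    enum arm_state state;
    int led_locked;  /* stands in for the mutex guarding the LED GPIO */
};

/* Write the ACK for the processor digit found in msg[1]: "A<n>A". */
static void make_ack(const char *msg, char reply[4])
{
    reply[0] = 'A';
    reply[1] = msg[1];
    reply[2] = 'A';
    reply[3] = '\0';
}

/* Feed one received message; reply becomes the ACK, or "" if the
 * message is not the one expected in the current state. */
static void arm_step(struct arm_ctx *c, const char *msg, char reply[4])
{
    reply[0] = '\0';
    if (c->state == WAIT_BUTTON && msg[0] == 'B') {
        c->led_locked = 1;           /* lock LED GPIO peripheral */
        make_ack(msg, reply);        /* NIOS is now free to write the LEDs */
        c->state = WAIT_LED_READY;
    } else if (c->state == WAIT_LED_READY && msg[0] == 'L') {
        make_ack(msg, reply);
        /* read the LED value here, still under the lock */
        c->led_locked = 0;           /* release lock */
        c->state = WAIT_BUTTON;
    }
}
```

Keeping the lock held from the button message until the LED-ready message is what prevents the second NIOS from writing the LEDs in between.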

Lab 3: Locking Hardware Peripheral Access Via Linux Mutex

In this example, LED GPIO is accessed by multiple processors
Wrap LED critical section (LED status reads) with:
pthread_mutex_lock()
pthread_mutex_unlock()
Also need Mutex init/destroy:
pthread_mutex_init()
pthread_mutex_destroy()

pthread_mutex_t lock;
<snip Initialize/create/start>
/* Initialize Mutex */
err = pthread_mutex_init(&lock, NULL);
/* Create 2 Threads */
i=0;
while(i < 2)
{
    err = pthread_create(&(tid[i]), NULL,
                         &nios_buttons_get, &(nios_num[i]));
    i++;
}
<snip Critical Section>
pthread_mutex_lock(&lock);
/* Critical Section */
pthread_mutex_unlock(&lock);
<snip Stop/Destroy>
/* Wait for threads to complete */
pthread_join(tid[0], NULL);
pthread_join(tid[1], NULL);
/* Destroy/remove lock */
pthread_mutex_destroy(&lock);
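
The snipped pieces can be filled in with a small host-side example. Here a shared counter stands in for the LED GPIO, and two threads update it under the mutex. The names worker, run_workers and shared_counter are made up for this sketch, which is not the lab's nios_buttons_get code.

```c
#include <pthread.h>

#define NUM_THREADS 2
#define ITERATIONS  100000

static pthread_mutex_t lock;
static long shared_counter;  /* stands in for the shared LED GPIO */

/* Each thread increments the shared counter under the mutex. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&lock);
        shared_counter++;            /* critical section */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Run NUM_THREADS workers; returns the final counter, or -1 if the
 * mutex or any thread could not be created. */
static long run_workers(void)
{
    pthread_t tid[NUM_THREADS];
    int started = 0;

    shared_counter = 0;
    if (pthread_mutex_init(&lock, NULL) != 0)
        return -1;
    for (; started < NUM_THREADS; started++)
        if (pthread_create(&tid[started], NULL, worker, NULL) != 0)
            break;
    for (int i = 0; i < started; i++)   /* join only what was started */
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&lock);
    return (started == NUM_THREADS) ? shared_counter : -1;
}
```

Without the lock/unlock pair the two threads would race on the increment and the final count would usually fall short of NUM_THREADS * ITERATIONS.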

References

Altera References

System Design Tutorials:
Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab
http://www.alterawiki.com/wiki/Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab_-_Creating_Your_AXI3_Component
Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop
http://www.alterawiki.com/wiki/Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop_-_LAB2

Multiprocessor NIOS-only Tutorial:
http://www.altera.com/literature/tt/tt_nios2_multiprocessor_tutorial.pdf

Quartus Handbook:
https://www.altera.com/en_US/pdfs/literature/hb/qts/quartusii_handbook.pdf

Qsys:
System Design with Qsys (PDF) section in the Handbook
Qsys Tutorial: step-by-step procedures and design example files to create and verify a system in Qsys
Qsys 2-day instructor-led class: System Integration with Qsys
Qsys webcasts and demonstration videos

SoC Embedded Design Suite User Guide:
https://www.altera.com/en_US/pdfs/literature/ug/ug_soc_eds.pdf

Related Articles

Performance Analysis of Inter-Processor Communication Methods:
http://www.design-reuse.com/articles/24254/inter-processor-communication-multi-core-processors-reconfigurable-device.html

Communicating Efficiently between QorIQ Cores in Medical Applications:
https://cache.freescale.com/files/32bit/doc/brochure/PWRARBYNDBITSCE.pdf

Linux Inter-Process Communication:
http://www.tldp.org/LDP/tlk/ipc/ipc.html

Linux locking mechanisms (from ARM):
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0425/ch04s07s03.html

OpenMCAPI:
https://bitbucket.org/hollisb/openmcapi/wiki/Home

Mutex Examples:
http://www.thegeekstuff.com/2012/05/c-mutex-examples/

Thank You

Full Tutorial Resources Online

Project Wiki Page:
http://rocketboards.org/foswiki/Projects/BuildingMultiProcessorSystems

Includes:
Source code
Hardware source
Hardware Quartus Projects
Software Eclipse Projects

BACKUP SLIDES

Post-Lab 1 Additional Topics

Hardware Design Flow and FPGA Boot with U-Boot and SD/MMC

Building Hardware: Qsys (Hardware System Design Tool) User Interface

[Figure: Qsys screenshot with callouts: interfaces exported in/out of system; connections between cores]

Hardware and Software Work Flow Overview

[Figure: work flow diagram with blocks: Quartus & Qsys, Preloader & U-Boot, Device Tree, RBF, Eclipse DS-5 & Debug Tools]

Inputs:
Hardware Design (Qsys or RTL or both)

Outputs (to load on boot media):
Preloader and U-Boot images
FPGA programming file: Raw Binary Format (RBF)
Device Tree Blob

SD Card Layout

Partition 1: FAT
U-Boot scripts
FPGA HW designs (RBF)
Device Tree Blobs
zImage
Lab material

Partition 2: EXT3 Rootfs

Partition 3: Raw
U-Boot/preloader

Partition 4: EXT3 Kernel src

Updating SD Cards

File: zImage, soc_system.rbf, soc_system.dtb, u-boot.scr
Update procedure: mount DOS SD card partition 1 and replace the file with the new one:
$ sudo mkdir sdcard
$ sudo mount /dev/sdx1 sdcard/
$ sudo cp <file_name> sdcard/
$ sudo umount sdcard

File: preloader-mkpimage.bin
Update procedure:
$ sudo dd if=preloader-mkpimage.bin of=/dev/sdx3 bs=64k seek=0

File: u-boot-socfpga_cyclone5.img
Update procedure:
$ sudo dd if=u-boot-socfpga_cyclone5.img of=/dev/sdx3 bs=64k seek=4

File: root filesystem
Update procedure:
$ sudo dd if=altera-gsrd-image-socfpga_cyclone5.ext3 of=/dev/sdx2

More info found on Rocketboards.org:
http://www.rocketboards.org/foswiki/Documentation/GSRD141SdCard

Automated Python script to build SD cards:
make_sdimage.py

Post-Lab 2 Additional Topic

Using Eclipse to Debug: NIOS Software Build Tools

Altera NIOS Software Design and Debug Tools

Nios II SBT for Eclipse key features:
New project wizards and software templates
Compiler for C and C++ (GNU)
Source navigator, editor, and debugger
Eclipse project-based tools
Download code to hardware

Key Multi-Processor System Design Points

Startup/Shutdown
Processor
Peripheral
Covered in Lab 1

Communication between processors
What is the physical link?
What is the protocol & messaging method?
Message bandwidth & latency
Covered in Lab 2

Partitioning peripherals
Declare dedicated peripherals: only connected/controlled by one processor
Declare shared peripherals: connected/controlled by multiple processors
Decide upon locking mechanism
Covered in Lab 3

Post Lab 3 Additional Topic


Altera SoC Embedded Design Suite

Altera Software Development Tools


Eclipse
For ARM Cortex-A9 (ARM Development Studio 5 Altera Edition)
For NIOS

Pre-loader/U-Boot Generator
Device Tree Generator
Bare-metal Libraries
Compilers
GCC (for ARM and NIOS)
ARMCC (for ARM with license)

Linux Specific
Kernel Sources
Yocto & Angstrom recipes: http://rocketboards.org/foswiki/Documentation/AngstromOnSoCFPGA_1
Buildroot: http://rocketboards.org/foswiki/Documentation/BuildrootForSoCFPGA

System Development Flow

Stages in both flows: Design, Simulate, Debug, Release

FPGA Design Flow (Hardware Development)
Quartus II design software
Qsys system integration tool
Standard RTL flow
Altera and partner IP
ModelSim, VCS, NCSim, etc.
AMBA-AXI and Avalon bus functional models (BFMs)
SignalTap II logic analyzer
System Console
Quartus II Programmer
In-system Update

Software Design Flow (Software Development)
Eclipse
GNU toolchain
OS/BSP: Linux, VxWorks
Hardware Libraries
Design Examples
GDB, Lauterbach, Eclipse
Flash Programmer

Inside the Golden System Reference Design

Complete system example design with Linux software support

Target Boards:
Altera SoC Development Kits
Arrow SoC Development Kits
Macnica SoC Development Kits

Hardware Design:
Simple custom logic design in FPGA
All source code and Quartus II / Qsys design files for reference

Software Design:
Includes Linux kernel and application source code
Includes all compiled binaries

Topics (Back Up)

Introductions: Altera and SoC FPGAs

Development Tools
How to Build Hardware: FPGA Hardware Tools & Build Flow
How to Build Software: Software Tools & Build Flow
How to Debug all of the above: Debug Tools

Key Multi-processor System Design Points
Hardware design
Shared peripherals
Available Hardware IP
Software design
Message protocols
Linux tools/mechanisms available today

Quartus Hardware Development Tool

Quartus II User Interface

Quartus II main window provides a high level of visibility to each stage of the design flow
Project navigator provides direct visual access to most of the key project information
Tasks window allows you to use the tools and features of the Quartus II software and monitor their progress from a flow-based layout
Tool View window shows various tools and design files
Messages window outputs messages from each process of the run

[Figure: Quartus II main window with callouts: Project Navigator, Tool View window, Tasks window, Messages window]

Typical Hardware Design Flow

Project definition
Project creation

Design creation: design entry/RTL coding and early pin planning
Behavioral or structural description of design
Early pin planning allows board development in parallel

Functional verification
Verify design behavior

Design compilation (logic, memory, I/O)
Synthesis (mapping)
Translate design into device-specific primitives
Optimization to meet required area and performance constraints
Placement and routing (fitting)
Place design in specific device resources with reference to area and performance constraints
Connect resources with routing lines

Timing analysis
Verify performance specifications were met
Static timing analysis

Functional verification
Verify design will work in target technology

PC board simulation and test
Simulate board design
Program and test device on board

In-system debug
On-chip tools for debugging

Quartus II Feature Overview

Fully integrated development tool
Multiple design entry methods
Includes intellectual property (IP) based system design
Up-front I/O assignment and validation
Enables printed circuit board (PCB) layout early in the design process
Incremental compilation
Reduces design compilation time and improves timing closure
Logic synthesis
Includes comprehensive integrated synthesis solution
Advanced integration with third-party EDA synthesis software
Timing-driven placement and routing
Physical synthesis
Improves performance without user intervention
Verification solution
TimeQuest timing analyzer
PowerPlay power analysis and optimization
Functional simulation
On-chip debug and verification suite

Quartus II Feature Overview (1/2)

Feature: Quartus II Software

Project creation: New project wizard
Design entry: HDL editor; Schematic editor; State machine editor; MegaWizard Plug-In Manager (customization and generation of IP); Qsys system integration tool
Design constraint assignments: Assignment editor; Pin planner; Synopsys Design Constraint (SDC) editor
Synthesis: Quartus II Integrated Synthesis (QIS); Third-party EDA synthesis; Design assistant
Fitting and placing design into FPGA to meet user requirements: Fitter (including physical synthesis)
Design analysis and debug: Netlist viewers; Advisors
Power analysis: PowerPlay power analyzer

Quartus II Feature Overview (2/2)

Feature: Quartus II Software

Static timing analysis on post-fitted design: TimeQuest timing analyzer
Viewing and editing design placement: Chip Planner
Functional verification: ModelSim-Altera edition; Third-party EDA simulation tools
Generation of device programming file: Assembler
On-chip debug and verification: SignalTap II (embedded logic analyzer); In-system memory content editor; Logic analyzer interface editor; In-system sources and probes editor; SignalProbe pins; Transceiver Toolkit; External memory interface toolkit
Technique to optimize design and improve productivity: Quartus II incremental compilation; Physical synthesis optimization; Design Space Explorer (DSE)

Quartus II Subscription Edition vs. Web Edition

Device supported
Subscription Edition: MAX series devices: All (excluding MAX7000 / 3000); Cyclone III/IV/V FPGAs: All; Arria II/V FPGAs: All; Stratix III, IV, V FPGAs: All; Cyclone V SoCs: All
Web Edition: MAX series devices: All (excluding MAX7000 / 3000); Cyclone V FPGAs: All (excluding 5CEA9, 5CGXC9, and 5CGTD9); Cyclone III/IV FPGAs: All; Arria II GX FPGA: EP2AGX45; Cyclone V SoCs: All

Software features: incremental compilation and team-based design, SSN Analyzer, Transceiver Toolkit
Subscription Edition: Yes
Web Edition: No

SignalTap II, SignalProbe
Subscription Edition: Yes
Web Edition: If TalkBack feature is enabled

Multi-processor support
Subscription Edition: Yes
Web Edition: If TalkBack feature is enabled

IP Base Suite MegaCore functions
Subscription Edition: Yes
Web Edition: No license required for OpenCore Plus hardware evaluation; license fee required for production use

Platform support
Subscription Edition: Windows 32/64-bit, Linux 32/64-bit
Web Edition: Windows 32/64-bit, Linux 32/64-bit

License and maintenance terms
Subscription Edition: Perpetual (continues to work after expiration)
Web Edition: No license required except for IP core

Price
Web Edition: Free

How to Get Started Using Quartus II Software

Download Quartus II software today and start designing with Altera programmable logic devices

Quartus II Handbook - http://www.altera.com/literature/lit-qts.jsp
Guides you through the programmable logic design cycle from design to verification
Also covers third-party EDA vendor tool interfaces

Online demonstrations - http://www.altera.com/quartusdemos
Easiest way to learn about the latest Quartus II software features and design flows

Training classes - https://mysupport.altera.com/etraining
Offers online training classes and live presentations coupled with hands-on exercises to learn about Quartus II features and design flows

Qsys System Integration Platform

[Figure: Qsys feature overview with labels: High-Performance Interconnect (based on Network-on-a-Chip (NoC) architecture); Design Reuse; Hierarchy (design system, package as IP, add to library); Real-Time System Debug; Automated Testbench Generation; Industry-Standard Interfaces (AMBA AXI, APB, AHB; Avalon interfaces)]

Qsys is Altera's design environment for:
Deployment of IP, with hierarchical support
Development platform for Altera custom solutions
Design platform for customers to quickly create system designs

Qsys User Interface

[Figure: Qsys screenshot with callouts: toolbar; interfaces exported for hierarchy; improved validation display]

Qsys Benefits

Raises the level of design abstraction
System-level design and system visualization

Simplifies complex hierarchical system development
Automated interconnect generation

Provides a standard platform
IP integration, custom IP authoring, IP verification

Enables design re-use
Reduces time to market
System-level design reduces development time
Facilitates verification

Qsys improves productivity

Network-on-Chip Architecture

Transaction Layer (master side): converts transactions to command packets and response packets to responses (Avalon-MM, AXI-MM)
Transport Layer: transfers packets to destination (Avalon-ST)
Transaction Layer (slave side): converts command packets to transactions and responses to response packets (Avalon-MM, AXI-MM)

[Figure: Master Interface, Master Network Interface, Avalon-ST Network (Command), Slave Network Interface, Slave Interface; a separate Avalon-ST Network (Response) connects the same network interfaces for responses]

Benefits of Network-On-Chip Approach

See white paper: Applying the Benefits of NoC Architecture to FPGA System Design

Independent implementation of transaction/transport layers
Different transport layer network topologies can be implemented without transaction layer modification
e.g. high-performance components on a wide high-frequency crossbar network

Supports standard interface interoperability
Mix and match interface types on transaction layer without transport layer modification

Scalability
Segment network into sub-networks using:
Bridges
Clock crossing logic

Industry-Standard Interfaces

Standard interface protocols:
Avalon Interfaces
AMBA AXI, AMBA APB, and AMBA AHB

Qsys supports mixing of different interfaces

Target Qsys Applications

Qsys can be used in almost every FPGA design
Designs fall into two categories:
Control plane
Memory mapped
Reading and writing to control and status registers
Data plane
Streaming
Data switching (muxing, demuxing), aggregation, bridges

Packets... I care about Latency!

Qsys packet format is wide
Packet format contains a complete transaction in a single clock cycle
Supports:
Writes with 0 cycles of latency
Reads with a round-trip latency of 1 cycle

You can control latency via Qsys configuration

Separate command and response networks
Increases concurrency
Command traffic and response traffic don't compete for resources

Qsys: Wide Range of Compliant IP

Wide range of plug-and-play intellectual property (IP):

Interface protocol IP
E.g. PCIe, Ethernet 10/100/1000 Mbps (Triple-Speed Ethernet), Interlaken, JTAG, UART, SPI

External memory interface IP
E.g. DDR/DDR2/DDR3

Video and imaging processing (VIP) IP
E.g. VIP Suite including scaler, switch, deinterlacer, and alpha blending mixer

Embedded processor IP
E.g. hardened ARM processor system, Nios II processor

Verification IP
E.g. Avalon-MM/-ST, AXI4, APB

>100 Qsys-compliant IP cores available

Qsys as a Platform for System Integration

Library of available IP:
Interface protocols
Memory
DSP
Embedded
Bridges
PLL
Custom systems

Connect IP and systems
[Figure: IP 1, IP 2, IP 3, Custom 1 and Custom 2 blocks connected into a system, with HDL output]

Accelerate development
Simplify integration
Automate error-prone integration tasks

Additional Resources

Watch online demos (3-5 min): www.altera.com/qsys
Complete the Qsys tutorial (2-3 hrs): www.altera.com/qsys
Watch free webcasts (10-15 min): www.altera.com/qsys
Sign up for Qsys training: www.altera.com/training

In-system Verification

Debug Challenges
Accessing and viewing internal signals
Not enough pins to use as test points
Capabilities in creating trigger conditions that correctly capture data
Verification of standard or proprietary protocol interfaces
Overall design process bottleneck

Debug can be costly

On-chip Debug
Access and view internal signals
Store captured data in FPGA embedded memory
Use JTAG interface as debug ports
Incrementally add internal signals to view

Reduce debug cycles by using on-chip debug tools

On-chip Debug Technology

Debug tools communicate with the FPGA via standard JTAG interface
Multiple debug functions can share the JTAG interface simultaneously
Altera's system-level debugging (SLD) hub technology makes this possible
All Altera tools and some third-party tools support the SLD hub JTAG interface

[Figure: Download Cable connects to the JTAG TAP Controller in the FPGA, which feeds the SLD Hub; the hub connects to Node 1, Node 2, ..., Node N-1, Node N in the user's design (core logic)]

On-chip Debug Tools in Quartus II Software

SignalTap II logic analyzer
Captures and displays hardware events, fast turnaround times
Incrementally creates trigger conditions and adds signals to view
Uses captured data stored in on-chip RAM and JTAG interface for communication

In-system memory content editor
Displays content of on-chip memory
Enables modification of memory content in a running system

External logic analyzer interface
Uses external logic analyzer to view internal signals
Dynamically switches internal signals to output

In-system sources and probes
Stimulate and monitor internal signals without using on-chip RAM

Exception: SignalProbe incremental routing feature does not use JTAG interface (i.e. SLD hub technology)
Quickly routes an internal node to a pin for observation

SignalTap II Logic Analyzer

Provides the most advanced triggering capabilities available in an FPGA-embedded logic analyzer
Proven to be invaluable in the lab
Captures bugs that would take weeks of simulation to uncover
Has broad customer adoption

Features and benefits
An embedded logic analyzer
Uses available internal memory
Probes state of internal signals without using external equipment or extra I/O pins
Incremental compilation support
Fast turnaround time when adding signals to view
Advanced triggering for capturing difficult events/transactions
Power-up trigger support
Debug the initialization code
Megafunction support
Optionally, instantiate in HDL

In-system Memory Content Editor


Enables FPGA memory content and design constants to be updated insystem, via JTAG interface, without recompiling a design or reconfiguring
the rest of the FPGA

Fault injection into system


Update memory while system is running
Change value of coefficients in DSP applications
Easily perform what if? type experiments in-system in just seconds

Supports MIF and HEX formats for data interchange


Megafunctions supported
LPM_CONSTANT, LPM_ROM, LPM_RAM_DQ, ALTSYNCRAM (ROM and single-port
RAM mode)
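
Since the editor exchanges data in MIF format, a small script can prepare memory images offline. The following is a minimal sketch, assuming the common text layout of a Memory Initialization File (DEPTH, WIDTH, ADDRESS_RADIX, DATA_RADIX, then a CONTENT BEGIN ... END block). The function name `to_mif` and the sample coefficient values are hypothetical and not part of the original slides.

```python
def to_mif(words, width=8):
    """Render a list of unsigned integers as Memory Initialization File text."""
    limit = 1 << width
    hex_digits = (width + 3) // 4  # hex digits needed to print one word
    lines = [
        f"DEPTH = {len(words)};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT BEGIN",
    ]
    for address, word in enumerate(words):
        if not 0 <= word < limit:
            raise ValueError(
                f"word {word} at address {address} does not fit in {width} bits"
            )
        lines.append(f"    {address:X} : {word:0{hex_digits}X};")
    lines.append("END;")
    return "\n".join(lines) + "\n"


# Hypothetical 8-bit coefficient table for illustration only.
coefficients = [0x00, 0x1F, 0x7F, 0xFF]
mif_text = to_mif(coefficients)
print(mif_text)
```

The resulting text could be saved to a `.mif` file and loaded through the editor; the file name and the memory instance it targets depend on the design.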

[Figure: callout "Enable memory content editor"]

In-system Memory Content Editor


Under Tools menu: In-System Memory Content Editor

Altera SoC Embedded Design Suite

Included in SoC Embedded Design Suite (EDS)


ARM Development Studio 5 Altera Edition
Awesome debugger, especially when combined with
USB-Blaster II
Altera SoC FPGA System Trace Macrocells

Application development environment


Streamline system analyzer

Hardware Libraries
GNU-based bare-metal (EABI) compiler tools
U-Boot
Root file system to jump start software development
Pre-built Linux kernel
http://www.rocketboards.org for source trees and community access

System Development Flow

FPGA Design Flow (Hardware Development)
Stages: Design, Simulate, Debug, Release
Quartus II design software
Qsys system integration tool
Standard RTL flow
Altera and partner IP
ModelSim, VCS, NCSim, etc.
AMBA-AXI and Avalon bus functional models (BFMs)
SignalTap II logic analyzer
System Console
Quartus II Programmer
In-system Update

Software Design Flow (Software Development)
Stages: Design, Simulate, Debug, Release
ARM Development Studio 5
GNU toolchain
OS/BSP: Linux, VxWorks
Hardware Libraries
Design Examples
GNU, Lauterbach, DS5
Flash Programmer

Altera SoC Embedded Design Suite

FPGA Design Flow (Hardware Development)
Stages: Design, Simulate, Debug, Release
Quartus II design software
Qsys system integration tool
Standard RTL flow
Altera and partner IP
ModelSim, VCS, NCSim, etc.
AMBA-AXI and Avalon bus functional models (BFMs)
SignalTap II logic analyzer
System Console
Quartus II Programmer
In-system Update

Software Design Flow (Software Development)
Stages: Design, Simulate, Debug, Release
ARM Development Studio 5
GNU toolchain
OS/BSP: Linux, VxWorks
Hardware Libraries
Design Examples
Virtual Target
GNU, Lauterbach, DS5
Flash Programmer

Between the two flows: HW/SW Handoff, FPGA-Adaptive Debugging

Altera SoC Embedded Design Suite

Comprehensive Suite SW Dev Tools

Hardware / software handoff tools

Linux application development
Yocto Linux build environment
Pre-built binaries for Linux / U-Boot
Work in conjunction with the Community Portal

Bare-metal application development
SoC Hardware Libraries
Bare-metal compiler tools

FPGA-adaptive debugging
ARM DS-5 Altera Edition Toolkit

Design examples

[Figure: Hardware-to-Software Handoff, Firmware Development, Linux Application Development, FPGA-Adaptive Debugging]

Free Web Edition
Subscription Edition
Free 30-day Eval

Hardware-to-Software Handoff

[Figure: handoff files passed from Hardware to Software]
Qsys system info, SDRAM calibration files, ID / timestamp, HPS IOCSR data (system.iswinfo) -> Preloader Generator -> .c & .h source files
system.sopcinfo -> Device Tree Generator -> Linux Device Tree

Hardware / Software Handoff Tools

Allow hardware and software teams to work independently and follow their familiar design flows
Take Altera Quartus II / Qsys output files and generate handoff files for the software design flow
The Device Tree standard specifies hardware connectivity so that the Linux kernel can boot up correctly

Linux Application Development


Yocto build support for Linux
Yocto standard enables open, versatile, and cost-effective embedded software development
Allows a smooth transition to commercial Linux distributions

Pre-built Linux kernel, U-Boot, and root file system to jump-start software development
Link to community portal for source trees and community access

Bare-metal Application Development


Hardware Libraries
Software interface to all system registers
Functions to configure some basic system operations (e.g. clock speed settings, cache settings, FPGA configuration, etc.)
Support board bring-up and diagnostics development
Can be used by bare-metal application, device drivers, or RTOS

GNU-based bare-metal (EABI) compiler tools

[Figure: software stack with Application, Operating System, BSP, and Bare-metal App layers, Hardware Libraries (BMAL, HAL, PAL), and the SoC FPGA]

Golden System Reference Design


Complete system design with Linux software support
Simple custom logic design in FPGA
All source code and Quartus II / Qsys design files for reference
Includes all compiled binaries; the example can run on an Altera SoC Development Kit to jump-start development

DS-5 Altera Edition: One Tool, Three Usages

1 JTAG-Based Debugging
Board Bring-up
OS porting, Drivers Dev, Kernel Debug
System Integration, System Debug

2 Application Debugging
Linux User Space Code
RTOS App Code

3 FPGA-Adaptive Debugging

One Device, Two Debugging Tools?

ARM DS-5 Toolkit: dedicated JTAG connection (DSTREAM); visualize & control CPU subsystem
Altera Quartus II Software: dedicated JTAG connection; visualize & control FPGA

Debugging Barrier
No single tool/cable to visualize and control both CPU and FPGA domains
No way for CPU and FPGA to cross trigger and correlate hardware and software events
No fixed debugger can address the needs of changing FPGA hardware

Industry First: FPGA-Adaptive Debugging

[Figure: Altera USB-Blaster II Connection]

ARM Development Studio 5 (DS-5) Altera Edition Toolkit
Removes debugging barrier between CPUs and FPGA
Exclusive OEM agreement between Altera and ARM
Result of innovation in silicon, software, and business model

FPGA-Adaptive Debugging Features


Single USB-Blaster II cable for simultaneous SW and HW debug
Automatic discovery of FPGA peripherals and creation of register views
Hardware cross-triggering between the CPU and FPGA domains
Correlation of CPU software instructions and FPGA hardware events
Simultaneous debug and trace for Cortex-A9 cores and CoreSight-compliant cores in FPGA
Statistical analysis of software load and bus traffic spanning the CPUs and FPGA

DS-5 Altera Edition


Productivity-Boosting Features
Industry's most advanced multicore debugger for ARM
JTAG-based system-level debugging and gdbserver-based application debugging in one package
Yocto plugin to enable Linux-based application development
Integrated OS-aware analysis and debug capability

Visualization of SoC Peripherals


Register views assist the debug of FPGA peripherals
File generated by FPGA tool flow
Automatically imported in DS-5 Debugger

Debug views for debug of software drivers
Self-documenting
Grouped by peripheral, register and bit-field

[Figure: CMSIS peripheral register descriptions]
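
To illustrate what a register view grouped by register and bit-field presents, here is a minimal sketch that splits a raw register value into named fields. The register layout in `CTRL_FIELDS`, the field names, and the sample value are all hypothetical; a real layout would come from the peripheral description file generated by the FPGA tool flow.

```python
# Hypothetical control register layout:
# field name -> (least significant bit position, width in bits)
CTRL_FIELDS = {
    "enable": (0, 1),
    "irq_mask": (1, 1),
    "mode": (4, 2),
    "divider": (8, 8),
}


def decode_register(value, fields):
    """Split a raw register value into a dict of named bit-field values."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in fields.items()
    }


# Made-up raw value, as if read from the peripheral.
raw_value = 0x1A21
decoded = decode_register(raw_value, CTRL_FIELDS)
for name, field_value in decoded.items():
    print(f"{name:10s} = {field_value}")
```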

FPGA-Adaptive, Unified Debugging


FPGA connected to debug and trace buses for non-intrusive capture and visualization of signal events
Simultaneous debug and trace connection to CPU cores and compatible IP
Correlate FPGA signal events with software events and CPU instruction trace using triggers and timestamps

Cross-Domain Debug 1
Trigger from software world to FPGA world
SOFTWARE TRIGGER

HARDWARE TRIGGER!

Cross-Domain Debug 2
Trigger from FPGA world to software world
HARDWARE TRIGGER

EXECUTION STOP
OR
HW TRACE TRIGGER

EXECUTION STOP
OR
SW TRACE TRIGGER

Correlate HW and SW Events


Debug event trigger point set from either:
SignalTap II Logic Analyzer
or
DS-5 debugger

Captured trace can then be analyzed using timestamp-correlated events

[Figure: ARM DS-5 Toolkit and SignalTap II Logic Analyzer views, timestamp correlated]
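
The idea of timestamp-correlated analysis can be sketched offline: given one list of software events and one list of hardware events, both stamped from a shared time base, merge them into a single timeline and inspect what happened around a trigger. This is only an illustration, not how DS-5 or SignalTap II store or process trace data; the `Event` type, the event labels, and the timestamp values are hypothetical.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: int  # value of a shared timestamp counter (units assumed)
    domain: str     # "SW" for software trace, "HW" for FPGA signal capture
    label: str


def correlate(sw_events, hw_events):
    """Merge two timestamp-sorted event lists into one ordered timeline."""
    return list(heapq.merge(sw_events, hw_events, key=lambda e: e.timestamp))


def events_near(timeline, trigger_time, window):
    """Return events within +/- window of the trigger timestamp."""
    return [e for e in timeline if abs(e.timestamp - trigger_time) <= window]


# Made-up capture data for illustration.
sw_events = [
    Event(100, "SW", "write DMA start register"),
    Event(250, "SW", "enter interrupt handler"),
]
hw_events = [
    Event(120, "HW", "FIFO write enable asserted"),
    Event(240, "HW", "irq line asserted"),
]

timeline = correlate(sw_events, hw_events)
for event in events_near(timeline, trigger_time=240, window=20):
    print(event.timestamp, event.domain, event.label)
```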

System-Level Performance Analysis
Performance bottlenecks in SoCs often come from the CPU interaction with the rest of the SoC
Streamline visualizes software activity with performance counters from the SoC and FPGA to enable full system-level analysis
Streamline only requires a TCP/IP connection to the SoC

ARM DS-5 Streamline


Linux OS Counters

Processor Counters,
Aggregated, or Per Core

Power Consumption
FPGA Block Counters

Process/Thread Heat Map

Application Events

Altera SoC EDS: Key Benefits


One-stop shop from Altera
All the tools and examples for rapid starts
Familiar tools interface, easy to use
Share tools and knowledge to increase team productivity

Best multicore debugger tools for ARM architecture


Unprecedented visibility and control across processor cores and across CPU, FPGA domains

Faster time to market, lower development costs!

Target Users and Usages

Usage                     Web Edition   Subscription Edition
Board Bring-up            -             Yes
Device Drivers Dev        -             Yes
OS Porting                -             Yes
Baremetal Programming     -             Yes
RTOS Based App Dev        -             Yes
Linux Based App Dev       Yes           Yes
Multicore App Debugging   -             Yes
System Debugging          -             Yes

SoC EDS Editions Summary

Edition columns: Web Edition, Subscription Edition, 30-Day Evaluation

Component                          Key Feature
Hardware/Software Handoff Tools    Preloader Image Generator
                                   Flash Image Creator
                                   Device Tree Generator (Linux)
ARM DS-5 Altera Edition            Eclipse IDE
                                   ARM Compiler*
                                   Debugging over Ethernet (Linux)
                                   Debugging over USB-Blaster II JTAG
                                   Automatic FPGA Register Views
                                   Hardware Cross-triggering
                                   CPU/FPGA Event Correlation
Compiler Tool Chains               Linaro Tool Chain (Linux)
                                   CodeBench Lite EABI (Bare-metal)
Hardware Libraries                 Bare-metal programming Support
SoC Programming Examples           Golden System Reference Design

*ARM Compiler is available in DS-5 Professional Edition, available directly from ARM

Coordinated Multi-Channel Delivery

Altera.com
Quartus II
Programmer
SignalTap II

Altera.com

RocketBoards.org

Pre-built Binaries
Kernel
U-Boot
Yocto
Minimal RFS
Tool chains
Handoff tools
HW Libraries
Examples
Documentation

Frequent Updates
Kernel source
U-Boot source
Yocto source
RFS source
Toolchain source
Public git
Wiki
Mailman

Partners
BSPs
Middleware
3rd Party Tools

Altera NIOS Software Design Tools


Nios II SBT for Eclipse key features:
New project wizards and software templates
Compiler for C and C++ (GNU)
Source navigator, editor, and debugger
Eclipse project-based tools