Tutorial 07 FPGA Clock Signals


Programmable Logic Devices

Tutorial 7
Michal Kubíček
Department of Radio Electronics, FEEC BUT Brno
Created with the support of the OP VVV project Moderní a otevřené studium techniky (Modern and Open Study of Technology), CZ.02.2.69/0.0/0.0/16_015/0002430.
Tutorial 7

❑ Clock signal distribution in FPGAs
❑ Clock management
❑ Slow clock signals, clock enabling

page 2 kubicek@vutbr.cz
Clock signals in FPGA

Clock signal distribution on FPGAs


Clocking infrastructure
An FPGA contains many components related to clock signal distribution and
management. Together they are called CLOCK RESOURCES.

❑ Dedicated network for clock signal distribution – clock tree (low-skew, high-fanout). Large FPGAs feature several levels of clock distribution (regional / global).
❑ Buffers and multiplexers for clock signal inputs, signal conditioning, switching...
❑ Blocks for clock signal modification – Clock Management; based on PLL or DLL.


Clock tree
• Low skew, Low propagation delay: same (low) delay from a source buffer
(BUFG, BUFH...) to all destination nodes

• High fanout: capable of driving many nodes (thousands)

• Primarily for clock distribution but can be used for other high fanout signals, like
CLOCK ENABLE, SET/RESET....

[Figure: combinational blocks KČ1, KČ2 and KČ3 between registers (REG) that share the clock clk]


Example:
Spartan-3E


Example:
Virtex-7


Regional clock structure detail: Xilinx 7-series

Global and regional buffers for clock signal distribution


IO clock structure detail: Xilinx 7-series

I/O buffers for clock signal distribution

How to use the clock resources


Clock signal resources


Clock for flip-flops: VHDL code inference

PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        cnt <= cnt + 1;
    END IF;
END PROCESS;

For the clk signal, the global clock network is used automatically.
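For context, the process above can be placed in a complete design unit. The following is a minimal sketch only: the entity name counter_demo, the 8-bit width and the cnt_out port are assumptions made for illustration and are not part of the original slide.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical wrapper around the counter process from the slide.
ENTITY counter_demo IS
    PORT (
        clk     : IN  STD_LOGIC;                      -- clock, expected on the global clock network
        cnt_out : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)    -- counter value (width is an assumption)
    );
END ENTITY counter_demo;

ARCHITECTURE rtl OF counter_demo IS
    SIGNAL cnt : UNSIGNED(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- Only clk is in the sensitivity list and all assignments are
    -- under rising_edge(clk), so flip-flops are inferred.
    PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            cnt <= cnt + 1;    -- wraps around at 255
        END IF;
    END PROCESS;

    cnt_out <= STD_LOGIC_VECTOR(cnt);
END ARCHITECTURE rtl;
```

No clock buffer is instantiated here; the sketch relies on the tools inserting one for clk, as the slide describes.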


Clock signal resources


Clock gating: VHDL code inference?

PROCESS (clk_250M, clk_enable) BEGIN
    IF clk_enable = '0' THEN
        clk <= '0';         -- stop clock (low power mode)
    ELSE
        clk <= clk_250M;    -- normal operation
    END IF;
END PROCESS;

Not reliable!!!


Clock signal resources


Clock MUXing: VHDL code inference?

PROCESS (clk_250M, clk_10M, set_high_speed) BEGIN
    IF set_high_speed = '1' THEN
        clk <= clk_250M;    -- fast clock for processing
    ELSE
        clk <= clk_10M;     -- slow clock for idle operation
    END IF;
END PROCESS;

Not reliable!!!


Manual instantiation
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGCE_inst : BUFGCE
    port map (
        O  => O,     -- Clock buffer output
        CE => CE,    -- Clock enable input
        I  => I);    -- Clock buffer input


Manual instantiation
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGMUX_inst : BUFGMUX
    port map (
        O  => O,     -- Clock MUX output
        I0 => I0,    -- Clock0 input
        I1 => I1,    -- Clock1 input
        S  => S);    -- Clock select input


Clock resources (Xilinx 7-series)

Glitch-free clock switching (and more)

BUFGCTRL is designed to switch between two clock inputs without the possibility of a glitch. When the presently selected clock transitions from High to Low after S0 and S1 change, the output is kept Low until the other (to-be-selected) clock transitions from High to Low. Then the new clock starts driving the output.
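A possible way to use BUFGCTRL as a plain glitch-free 2:1 clock switch is sketched below. It assumes the Xilinx 7-series UNISIM library; the entity name clk_switch_demo and the ports clk_a, clk_b, sel and clk_out are hypothetical. The select is applied through S0/S1 with CE0/CE1 tied high and the IGNORE inputs tied low, and all generics are left at their defaults, which is a simplification.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
LIBRARY UNISIM;
USE UNISIM.vcomponents.ALL;

-- Hypothetical glitch-free clock switch (names are illustrative).
ENTITY clk_switch_demo IS
    PORT (
        clk_a, clk_b : IN  STD_LOGIC;    -- the two source clocks
        sel          : IN  STD_LOGIC;    -- '0' selects clk_a, '1' selects clk_b
        clk_out      : OUT STD_LOGIC     -- switched clock on the global network
    );
END ENTITY clk_switch_demo;

ARCHITECTURE structural OF clk_switch_demo IS
    SIGNAL sel_n : STD_LOGIC;
BEGIN
    sel_n <= NOT sel;

    BUFGCTRL_inst : BUFGCTRL
        PORT MAP (
            O       => clk_out,
            CE0     => '1',      -- both inputs permanently enabled
            CE1     => '1',
            I0      => clk_a,
            I1      => clk_b,
            IGNORE0 => '0',      -- keep the glitch-free switching behaviour
            IGNORE1 => '0',
            S0      => sel_n,    -- exactly one of S0/S1 is high at a time
            S1      => sel
        );
END ARCHITECTURE structural;
```

The sel input is assumed to be free of glitches itself; any requirements on its timing relative to the two clocks depend on the device family and should be checked in the vendor documentation.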


Clock signal input to the FPGA


Preferred method: use dedicated clock capable pins
Library UNISIM;
use UNISIM.vcomponents.all;

IBUFG_inst : IBUFG port map (        -- PIN input buffer
    O => clk_50,                     -- Clock buffer output
    I => clk_50_PIN );               -- Clock buffer input

IBUFGDS_inst : IBUFGDS port map (    -- diff. pair PIN input buffer
    O  => clk_sys,                   -- Clock buffer output
    I  => clk_in_P,                  -- Diff_p clock buffer input
    IB => clk_in_N );                -- Diff_n clock buffer input


Clock input to FPGA


There are dedicated pins suitable for clock signal input (marked as GC, CC, MRCC, SRCC...).
They can connect signals directly to clock resources (BUFG, CMT...). But beware of the many
rules and exceptions specific to each FPGA family!

Clock input to FPGA


In some cases a violation of those tricky rules is not fatal. However, any workaround usually incurs
a penalty that may cause problems during Static Timing Analysis (STA).

Library UNISIM;
use UNISIM.vcomponents.all;

IBUF_inst : IBUF port map (    -- PIN input buffer
    O => clk_50_AUX,           -- Buffer output
    I => clk_50_PIN );         -- Buffer input

BUFG_inst : BUFG port map (    -- internal signal buffer
    O => clk_50,               -- Clock buffer output
    I => clk_50_AUX );         -- Clock buffer input

[Figure: clk_50_PIN → IBUF → BUFG → clk_50]


Clock Management
Blocks for clock signal conditioning: CMT, DCM, PLL, DLL, MMCM...


Spartan-3: Digital Clock Manager


Clock signal conditioning (phase shifting, synthesis)


Use of DCM (signals reset and locked)


[Figure: clk_50M_pin → IBUFG → clk_50M_ibufg → DCM with a reset input and a locked output; the DCM generates clk_50M, clk_33M, clk_100M and clk_250M]
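One common way to use the locked output is to hold downstream logic in reset until the generated clocks are stable. The sketch below is an assumption about typical usage, not taken from the slide: the entity name locked_reset, the rst_100M output and the four-stage release delay are all illustrative choices.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical reset generator driven by the DCM "locked" output.
ENTITY locked_reset IS
    PORT (
        clk_100M : IN  STD_LOGIC;    -- one of the DCM output clocks
        locked   : IN  STD_LOGIC;    -- '1' once the DCM outputs are valid
        rst_100M : OUT STD_LOGIC     -- active-high reset for the clk_100M domain
    );
END ENTITY locked_reset;

ARCHITECTURE rtl OF locked_reset IS
    SIGNAL rst_sr : STD_LOGIC_VECTOR(3 DOWNTO 0) := (OTHERS => '1');
BEGIN
    -- Reset is asserted asynchronously while the DCM is not locked
    -- and released synchronously a few clock cycles after lock.
    PROCESS (clk_100M, locked) BEGIN
        IF locked = '0' THEN
            rst_sr <= (OTHERS => '1');
        ELSIF rising_edge(clk_100M) THEN
            rst_sr <= rst_sr(2 DOWNTO 0) & '0';
        END IF;
    END PROCESS;

    rst_100M <= rst_sr(3);
END ARCHITECTURE rtl;
```

A separate instance would be needed for each output clock domain, since the release has to be synchronous to the clock of the logic being reset.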


7-series: Clock Management Tile (CMT)

Up to 24 CMTs in a single FPGA


7-series: MMCM block diagram


Mixed Mode Clock Manager


7-series: PLL block diagram


Phase Locked Loop


7-series: MMCM use case


Synchronous clock domains


[Figure: clk_50M_pin → clk_50M_ibufg → DCM with a reset input and a Locked output; the DCM generates clk_50M, clk_33M, clk_100M and clk_250M]



[Figure: waveforms of clk_50M_pin, clk_50M and clk_100M]

In this case the clk_50M and clk_100M clock domains are synchronous ➔ there is no need to
use synchronizers on these clock domain boundaries to transfer data or control signals. The static
timing analysis tool can correctly analyze all the necessary timing parameters.
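A direct transfer between two such synchronous domains might look like the sketch below. It assumes that clk_50M and clk_100M come from the same clock manager with a fixed phase relationship; the entity name sync_domains_demo and the 8-bit data ports are hypothetical.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical example: data crosses from clk_50M to clk_100M directly.
ENTITY sync_domains_demo IS
    PORT (
        clk_50M  : IN  STD_LOGIC;
        clk_100M : IN  STD_LOGIC;    -- assumed phase-aligned to clk_50M
        data_in  : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        data_out : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
    );
END ENTITY sync_domains_demo;

ARCHITECTURE rtl OF sync_domains_demo IS
    SIGNAL data_50M : STD_LOGIC_VECTOR(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- Launch register in the slow domain.
    launch: PROCESS (clk_50M) BEGIN
        IF rising_edge(clk_50M) THEN
            data_50M <= data_in;
        END IF;
    END PROCESS launch;

    -- Capture register in the fast domain; no synchronizer is used,
    -- the path is left to static timing analysis.
    capture: PROCESS (clk_100M) BEGIN
        IF rising_edge(clk_100M) THEN
            data_out <= data_50M;
        END IF;
    END PROCESS capture;
END ARCHITECTURE rtl;
```

This only holds as long as the tools know the relationship between the two clocks; if the clocks came from unrelated sources, the same code would need a proper clock domain crossing.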


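Synchronous clock domains

[Figure: two flip-flops (D Q) in series, the first clocked by clk_120M and the second by clk_100M; waveforms of clk_100M (T = 10 ns) and clk_120M (T = 8.33 ns) with the smallest edge spacing of 1.66 ns marked]
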
In this case the clk_100M and clk_120M domains are also synchronous, but because of the specific
frequency ratio some edge pairs leave a very tight timing budget (1.66 ns in this case). In practice
this requires synchronizers between these clock domains (they must be treated as asynchronous).
The STA considers all possible edge combinations and requires the design to meet the SETUP and
HOLD requirements for the strictest one (worst case).
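The tight timing budget can be checked with a short calculation. The sketch below assumes ideal clocks whose rising edges coincide at some instant; under that assumption the smallest non-zero spacing between an edge of one clock and an edge of the other equals the greatest common divisor of the two periods.

```latex
\begin{align*}
T_{100} &= 10\ \text{ns}, \qquad T_{120} = \tfrac{25}{3}\ \text{ns} \approx 8.33\ \text{ns} \\
5\,T_{100} &= 6\,T_{120} = 50\ \text{ns} \quad \text{(the edge pattern repeats every 50 ns)} \\
\Delta t_{\min} &= \gcd\!\left(T_{100},\, T_{120}\right)
  = \gcd\!\left(\tfrac{30}{3},\, \tfrac{25}{3}\right)\text{ns}
  = \tfrac{5}{3}\ \text{ns} \approx 1.67\ \text{ns}
\end{align*}
```

The exact value is 5/3 ns, which appears as 1.66 ns when truncated to two decimal places. It occurs, for example, between the clk_120M edge at 8.33 ns and the clk_100M edge at 10 ns.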


Zynq 7000: Clocking Wizard

VIVADO example

Slow clock signals


Why use a low-frequency clock (Hz, kHz)?

Significant saving of HW resources in naturally slow-acting blocks:
❑ User interfaces – buttons, keyboards, LEDs, simple displays
❑ Slow communication interfaces – UART, SPI, I2C...
❑ ...

[Figure: waveforms of clk_slow and clk_fast]


How (not) to generate a slow clock


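❑ Direct clock division using logic is not recommended:

[Figure: waveforms of clk, the ideal clk_div and the real (delayed) clk_div; schematic of a D flip-flop clocked by clk whose inverted output nQ feeds D and drives clk_div]

clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;
    END IF;
END PROCESS clk_slow_gen;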

The delay of the clock-dividing circuitry causes the new clock to be asynchronous to
the original one (the delay is unpredictable and varies with each implementation).


Derived clock signal distribution


Without special care the new clock signal is distributed using the general-purpose
interconnect. This results in excessive skew and subsequent setup/hold time violations.

clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;
    END IF;
END PROCESS clk_slow_gen;

[Figure: D flip-flop clocked by clk whose output Q drives clk_div]

proc_cnt_slow: PROCESS (clk_div) BEGIN
    IF rising_edge(clk_div) THEN
        cnt_slow <= cnt_slow + 1;
    END IF;
END PROCESS proc_cnt_slow;



Use a dedicated Clock Tree for clock signal distribution
The derived clock signal is routed through a dedicated buffer (e.g. BUFG for Xilinx FPGAs).
Usually a structural description is used to instantiate one.
Using the dedicated clock buffer that drives the clock tree eliminates the SKEW.
The problem of asynchronicity between the primary and the derived clock domain persists.

Library UNISIM;
use UNISIM.vcomponents.all;

BUFG_inst : BUFG                   -- internal signal buffer
    PORT MAP (
        O => clk_div_BUFGOUT,      -- Clock buffer output
        I => clk_div               -- Clock buffer input
    );

[Figure: clk_div → BUFG → clk_div_BUFGOUT]

clk_slow_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        clk_div <= NOT clk_div;
    END IF;
END PROCESS clk_slow_gen;

BUFG_inst : BUFG
    PORT MAP (
        O => clk_div_BUFGOUT,
        I => clk_div );

[Figure: clk_div → BUFG → clk_div_BUFGOUT, marked with a warning (!!!)]
❑ There is always a delay between the primary and the derived clock domain; this delay is
not well defined and may change with each implementation attempt.
❑ The SKEW is eliminated.
❑ The source of the derived clock signal must be a REGISTER, never a LUT or other
combinatorial block (as their output may contain glitches).

The delay between the primary and the derived clock domain may cause SETUP / HOLD timing problems on
signals crossing the clock domain boundary ➔ synchronizers are a must!

There is no problem with a large logic load (large fan-out) on the clock net, as the global clock
tree is designed for that.
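The synchronizers mentioned above are often built as two flip-flops in series. The sketch below shows this for a single-bit signal only; the entity name sync_2ff and the port names are hypothetical, and the two-stage depth is a common default rather than something stated on the slide.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical two-flip-flop synchronizer for ONE bit.
ENTITY sync_2ff IS
    PORT (
        clk_dst  : IN  STD_LOGIC;    -- clock of the receiving domain
        async_in : IN  STD_LOGIC;    -- signal coming from the other clock domain
        sync_out : OUT STD_LOGIC     -- version safe to use in the clk_dst domain
    );
END ENTITY sync_2ff;

ARCHITECTURE rtl OF sync_2ff IS
    SIGNAL meta   : STD_LOGIC := '0';    -- first stage, may go metastable
    SIGNAL stable : STD_LOGIC := '0';    -- second stage, given a full period to settle
BEGIN
    PROCESS (clk_dst) BEGIN
        IF rising_edge(clk_dst) THEN
            meta   <= async_in;
            stable <= meta;
        END IF;
    END PROCESS;

    sync_out <= stable;
END ARCHITECTURE rtl;
```

This structure is not suitable for multi-bit buses, because the individual bits may be captured in different clock cycles; those need a different crossing technique.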


A better solution?
❑ Clock enabling – any lower frequency can be used (with a resolution of one primary clock period).

❑ Clock Management – using dedicated clock conditioning blocks available in FPGAs. Several clock signals with different frequencies can be derived. The lowest frequency is usually limited to about 1 to 10 MHz (the clock conditioners are partially analog circuits based on PLL or DLL).

❑ A combination of the Clock Management and the Clock enabling techniques.

Clock Enabling


Clock Enabling

All the Flip-Flops in the design (even those that should run on a slow clock) share a common clock signal (usually of a relatively high frequency). Switching of the Flip-Flops can be enabled or disabled (slowed down) using a dedicated Clock Enable (CE) signal.
Benefits: fewer clock domains, fewer synchronizers.

[Figure: D flip-flop with a CE input; waveforms of clk and CE]


Clock Enabling

[Figure: left, a flip-flop whose D input is fed by a 2:1 multiplexer controlled by Clock Enable; right, a flip-flop with a dedicated CE input]

Left: typical implementation of the CLOCK ENABLE (CE) functionality.

Right: Flip-Flops in most FPGAs feature a dedicated CE input ➔ no additional hardware (LUTs, routing) is needed for the CE functionality.


Clock Enabling: generate and use the CE Signal


clk: main (system) clock signal, 125 MHz
clk_EN: 1:4 => 1/5 * 125 MHz = 25 MHz

clk_EN_gen: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        IF cnt_div = MAX THEN
            cnt_div <= (OTHERS => '0');
            clk_EN  <= '1';
        ELSE
            cnt_div <= cnt_div + 1;
            clk_EN  <= '0';
        END IF;
    END IF;
END PROCESS clk_EN_gen;

slow_proc: PROCESS (clk) BEGIN
    IF rising_edge(clk) THEN
        IF clk_EN = '1' THEN
            ...
        END IF;
    END IF;
END PROCESS slow_proc;
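The two processes can be combined into one self-contained design unit, sketched below. The entity name ce_gen_demo, the DIVIDE generic, the integer counter type and the 8-bit cnt_slow counter are assumptions added for illustration; the slide does not show declarations or the body of the slow process.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical complete example of clock enabling.
ENTITY ce_gen_demo IS
    GENERIC (
        DIVIDE : POSITIVE := 5    -- clk_EN is high once every DIVIDE clk cycles
    );
    PORT (
        clk          : IN  STD_LOGIC;    -- main clock, e.g. 125 MHz
        cnt_slow_out : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
    );
END ENTITY ce_gen_demo;

ARCHITECTURE rtl OF ce_gen_demo IS
    SIGNAL cnt_div  : NATURAL RANGE 0 TO DIVIDE - 1 := 0;
    SIGNAL clk_EN   : STD_LOGIC := '0';
    SIGNAL cnt_slow : UNSIGNED(7 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- Generates a one-cycle-wide enable pulse every DIVIDE cycles.
    clk_EN_gen: PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            IF cnt_div = DIVIDE - 1 THEN
                cnt_div <= 0;
                clk_EN  <= '1';
            ELSE
                cnt_div <= cnt_div + 1;
                clk_EN  <= '0';
            END IF;
        END IF;
    END PROCESS clk_EN_gen;

    -- Runs on the same clock but only advances when clk_EN = '1',
    -- so it behaves like logic clocked at f_clk / DIVIDE.
    slow_proc: PROCESS (clk) BEGIN
        IF rising_edge(clk) THEN
            IF clk_EN = '1' THEN
                cnt_slow <= cnt_slow + 1;
            END IF;
        END IF;
    END PROCESS slow_proc;

    cnt_slow_out <= STD_LOGIC_VECTOR(cnt_slow);
END ARCHITECTURE rtl;
```

With DIVIDE = 5 and a 125 MHz clock, cnt_slow advances at 25 MHz while every flip-flop stays in the single clk domain.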


Clock Enabling
clk: main (system) clock signal, 125 MHz
EN_1: 1:1 => 1/2 * 125 MHz = 62.5 MHz
EN_2: 1:3 => 1/4 * 125 MHz = 31.25 MHz
EN_3: 1:4 => 1/5 * 125 MHz = 25 MHz
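The ratios above can be summarized in one formula. The sketch reads the notation 1:n as one enabled clock cycle followed by n idle cycles, which is an interpretation of the slide rather than something it states explicitly.

```latex
\[
f_{EN} = \frac{f_{clk}}{1+n}, \qquad f_{clk} = 125\ \text{MHz}:
\quad n = 1 \Rightarrow 62.5\ \text{MHz},
\quad n = 3 \Rightarrow 31.25\ \text{MHz},
\quad n = 4 \Rightarrow 25\ \text{MHz}
\]
```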


Wrong technique of using clock enable


Clock Gating in VHDL code ➔ combinatorial logic in the clock signal path. Not for FPGAs!

clk_switch: PROCESS (clk_EN, clk_in) BEGIN
    IF clk_EN = '1' THEN
        clk_sys <= clk_in;
    ELSE
        clk_sys <= '0';
    END IF;
END PROCESS clk_switch;

[Figure: clk_EN and clk_in drive a combinatorial gate whose output clk_sys clocks the flip-flops]


Allowed usage of CE
For Clock Gating it is necessary to use a dedicated glitch-free clock buffer with
enable input (not available in all FPGAs).
Library UNISIM;
use UNISIM.vcomponents.all;
...
BUFGCE_inst : BUFGCE
    port map (
        O  => clk_sys,    -- Clock buffer output
        CE => clk_EN,     -- Clock enable input
        I  => clk_in);    -- Clock buffer input


❑ Instead of generating very slow signals it is often much more efficient to use a small microcontroller (a soft IP core) that can run at a relatively high clock frequency (the same as the rest of the design).

❑ Any slow actions (delays) are software defined with no additional HW cost.

❑ It is much easier to write, modify and debug software (C or even assembler) than hardware (VHDL, Verilog) ➔ faster development.

❑ Once used, the microcontroller can often take over other tasks (especially more complex algorithmic ones) to offload the logic.

❑ The use of a microcontroller for such tasks usually results in a significant saving of hardware resources (LUTs, Flip-Flops).

Thank You for Your Attention!
