PASC: Physically Authenticated Stable-Clocked SoC Platform on Low-Cost FPGAs
Aydin Aysu and Patrick Schaumont
Electrical and Computer Engineering Department
Virginia Tech
Blacksburg, VA, USA
e-mail: { aydinay, schaum }@vt.edu
Abstract— Generation of device-unique digital signatures using Physically Unclonable Functions (PUFs) has been an active area of research for the last decade. However, most PUFs are conceived and designed as stand-alone hardware modules. In contrast, this paper proposes a PUF architecture that is tightly integrated into the core of a system-on-chip (SoC), with the purpose of creating a physical SoC authentication mechanism. The proposed PUF is integrated into the custom instruction interface of the NIOS-II processor. Therefore, PUF challenges can be issued by instruction calls, which allows run-time authentication and enables implementation of flexible post-processing mechanisms in software. The proposed PUF utilizes critical timing path violations of a custom instruction execution to generate digital signatures which are unique for individual chips due to random process variations. We implement PASC on a low-cost Altera DE0-Nano Development Board and we validate the quality of the authentication keys on 15 boards.
Figure 1. Critical timing Δt is the time it takes for a signal to travel from the output of register R1 through the Data Path into the register R2. If clock period τ < Δt, the critical timing is violated on this path.
Keywords—Physical Unclonable Functions; System-on-Chip Integration; HW/SW Co-design; FPGA
I. INTRODUCTION
Physical Unclonable Functions (PUFs) utilize random process variations during manufacturing of an Integrated Circuit (IC) to generate device-unique electronic signatures. The basic model of a PUF is a challenge/response mechanism where the challenge is a distinct input and a method to trigger the PUF and the response is the corresponding output. If the responses are device-unique and unclonable, then they could be used for applications such as bitstream protection [1] and tamper-resistant key storage [2].
The core idea of the proposed PUF is to use the critical timing violation frequency (1/Δt) of data paths; this quantity is expected to be device-unique [3]. Fig. 1 shows the generic structure of these PUFs. Initially, both R1 and R2 are set to a known state. At time 0, a value is launched from R1. After a device-dependent, finite amount of time (Δt), this value propagates through the datapath and is captured at R2. To find the exact value of Δt, we require a mechanism that can sweep the frequency of the clock input of R1 and R2 to determine when R2 no longer captures the expected value [3].
Fig. 2 shows the generic architecture of a PUF that avoids the need for clock sweeping. Instead of using one data path, we use several data paths which have different Δt values. If we set the critical timing of these paths as Δt1 < Δt2 < Δt3 < … < Δtn, we could use a stable clock frequency and check at which data path executions start to fail. The proposed PUF structure is built on this key idea. The sweeping mechanism is replaced with a custom instruction executed by a processor that causes critical timing violations at the nominal clock frequency.
Figure 2. Generic structure of a PUF that can utilize critical timing violations using a fixed clock source
The major contributions of this work are:
• To propose a PUF architecture that can generate device-unique responses at run-time, using only the nominal clock frequency of the processor.
• To demonstrate a physically authenticated SoC platform (PASC) that can do run-time authentication with the integrated PUF architecture. The PUF is integrated as a custom instruction into the processor. The extraction of the authentication key, including all the required post-processing, can be completely supported by software. This solution combines the uniqueness of hardware with the flexibility of software.
The rest of the paper is organized as follows: Section II gives an overview of the literature on PUFs on FPGAs and our motivation to design a new one. Section III demonstrates the principle of operation. Section IV presents our PUF, its integration into the SoC architecture, and some SW-based post-processing methods to generate authentication keys out of raw PUF responses. Section V quantifies the uniqueness and reliability of the authentication keys. Section VI concludes the paper.
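The multi-path idea from the Introduction (several data paths with Δt1 < Δt2 < … < Δtn, evaluated at one stable clock) can be illustrated with a minimal sketch. The delay values, the 100 MHz clock period, and the function name `first_failing_path` are assumptions chosen for illustration; they are not measurements or names from the paper.

```python
def first_failing_path(path_delays_ns, clock_period_ns):
    """Return the index of the first path whose delay exceeds the clock period.

    Paths are assumed to be ordered by increasing delay. If every path meets
    timing, the number of paths is returned.
    """
    for index, delay in enumerate(path_delays_ns):
        if delay > clock_period_ns:  # critical timing violated on this path
            return index
    return len(path_delays_ns)


CLOCK_PERIOD_NS = 10.0  # assumed fixed clock of 100 MHz

# Hypothetical path delays for two chips with the same design; process
# variation shifts every delay slightly.
chip_a = [9.0, 9.4, 9.8, 10.2, 10.6]
chip_b = [9.3, 9.7, 10.1, 10.5, 10.9]

print(first_failing_path(chip_a, CLOCK_PERIOD_NS))  # 3
print(first_failing_path(chip_b, CLOCK_PERIOD_NS))  # 2
```

In this toy setting the two chips start to fail at different paths even though the clock is never swept, which is the property the proposed structure relies on.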
Figure 3. The principle of operation for the PUF construction in [3] (a) and for the proposed construction (b). In [3], the input clock frequency is swept and the chip identity is measured
as a function of frequency, whereas the proposed construction uses a fixed clock input and the chip identity is measured as a function of percentage of correct
executions.
frequency, and the Y axis represents the proportion of correct executions. Due to process variations, instruction executions will have a different critical timing violation frequency on each device. We can check the correctness of the return value of an instruction by comparing it with the golden value. Therefore, this work sweeps the frequency range of the clock input to find the frequency at which the processor fails to execute instructions. The key observation here is that it is not feasible to utilize native instructions of a processor under a fixed clock input, because the chip-to-chip variation is too wide and the percentage of correct executions drops rapidly to 0%, leaving most of the chips with the same violation behavior. For example, if we select 100 MHz as the fixed frequency of the input clock in Fig. 3(a), Chip 3 and Chip 4 will have different percentages of correct executions, but Chip 1 and Chip 2 will both have 100% correct executions and Chip 5 and Chip 6 will both have 0% correct executions.
Fig. 3(b) shows the expected behavior of the proposed stable-clocked PUF. We use a fixed input clock and a custom instruction. The custom instruction is designed to drop slowly from 100% correct executions to 0% correct executions. In contrast to the behavior in Fig. 3(a), the chips in Fig. 3(b) would have different percentages of correct executions at the fixed clock frequency of 100 MHz.
IV. PHYSICAL AUTHENTICATION
The proposed physical authentication solution is not an obvious combination of PUF and software, but rather a system-level, tight integration of both. In this section, we first discuss the PUF structure, followed by its system integration and how to process the raw PUF responses to generate the authentication keys.
A. Novel PUF
Fig. 4 shows the architecture of a proposed PUF block. The PUF block architecture consists of a serial delay-register chain. In our architecture, we configured LUT_0 to LUT_47 as buffers. The output of each of LUT_16 to LUT_47 is captured in a register. These 32 registers generate a 32-bit output response that can be sent back to the processor using the custom instruction interface. The complete PUF architecture consists of 64 copies of the PUF block from Figure 4.
The input of the PUF block is the 1-bit input dataa, which is sent from the processor using the custom instruction interface. The value of dataa sent from the processor is registered when the synchronization signal start is '1'. In one clock cycle, the values of these input registers are first fed into the delay-register chain and then stored inside 32 registers.
It takes a finite amount of time to read out the value of a register and transfer it to the delay-register chain ($t_{\mathrm{read}}$), to propagate it through the chain ($t_{\mathrm{chain}}$), and to write it into a register ($t_{\mathrm{write}}$). $t_{\mathrm{read}}$, $t_{\mathrm{chain}}$, and $t_{\mathrm{write}}$ are device-unique random values. If the total amount of time required to complete this path is longer than the operating clock period, a critical timing violation occurs. Equation (1) formulates this condition, where τ denotes the clock period. If (1) is satisfied, the values of some output registers might not get updated due to the timing violation.
$$\tau < t_{\mathrm{read}} + t_{\mathrm{chain}} + t_{\mathrm{write}} \qquad (1)$$
By choosing the number of LUTs used to generate the chain carefully (the example in the figure assumes 48 LUTs, LUT_0 to LUT_47), we can set the value of $t_{\mathrm{chain}}$ to satisfy the conditions of (2) and (3), where $t_{\mathrm{chain},n}$ denotes the propagation delay through the chain up to the $n$th register.
$$\tau > t_{\mathrm{read}} + t_{\mathrm{chain},n} + t_{\mathrm{write}} \qquad (2)$$
$$\tau < t_{\mathrm{read}} + t_{\mathrm{chain},n+1} + t_{\mathrm{write}} \qquad (3)$$
If (2) and (3) are satisfied, we can utilize the fact that the value of the nth register will be updated properly whereas the value of the (n+1)th register will not, because of the critical timing violation. Since $t_{\mathrm{read}}$, $t_{\mathrm{chain}}$, and $t_{\mathrm{write}}$ are device-unique random values depending on process variation, the value of n will be random.
Now we define how to formalize the challenge-response pairs and the rationale behind this construction. If the PUF is not activated by the processor, all registers are preset to '0'. The challenge dataa input is a value where the 1-bit input is set to '1'. The scan-register chain enables us to generate a
Figure 4. Architecture of a proposed PUF block. 64 of these blocks are used to generate the authentication key.
Figure 6. High-level PUF block diagram
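The behavior of a single PUF block described in Section IV.A can be approximated with a small software model. This is only a sketch under stated assumptions: the per-LUT delay (0.30 ns nominal, 0.01 ns standard deviation), the launch and setup times, the 10 ns clock period, and the names `puf_block_response` and `make_chip` are all hypothetical and are not taken from the paper or from FPGA timing data. The model keeps the structure given in the text: 48 buffer LUTs, registers on the outputs of LUT_16 to LUT_47, registers preset to '0', and a '1' launched into the chain.

```python
import random

NUM_LUTS = 48          # LUT_0 .. LUT_47, as in the text
FIRST_TAPPED_LUT = 16  # outputs of LUT_16 .. LUT_47 are registered (32 bits)


def puf_block_response(lut_delays_ns, launch_ns, setup_ns, clock_period_ns):
    """Model one PUF block for a single clock cycle.

    A '1' is launched into the buffer chain. A tapped register captures it
    only if the signal arrives, plus setup time, within the clock period;
    otherwise the register keeps its preset value '0'.
    """
    if len(lut_delays_ns) != NUM_LUTS:
        raise ValueError("expected one delay per LUT")
    bits = []
    arrival = launch_ns
    for lut_index, delay in enumerate(lut_delays_ns):
        arrival += delay  # cumulative delay at the output of this LUT
        if lut_index >= FIRST_TAPPED_LUT:
            updated = arrival + setup_ns <= clock_period_ns
            bits.append(1 if updated else 0)
    return bits


def make_chip(seed, nominal_ns=0.30, sigma_ns=0.01):
    """Draw hypothetical per-LUT delays for one chip (assumed Gaussian spread)."""
    rng = random.Random(seed)
    return [rng.gauss(nominal_ns, sigma_ns) for _ in range(NUM_LUTS)]


for seed in (1, 2, 3):
    response = puf_block_response(make_chip(seed), launch_ns=0.5,
                                  setup_ns=0.2, clock_period_ns=10.0)
    # n is the number of registers that were updated before the violation.
    print(seed, sum(response), "".join(str(bit) for bit in response))
```

Under these assumptions each response is a run of '1's followed by a run of '0's, and the position of the transition plays the role of the device-dependent value n in the text. A real device would also show noise between repeated measurements, which this model ignores.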