Home Computer Science Hardware Security and Trust: Design and Deployment of Integrated Circuits in a Threatened Environment
Implementation of Delay-Based PUFs on Altera FPGAs
Linus Feiten, Matthias Sauer and Bernd Becker
Altera is—besides Xilinx—the largest manufacturer of fleld-programmable gate arrays (FPGAs) and their devices are widely used. Over the years, there have been several variants of the Cyclone FPGA series. The first version based on 130 nm process technology was introduced in 2002, followed by the Cyclone II (90 nm, 2004), Cyclone III (65 nm, 2007), Cyclone IV (60 nm, 2009) and Cyclone V (28nm, 2011). Despite advancements from version to version, general architectural concepts are sustained, setting Altera FPGAs apart from the Xilinx architecture. In the course of this chapter, the reader will be introduced to the Cyclone architecture and be enabled to put the concepts of delay-based PUFs  into practice. This is done using the Altera design software Quartus II and the hardware description language VHDL. The communication between the FPGA and a PC is done via the JTAG interface using the scripting language TCL and custom commands provided by Altera SignalTap II. Beforehand, however, a short summary of delay-based PUFs is given and their application in FPGAs as opposed to application-specific integrated circuits (ASICs) is put into context.
The purpose of a PUF is to have a unique signature (typically in form of a binary number) associated with each device, that is generated from the device’s unique physical characteristics. This signature can be used to tell the device apart from other devices or even be part of a cryptographic protocol to allow only the device
L. Feiten (B) • M. Sauer • B. Becker
© Springer International Publishing Switzerland 2017 N. Sklavos et al. (eds.), Hardware Security and Trust, DOI 10.1007/978-3-319-44318-8_11
owner access to certain resources. Instead of a PUF generating such signature, it could simply be stored in the non-volatile memory of a device. However, an attacker might physically access this memory with comparatively little effort (e.g. [6, 17]). The advantage of a PUF is that the signature is only generated when needed and stored temporarily, making an attack much more difficult. Furthermore, as the PUF signature is generated from physical characteristics, an invasive attack on the device is likely to disturb these characteristics such that the signature is altered.
Whether the PUF signature can be read-out directly from the device or whether it remains concealed within it depends on the implementation. In the former case, an attacker might read-out the signature and possibly forge a device with the same signature; in the latter case—if the signature is only used inside of the device, e.g. as the seed for an asymmetric cryptographic key pair —an attacker has to go through much greater efforts to possibly obtain the secret signature. Using the PUF output as a cryptographic seed, however, requires perfect reliability, because just a single bit-flip in the seed leads to a completely different key. As perfect reliability is hardly achieved in any PUF, error correction schemes  must be used. If the signature is read-out directly, non-perfect reliability can be compensated by considering a signature as “correct” when enough signature bits have their expected values.
To prevent an attacker from learning the whole signature of a device—intending to forge an identical device—so-called strong PUFs  have the potential of generating a signature of exponential length. To identify a device, only a subset of the whole signature is sampled, depending on a challenge. An attacker is hampered, because there are too many challenges to read out the complete signature, and he does not know which challenges are used by the legitimate owner to identify the device. Weak PUFs on the other hand, only generate a manageable amount of signature bits which could—if the signature is not concealed—be read out and stored by an attacker. Whether an attacker is able to forge the signature of a device depends on the technology and the PUF. Mostly, this will prove rather difficult as the physical device characteristics depend on uncontrollable variances in the production process. Only with extensive effort would it be possible to, e.g. alter the capacitances of single transistors that a PUF yields its responses from.
The implementation of PUFs on FPGAs—as opposed to ASICs—brings some peculiarities with it as their functionality is not hardwired. Instead, their reconfigurable hardware is configured to realise any feasible functionality being encoded in a so-called bitstream file. The bitstream is loaded to volatile memory elements of the FPGA any time it is powered up. We call such an FPGA configuration an FPGA design. During the creation of a design, the designer uses a hardware description language like VHDL or Verilog to describe the netlist defining all gates, memory elements and interconnects. The FPGA vendor’s design software then maps those elements to the configurable FPGA hardware and compiles the distributable bitstream.
While the netlist and its mapping to the FPGA hardware is still “human-readable”, the bitstream is definitely not. In fact, it is considered a security feature when the netlist cannot be reverse-engineered from the bitstream. FPGA vendors therefore make it a secret how their bitstream compilers really work and try to obscure how the bitstream relates to the actual FPGA configuration. However, there have been attempts to break this kind of “security by obscurity” , which is why some newest FPGA types provide bitstream encryption, where the bitstream is encrypted with a secret key that is also stored in a battery-powered memory of the FPGA.
There are several reasons why the bitstream should not be reverse-engineerable. The first being that the copyright owner of an FPGA design put a lot of research and development costs into its creation. While the netlist can be kept as a business secret, the bitstream cannot as it must be distributed in a file or non-volatile memory connected to the FPGA. Both can be easily accessed. When a business competitor manages to reverse-engineer a regularly purchased bitstream, he could reuse it for his own products evading the expensive research and development costs. Another scenario in which the bitstream should not be reverse-engineerable is the implementation of PUFs. Because if all placement and routing details of the PUF circuitry are known to an attacker, it is a lot easier for him to manipulate only the relevant hardware components to make the PUF produce another signature. Furthermore—in case of a concealed weak PUF—he could compile an FPGA design with just the same but non-concealed PUF.
Given that the bitstream cannot be reverse-engineered, a customer is able to programme an arbitrary amount of FPGAs with it. Such “overproduction” can be harmful to the business of an FPGA design vendor, who might want to sell bitstreams with licenses for limited usage. Bitstream encryption alone does not prevent this, as the customer must also possess the key to decrypt the bitstream. With a PUF, however, the vendor can link bitstreams to specific FPGAs by encoding the expected PUF signature into the logic of the design. Thus, the implemented hardware will only start functioning on an FPGA where the PUF produces the expected signature . To link a bitstream to an FPGA, the vendor must have access to the customer’s FPGA once in order to sample its unique signature.
There are many different kinds of PUF implementations on FPGAs; e.g. arbiter PUFs , butterfly PUFs , TERO-PUFs  or SRAM-PUFs [1, 7]. This chapter focuses on delay-based PUFs in general and how to implement them on Altera Cyclone FPGAs. The output of a delay-based PUF is determined by the delays of certain circuit lines. The delay determines how long a change from high voltage (logic 1) to low voltage (logic 0) or vice versa takes to travel through a conducting circuit line. In the production process of integrated circuits, unavoidable process variations lead to slightly different delays on each individual chip. For instance, the higher the resistance of a line, the greater its delay. Thus, these delays can be used as device-specific physical characteristics from which the PUF signature is generated. The delays are also highly dependent on operating temperature and voltage. Delay- based PUFs generally compensate such fluctuations by using the relative differences between circuit lines instead of absolute measurements.
A very popular kind of delay-based PUFs is the ring oscillator PUF (RO-PUF) that has received much research attention due to its relative simplicity and stability (e.g.
L. Feiten et al.
Fig. 11.1 A single ring oscillator (RO)
Fig. 11.2 An RO-PUF with m ROs, multiplexers, counters and comparators
[5, 10, 19-21]). We will therefore take it as the running example for this chapter, but the demonstrated methods can be used to implement any kind of delay-based PUF. The core of an RO-PUF is a set of ring oscillators (ROs). Figure 11.1 shows the circuit diagram of an RO consisting of a single NAND gate and a number of delay elements (e.g. buffers). The delay elements are just passing on the signal. Sometimes also an odd number of inverters is used as delay elements. When the enable input is set to 1, the RO starts oscillating; i.e. a constant alternation between 1 and 0 is observable at the output. The oscillation frequency depends on the delay of the circular path, which is different for each device.
Figure 11.2 shows the diagram of a classical RO-PUF circuit . The challenge selects two ROs for comparison and passes their oscillating signals to the counters. The counterfed by the faster RO eventually holds the greater value and the comparator returns 1 or 0 as the PUF response for this particular challenge. If the relative differences between the ROs are different on each device, each device has its unique signature.
As each response bit is created by comparing two ROs, there is a total of p'(p,-1) bits. But because there are only p! possibilities for p ROs to be ordered by their frequencies, the entropy of the RO-PUF’s signature is in fact log2( p!) . The authors of  introduced a post-processing procedure to generate even more signature bits from the same amount of ROs, but for simplicity we regard the standard method here. Thus, the RO-PUF belongs to the class of weak PUFs, as it does not generate an exponential amount of response bits. The term weak does not imply that the PUF is security-wise broken but rather implies that the PUF response should be concealed and impossible to be read out directly; otherwise an attacker might copy all the values and forge a device with the same response.
The remainder of this chapter is organised as follows. Section11.2 introduces the Altera Cyclone FPGA architecture and explains how ROs are mapped to it. Section 11.3 describes in a step-by-step manner how to implement all parts of a functioning RO-PUF using the Quartus II design software and VHDL. It furthermore explains how the placement and routing of the RO circuitry must be enforced instead of leaving it to the compiler. Section 11.4 shows how to establish the communication between the FPGA and a computer via the JTAG interface. In Sect. 11.5, we share what furthermore has to be taken care of to ensure the PUF quality before Sect. 11.6 concludes the chapter.
|< Prev||CONTENTS||Next >|