Lee 2019

This article has been accepted for publication in a future issue of this journal, but has not been
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCE.2019.2942503, IEEE
Transactions on Consumer Electronics
A 20-Gbps Receiver Bridge Chip with Auto-skew Calibration for MIPI D-PHY Interface

Pil-Ho Lee and Young-Chan Jang, Member, IEEE

Abstract—A 20-Gbps receiver bridge chip featuring auto-skew calibration and continuous-time linear equalization is proposed to support the mobile industry processor interface D-PHY version 2.0 specification with four data lanes and one clock lane. The proposed receiver bridge chip performs byte synchronization and 1-to-8 deserialization for converting high-speed scalable low-voltage signals into low-speed low-voltage complementary metal-oxide semiconductor signals. The proposed auto-skew calibration has a simple architecture and is insensitive to dynamic noise owing to the use of the multiple bits supplied from the deserializer as a result of the phase detector for the skew calibration. It is performed via a four-step sequential process to use the minimum time delay. The proposed receiver bridge chip is implemented using a 0.11 μm CMOS process with a 1.2 V supply. The measured peak-to-peak time jitter of the signal recovered using the proposed receiver is 50 ps at a data rate of 5.0 Gbps/lane on a printed circuit board FR-4 10 inch channel. The proposed skew calibration reduces the time skew among the four data lanes and one clock lane to less than 10 ps.

Index Terms—mobile industry processor interface (MIPI), D-PHY, skew calibration, continuous time linear equalization, byte synchronization, deserialization.

This work was supported by the Priority Research Centers Program (2018R1A6A1A03024003) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education and the MOTIE (No. N0001883, HRD Program for Intelligent semiconductor Industry).
P.-H. Lee is with the Department of Electronic Engineering, Graduate School, Kumoh National Institute of Technology, Gumi, Korea (e-mail: leepilho@kumoh.ac.kr).
Y.-C. Jang was with the Samsung Electronics, Hwasung, Korea. He is now with the School of Electronic Engineering, Kumoh National Institute of Technology, Gumi, Korea (e-mail: ycjang@kumoh.ac.kr).

I. INTRODUCTION

As the technology of mobile display and camera has recently developed, frame rate, color depth, and resolution have rapidly increased. To satisfy these high-end demands, a new version of the mobile industry processor interface (MIPI) D-PHY is being released, as shown in Fig. 1 [1]. The MIPI D-PHY is used for display and camera interfaces in mobile applications with a short channel. The MIPI D-PHY version 1.0 specification supports data rates up to 1.5 Gbps/lane without additional design techniques to improve the signal integrity [2]. However, the MIPI D-PHY version 2.0 specification includes design techniques such as equalization and skew calibration to compensate for channel insertion loss and skew among channels to support data rates up to 4.5 Gbps/lane.

Fig. 1. Data rate per lane and deskew supported by each version of MIPI D-PHY. (Table of maximum speed in Gbps and initialization deskew for combinations of Tx and Rx D-PHY specification versions. Note in figure: skew calibration required at data rates over 2.5 Gbps.)

In general, equalization for transmitters and receivers is used to reduce inter-symbol interference noise due to channel insertion loss in the high-speed chip-to-chip interface [3]. Equalization for a receiver circuit is usually implemented using feed forward equalization (FFE) and decision feedback equalization (DFE) [4], [5]. As the channel noises such as insertion loss, cross-talk, and reflection increase, FFE and DFE schemes can be used together to improve the signal integrity [5]. However, the signal integrity of mobile interfaces with relatively short channels compared to other interfaces can be improved using only the FFE scheme, such as continuous-time linear equalization (CTLE) with a 1-tap pre and post-cursor. This results in reducing area and power consumption compared to the case of using two equalizers together.

Additionally, skew calibration has been studied in order to compensate for the mismatch of channels and transmitter and receiver circuits in the high-speed parallel interface [6]–[9]. In a previous study [7], skew compensation was performed using a multi-phase clock for a 90° phase shift of the strobe signal in signaling with a double data rate. In the conventional dynamic random-access memory (DRAM) interface using the parallel interface, a delay-locked loop (DLL) is commonly used as a multi-phase clock generator for skew calibration between data and strobe signals [10], [11]. However, the multi-phase clock generator increases the hardware and design complexity for the skew calibration. Furthermore, in one study [8], skew calibration was repeatedly performed between the two channels to eliminate the time skew in the parallel interface. This skew calibration method can increase the area and power consumption for the multi-channel skew calibration.

In this work, a receiver bridge chip, which can acquire the high-speed serial data in a field programmable gate array
0098-3063 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
(FPGA)-based frame grabber [4], [12], [13], is proposed to fully support the D-PHY version 2.0 specification for the MIPI CSI-2 [14]. The proposed receiver bridge chip is used for the high-speed interface link between the MIPI embedded chip and the FPGA chip, similar to the receiver bridge chip presented in [4]. Thus, it performs not only the level-shifting function but also 1-to-8 deserialization with byte synchronization to reduce the parallel interface speed for the FPGA. Furthermore, a receiver with CTLE suitable for compensating for the channel insertion loss in mobile applications is used in the proposed receiver bridge chip for scalable low voltage signaling (SLVS) with a data rate of 5 Gbps/lane exceeding the MIPI D-PHY version 2.0 specification. Auto-skew calibration with small hardware and low power consumption is proposed for multi-channel mobile applications. Section II presents the operation, including the block diagram of the proposed receiver bridge chip. Additionally, it describes a common-gate level shifter (CGLS) with CTLE for the receiver circuit of the SLVS interface and skew calibration using the training pattern supported by the MIPI D-PHY protocol. Section III presents the measurement results for the fabricated receiver bridge chip and the FPGA-based frame grabber using the proposed receiver bridge chip. Finally, Section IV concludes the paper.

Fig. 2. Block diagram of proposed receiver bridge chip supporting MIPI D-PHY version 2.0 specification. (Blocks shown: four data lanes, Lane0 to Lane3, each with LP_TX/RX, HSRX, Deskew, and Deserializer; Byte Sync in Lane0 producing BSYNC; a clock lane with HSRX and Clock Deskew producing CLKA; and a Skew Calibration block driving the delay codes DSK0[4:0] to DSK3[4:0] and DSKC[5:0].)
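
As a rough software illustration of the 1-to-8 deserialization with byte synchronization described above, the following Python sketch searches a serial bit stream for a sync byte and then slices the stream into 8-bit words. It is not the circuit of the proposed chip. The function names, the string representation of the bit stream, and the MSB-first bit order are assumptions made for this example; the sync value 8'b10111000 is the SEED pattern quoted with the measurement results in Section III.

```python
from typing import List

# SEED byte quoted in Section III (8'b10111000 = 8'hB8).
SEED = "10111000"


def byte_sync(bits: str, seed: str = SEED) -> int:
    """Return the index of the first bit of the seed, or -1 if it is absent."""
    return bits.find(seed)


def deserialize(bits: str, seed: str = SEED) -> List[int]:
    """Slice the stream into 8-bit words, starting at the detected seed.

    Assumption: bits are written MSB-first; trailing bits that do not
    fill a whole byte are dropped.
    """
    start = byte_sync(bits, seed)
    if start < 0:
        return []
    return [int(bits[i:i + 8], 2) for i in range(start, len(bits) - 7, 8)]


def parallel_rate_mbps(lane_rate_gbps: float, ratio: int = 8) -> float:
    """Per-pin rate of the parallel output after 1-to-ratio deserialization."""
    return lane_rate_gbps * 1000.0 / ratio


if __name__ == "__main__":
    stream = "000" + "10111000" + "10101010" + "11011100" + "101"
    print(byte_sync(stream))            # offset of the SEED in the stream
    print([hex(w) for w in deserialize(stream)])
    print(parallel_rate_mbps(5.0))      # per-pin rate in Mbps for a 5 Gbps lane
```

With a 5 Gbps lane, the 1-to-8 ratio gives 625 Mbps per parallel output pin, which agrees with the deserialized data rate reported in Section III.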
II. PROPOSED 5-GBPS/LANE RECEIVER BRIDGE CHIP

The proposed MIPI D-PHY receiver bridge chip consists of four data lanes performing high-speed and low-power modes, a clock lane, and a skew calibration block, as shown in Fig. 2. This chip is similar in configuration to a previously reported receiver bridge chip that supports the MIPI D-PHY version 1.2 specification [4]; however, it features equalization suitable for a 5 Gbps/lane receiver circuit and multi-channel auto-skew calibration to support the MIPI D-PHY version 2.0 specification. For high-speed-mode operation, the proposed receiver bridge chip performs 1-to-8 deserialization including byte synchronization to convert high-speed SLVS signals into low-speed low-voltage complementary metal-oxide semiconductor (LVCMOS) signals.

Each data lane consists of a high-speed receiver (HSRX) supporting the interface of the SLVS, a programmable delay circuit (Deskew) to reduce the delay difference among four data lanes and a clock lane, and a 1-to-8 deserializer (Deserializer) with a byte synchronizer (Byte Sync) for frame-lock operation. Furthermore, each data lane includes a transmitter and a receiver (LP_TX and LP_RX) for low-power-mode operation using LVCMOS signaling. The transceiver for low-power-mode operation of the first data lane supports a bidirectional interface. However, the other data lanes perform only the receiving function of low-power-mode operation. Furthermore, the block Byte Sync is only included in the first data lane. Its result BSYNC is supplied to other data lanes and the clock lane, and is also output to the FPGA such that the FPGA senses the byte synchronization of the MIPI CSI-2 interface. The clock lane is used to receive a clock signal for the source-synchronous clock scheme in high-speed-mode operation. It consists of a high-speed receiver and a programmable delay circuit, and it supplies a sampling clock CLKA to four data lanes. The programmable delay circuit is used to optimize the sampling time in spite of the time skew between the data lanes. Furthermore, the sampling clock CLKA is used to generate a frame clock CLKOUT. The frame clock CLKOUT is synchronized to the signal BSYNC, and its frequency is one quarter of that of the sampling clock CLKA. Thus, eight parallel data of each data lane (DO0OUT[7:0], DO1OUT[7:0], DO2OUT[7:0], and DO3OUT[7:0]) and the frame clock CLKOUT are output to the external FPGA chip with a single data rate using the source synchronous clock scheme.

Fig. 3. Training pattern for skew calibration of MIPI D-PHY version 2.0. (Data pattern shown: 16'hFFFF followed by the toggle pattern 0 1 0 1 0 1 0 1 0 1 0 1.)

The skew calibration presented in the MIPI D-PHY version 2.0 specification is initialized by the synchronous sequence (16'hFFFF) in the high-speed mode of the MIPI D-PHY and is performed using the same toggle pattern as the clock lane, as shown in Fig. 3 [1]. In this case, the minimum data length of the toggle pattern is defined as 2^10 unit intervals (UIs). The synchronous sequence entered in the first data lane activates the skew calibration mode by activating the signal BSYNC to synchronize all data lanes. Then, the skew calibration circuit detects the 8-bit output of the toggle pattern of all data lanes and controls the programmable delay circuit (Deskew). The time skews among data and clock lanes are compensated by controlling the block Deskew in each data lane. The programmable delay circuit in the clock lane (Clock Deskew) performs a 90° phase shift relative to the data to obtain the maximum time margin in the signaling with double data rate. The proposed auto-skew calibration can compensate for the time skew of the data and clock caused by the channel mismatch and process-voltage-temperature (PVT) variations through the aforementioned process. It is also performed
automatically using the predetermined training patterns without requiring any external control.

A. Auto-skew Calibration

The proposed auto-skew calibration compensates for the time skews among the four data and the clock in the MIPI D-PHY interface. The basic design strategy of the proposed auto-skew calibration is to minimize the time delay of the clock lane and each data lane required for the skew calibration. To implement this design, the proposed auto-skew calibration is performed according to the four-step sequential process shown in Fig. 4. The first step of the auto-skew calibration is to adjust the delay of the clock so that all data are sampled to a low value at the rising edge of the clock, as shown in Fig. 4(a). Next, each data lane is sequentially delayed and adjusted to the clock edge, as shown in Fig. 4(b). The adjusted data lanes are aligned based on the rising edge of the clock; thus, the time skews among four data lanes are compensated. Subsequently, the position of one UI where the data is held low is examined by sequentially delaying the clock, as shown in Fig. 4(c) [15]. Finally, the delay corresponding to half of the scanned data 1-UI is set as the clock delay; thus, the rising edge of the clock is located at the center of the data, as shown in Fig. 4(d). The aforementioned process compensates for the time skews among the four data and the clock, and ensures the maximum time margin for the stable acquisition of the received high-speed data.

Fig. 4. Sequence of auto-skew calibration (a) initial delay of clock (b) skew calibration among data (c) scan of 1-UI interval of data through clock delay (d) movement of clock to center of 1-UI interval of data.

The skew calibration is generally performed by phase detection and delay control, similar to the operation of the DLL. For the phase detection, the proposed auto-skew calibration uses the multi-bit, 8- or 32-bit, data supplied from the 1-to-8 deserializer in each lane for high-speed-mode operation instead of the result of a separately added phase detector. When the input data of the phase detector has a time jitter whose probability density function (PDF) is Gaussian with a standard deviation of σ, the average output of the phase detector is linearly proportional to the phase difference between the input data and the sampling clock ΔT for |ΔT| < 2σ, as shown in Fig. 5 [16]. The time zone of |ΔT| < 2σ is the uncertainty region, and the probability that the phase detector outputs a faulty result during this region increases. In the proposed auto-skew calibration, the effective average output of the phase detection is determined to be a low value when all the multi-bit (N-bit) results supplied from the 1-to-8 deserializer in each lane are low, as shown in Fig. 5. The effective average output of the phase detection becomes proportional to ΔT^N. As the number of multi-bits used for the phase detection increases, the uncertainty region of the effective output of the phase detection decreases. Therefore, the proposed auto-skew calibration is insensitive to dynamic noise such as time jitter, in contrast to the DLL using a single phase detector. Furthermore, the proposed auto-skew calibration does not suffer from the offset of a phase detector because it uses the results of the 1-to-8 deserializer in each lane for high-speed-mode operation for phase detection. The proposed auto-skew calibration can also be implemented as a simple structure while ensuring a stable window of the data from the results of the sampling of one UI of the data through the delay of the clock.

Fig. 5. Average output of phase detector for data with time jitter according to use of N-bit result for phase detection.

Figure 6(a) shows a block diagram of the proposed auto-skew calibration, which consists of a synchronous sequence detector (SSD) to detect the start of the proposed auto-skew calibration, a clock skew controller to control the delay of the clock, four data skew controllers to control the delay of each data, and other control logic. The skew calibration mode is activated when the synchronous sequence (16'hFFFF) is detected using the block SSD in the high-speed mode and the signal SKEWST is activated. Then, it is performed using the repeatedly received toggle pattern which is a data pattern supplied for the skew calibration, as shown in Fig. 3.

The proposed auto-skew compensation is accomplished through independent control of the programmable delay line of each lane. The programmable delay line for each lane is controlled by comparing the deserialized 8-bit data with the expected data for the toggle pattern of the four data lanes. Figure 6(b) shows a timing diagram of the proposed auto-skew calibration operation. The first step is to determine the initial delay of the clock by using the blocks Initial Clock Delay Detector, 6-bit COUNTER, and zero-data pattern detector (DPD0) in the block Clock Skew Controller. The block DPD0 detects a data pattern of 8'b10101010. At this time, the signal INIT_CLK is set to a high value by SCAN_P0 and the initial delay of the clock is determined as the minimum delay value for sampling a data pattern of 8'b10101010 in the four data lanes, as shown in Fig. 4(a). In the second step, the block 5-bit
COUNTER activated by the signal INIT_CLK increases the delay of Deskew in each data lane so that the data skew calibration is performed. This process is performed until the block DPD1 of the data skew controller of each data lane generates a high output, aligning all the falling edges of the four data to the rising edge of the clock, as shown in Fig. 4(b). The block DPD1 is a one-data pattern detector for detecting the data pattern of 8'b01010101. Then, the 1-UI window of the data is examined using the blocks DPD0 and DPD1 in the block Clock Skew Controller and controlling the delay of the clock in the period of 1-UI scan in Fig. 6(b). Finally, the rising edge of the clock is centered in the valid data window according to the operation of the 90° Phase Shifter in the block Clock Skew Controller at the time of End of HS-mode, as shown in Figs. 4(d) and 6(b).

Fig. 6. Proposed auto-skew calibration (a) block diagram (b) timing diagram.

Figures 7(a) to (d) show the simulated results of the proposed auto-skew calibration at a data rate of 5 Gbps/lane. When the worst time skew among four data is 110 ps and the average time skew between the data and the clock is 105 ps, these values are reduced to 5 ps and 3 ps, respectively, by performing the proposed auto-skew calibration.

Fig. 7. Simulated results at 5 Gbps/lane (a) initial skew of clock and data (b) initial delay clock (c) skew calibration of data (d) skew calibration of clock.

B. Programmable Delay Line for Deskew

The programmable delay line is used in each data lane and the clock lane to reduce the time skew. It consists of a combination of a coarse delay line and a fine delay line to implement a high resolution and wide range while reducing the absolute delay time, as shown in Fig. 8(a). Additionally, a deglitch circuit is proposed to eliminate the glitch noise that can be generated in the programmable delay line. The coarse delay line adjusts the delay time by controlling the number of NAND gates used for the signal path [17]. The controllable time resolution, which is determined as the delay time of two NAND gates, is approximately 60 ps, and the total adjustable delay time is approximately 475 ps. The fine delay line consists of inverters with capacitive loads controlled using a three-bit binary code of DSK[2:0]. The controllable time resolution and maximum controllable delay time of the fine delay line are approximately 7.7 ps and 60 ps, respectively. For maintaining the monotonicity of the programmable delay line, the maximum controllable delay time of the fine delay line should not exceed the controllable time resolution of the coarse delay line.

Glitch noise can be generated at the output of the programmable delay line when the signal path of the NAND gate activated by the control of the coarse delay line of the programmable delay line is abruptly changed. Glitches are unintentional transitions or short pulses that occur in the signal before being stabilized in a digital circuit. The glitch noise generated in the skew-calibration mode causes errors in data sampling by generating a phase shift of the clock signal during frequency division in the clock lane. Consequently, the skew calibration is not performed normally. The proposed deglitch circuit removes glitch noise from the output of the programmable delay line by creating a disable window on the output of the coarse delay line to prevent this error. Figure 8(b) shows a timing diagram of the proposed deglitch circuit. Because the coarse delay line is controlled in the direction in which its delay time is increased by the skew-calibration process, the glitch noise of the clock signal is caused by the change of DSKC[3], i.e., the least significant bit (LSB) of the
control code of the coarse delay line in the skew-calibration process. Therefore, the disable window signal CLKBLK is generated from the transition of DSKC[3]. As shown in Fig. 8(b), the signal CLKBLK is generated by the shift register and the XOR gate from DSKC[3]. The glitch noise is removed by disabling the output NAND gate using the signal CLKBLK in the interval where glitch noise can occur. Consequently, the output signal of the deglitch circuit, i.e., NDOUT, becomes a low state during a period in which glitch noise can occur in the output signal of the coarse delay line (CLK), resulting in a stable signal with no glitch noise. Figure 9 shows the simulated results for the delay update of the clock signal with a frequency of 2.5 GHz, verifying the operation of the deglitch circuit of the programmable delay line. When the control code of the delay line, i.e., SD[1], is updated, the glitch generated in the signal CLK is removed from the output signal NDOUT.

Fig. 8. Programmable delay line (a) circuit diagram (b) timing diagram of deglitch circuit.

Fig. 9. Simulated results of deglitch circuit.

C. High-speed Receiver with CTLE

The high-speed receiver (HSRX) of each data lane, which receives an input signal of SLVS, consists of two amplifiers, as shown in Fig. 10(a). The first amplifier is a CGLS with a CTLE using pre-emphasis [4]. The second amplifier, which consists of a differential amplifier with source degeneration, performs de-emphasis by reducing the voltage gain at the low frequency [18]. Figure 10(b) shows the simulated frequency response of the high-speed receiver. The first amplifier has a voltage gain of 5.7 dB at approximately 2.5 GHz, and the second amplifier reduces the voltage gain by 5.3 dB at the low frequency. Thus, the high-speed receiver performs equalization of 11 dB at a frequency of 2.5 GHz.

Fig. 10. High-speed receiver with CTLE (a) circuit diagram (b) frequency response.

III. CHIP IMPLEMENTATION AND MEASUREMENT RESULTS

The proposed receiver bridge chip was implemented using a 0.11 μm CMOS process with a 1.2 V supply. Figure 11 shows the microphotograph and layout of the fabricated chip. The total area of the receiver bridge chip including pads is 2.21 mm × 2.21 mm and the active area of each data lane is 0.069 mm².

Fig. 11. Implemented chip (a) microphotograph (b) layout.

Fig. 12. Measured eye diagram of high-speed receiver at 5 Gbps/lane on PCB FR-4 10 inch (a) input signal (b) output signal w/o equalization (c) output signal w/ equalization.

The power consumption of each data lane, including the high-speed receiver, auto-skew calibration circuit, and deserializer, is approximately 5.6 mW/Gbps. The quad flat no-leads package with 88 pins was used in the implemented receiver bridge chip. In the interface for a camera module, 10 pins are used for the four data lanes and one clock lane of the MIPI D-PHY interface. In the LVCMOS interface between the receiver bridge chip and the FPGA, two clock pins, 32 parallel-data pins, and eight low-power-data pins are used. Other pins are used for the power supply and control signals.
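
The pin and power figures above can be cross-checked with simple arithmetic. The sketch below is only a bookkeeping aid, not part of the reported design. The names are made up for this example. The split into two pins per MIPI lane and two low-power pins per data lane is an assumption based on the differential (P/M) signal names in Fig. 2. The per-lane power estimate assumes that the 5.6 mW/Gbps figure applies at the 5 Gbps/lane rate.

```python
DATA_LANES = 4
CLOCK_LANES = 1
PACKAGE_PINS = 88          # quad flat no-leads package
POWER_MW_PER_GBPS = 5.6    # per data lane, as reported above


def pin_budget() -> dict:
    """Count interface pins and what remains for power supply and control."""
    mipi = (DATA_LANES + CLOCK_LANES) * 2   # assumption: one P/M pair per lane
    parallel = DATA_LANES * 8               # 8-bit deserialized output per lane
    low_power = DATA_LANES * 2              # assumption: P/M pins per data lane
    clock = 2                               # two clock pins, as reported
    used = mipi + parallel + low_power + clock
    return {
        "mipi": mipi,
        "parallel": parallel,
        "low_power": low_power,
        "clock": clock,
        "other": PACKAGE_PINS - used,
    }


def lane_power_mw(rate_gbps: float) -> float:
    """Estimated power of one data lane at the given data rate."""
    return POWER_MW_PER_GBPS * rate_gbps


if __name__ == "__main__":
    print(pin_budget())
    print(lane_power_mw(5.0))
```

Under these assumptions, 52 of the 88 pins carry interface signals, leaving 36 for the power supply and control signals, and a data lane running at 5 Gbps would draw roughly 28 mW.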

Figure 12 shows the signal-integrity performance of the high-speed receiver. This measurement was conducted by supplying data with a data rate of 5 Gbps/lane, a common mode voltage of 200 mV, and a differential peak-to-peak amplitude of 400 mV generated by an external pattern generator. In addition, a printed circuit board (PCB) FR-4 10 inch channel with a differential insertion loss of −13 dB at a frequency of 2.5 GHz was used in this measurement. Figure 12(a) shows an eye diagram with an open width and height of 0.425 UI and 40 mV at the input node of the high-speed receiver. The measured peak-to-peak time jitter is 115 ps. The CTLE of the proposed high-speed receiver improved the time jitter to 50 ps by increasing the bandwidth and compensating for the insertion loss at a data rate of 5 Gbps/lane, as shown in Figs. 12(b) and (c). These results indicate that the proposed high-speed receiver with the CTLE improves the signal integrity and supports a bandwidth greater than the maximum speed of the MIPI D-PHY version 2.0 specification.

Figure 13 shows the simulated and measured delay times, differential nonlinearity (DNL), and integral nonlinearity (INL) for the programmable delay line. The three simulated results in Fig. 13(a) indicate a monotonic increase in the adjustable delay time according to the sequential increase control of DSK[5:0] with PVT variations. For the case of Simulated data3, i.e., the worst case, the controllable time resolution is increased to a maximum value of 26 ps, but this result is still allowed for skew calibration at a data rate of 5 Gbps/lane. The measured total delay time of the programmable delay line is approximately 475 ps and the resolutions of the fine and coarse delay lines are approximately 7.7 ps and 60 ps, respectively. Because the measured DNL and INL of the programmable delay line are –0.144/+0.305 LSB and –0.395/+0.063 LSB, respectively, as shown in Fig. 13(b), the delay line is suitable for the proposed auto-skew calibration.

Fig. 13. Results of programmable delay line (a) simulated and measured adjustable delay times (b) measured DNL and INL.

Figure 14 shows the measured results for the deserialized output data and the signal BSYNC, which is the byte synchronization signal generated by detecting the SEED, at a data rate of 5 Gbps/lane. The measured data rate of the deserialized 8-bit data and SEED is 625 Mbps. This measurement was performed using a high-speed oscilloscope and synchronizing with the signal BSYNC because of the speed limitation of the LVCMOS interface supported by the logic analyzer. The operation of 1-to-8 deserialization of the MIPI D-PHY version 2.0 specification, including the detection of SEED data 8'b10111000, was evaluated.

Figure 15(a) shows the input pattern for evaluating the operation of the proposed auto-skew calibration. The high-speed input data of the data lane0 and lane1, i.e., D0INP/INM and D1INP/INM, have the same pattern with a time skew less than 0.5 UI, based on the clock signal CLKINP/INM as a test pattern. The test pattern contains a synchronous sequence pattern for the data alignment. In contrast, the high-speed input data of the data lane2 and lane3, i.e., D2INP/INM and D3INP/INM, have the same test pattern with a time skew greater than 0.5 UI. The skew calibration scheme was evaluated using shmoo plots of the error of the 32-bit output according to the time delay of the clock signal CLKINP/INM in the normal mode after the skew-calibration mode. Additionally, the evaluation was
conducted at a data rate of 3 Gbps/lane owing to the speed limitation of the external equipment. Figures 15(b) and (c) show the measured results of the 1-to-8 deserialization according to the skew calibration. In this measurement, the expected data from the data lane0 to the data lane3 are 8'hB8, 8'hB8, 8'h33, and 8'h33, respectively. As shown in Fig. 15(b), the parallel data synchronized with the synchronous sequence (8'hB8) is output normally in the cases of the data lane0 and lane1. However, the parallel data of the data lane2 and lane3 include error data, as the high-speed input data of the data lane2 and lane3 have a test pattern with a time skew greater than 0.5 UI. The measured results in Fig. 15(c) indicate that the 1-to-8 deserialization was successfully performed using the proposed auto-skew calibration. Figure 15(d) shows the shmoo plot of the 32-bit output obtained via the skew control according to the time delay of CLKINP/INM. The time skew among the input data is approximately 30 ps and that between the four data and the clock is approximately 150 ps. These time skews are calibrated to less than 10 ps, as shown in Fig. 15(e).

Fig. 14. Measured results of 1-to-8 deserialization outputs and BSYNC. (DO0OUT[7:0] sequence shown after byte synchronization: 10111000 (SEED), 10111001, 10101010, 11011100, 10110110, 10011110, 00000110, 10111001.)

Fig. 15. Measured results of skew calibration at 3 Gbps/lane (a) input pattern (b) 32-bit output data w/o skew calibration (c) 32-bit output signal w/ skew calibration (d) shmoo plot according to clock delay w/o skew calibration (e) shmoo plot according to clock delay w/ skew calibration. (Panel (a): skew calibration mode with 16 UI and 2^12 UI segments, followed by normal HS mode data on D0/1INP and D2/3INP. Panels (d) and (e): CLK delay axis from −200 ps to 350 ps for LANE0/1 and LANE2/3, with regions marked Data Correct, Data Shift, and Data Error.)

TABLE I
SUMMARY OF MAIN CHARACTERISTICS OF PROPOSED RECEIVER BRIDGE CHIP

Items                                             MIPI D-PHY version 2.0      This work
Maximum data rate                                 4.5 Gbps                    5 Gbps
Minimum receivable input signal, eye width        0.5 UI                      0.425 UI
Minimum receivable input signal, eye height       80 mV                       40 mV
Maximum differential insertion loss of channel    −11 dB @ 2.5 GHz            −13 dB @ 2.5 GHz
Deserialization including byte synchronization    Yes                         Yes
Skew calibration                                  Yes                         Yes
Clock skew after skew calibration                 min. 0.4 UI, max. 0.6 UI    min. 0.46 UI, max. 0.54 UI

Table I shows the main characteristics of the proposed receiver bridge chip based on the measurement results. The proposed receiver bridge chip has a data rate of 5 Gbps/lane and supports a bandwidth of 20 Gbps using four data lanes. The requirements for input signals that can be received in the receiver bridge chip were reduced by performing equalization on differential insertion loss greater than the differential insertion loss presented in the MIPI D-PHY version 2.0 specification. Furthermore, the proposed auto-skew calibration met the MIPI D-PHY version 2.0 specification.

Figure 16 presents an FPGA-based frame grabber supporting the MIPI CSI-2 using the MIPI D-PHY version 2.0 specification. This FPGA-based frame grabber supports a total bandwidth of 20 Gbps by using the proposed four-lane receiver bridge chip. For the demonstration of high-resolution image data acquisition, a 10-bit Bayer color camera module with 24 megapixels including horizontal and vertical blanks and a frame rate of 60 Hz, the FPGA-based frame grabber with the proposed receiver bridge chip, and a personal computer (PC) were serially connected. Figure 16 shows the experimental
results for the demonstration of high-resolution image data acquisition. The MIPI CSI-2 interface between the camera module and the proposed receiver bridge chip and the parallel interface between the proposed receiver bridge chip and the FPGA were evaluated by checking the image displayed on the PC monitor. The implemented MIPI D-PHY receiver bridge chip was operated at a data rate of 5 Gbps/lane and supported a total bandwidth of 20 Gbps using four data lanes.

Fig. 16. Demonstration of FPGA-based frame grabber using proposed receiver bridge chip.

Table II compares the characteristics of the proposed receiver bridge chip with previous reports on high-speed receivers and skew calibration used in parallel interfaces. The proposed receiver bridge chip supports data rates up to 5 Gbps/lane and performs auto-skew calibration for the five lanes including the clock lane, in contrast to the previous report [4]. With these characteristics, it fully meets the MIPI D-PHY version 2.0 specification. The proposed auto-skew calibration is implemented with a small area and low power compared to other previous reports because it uses the results of the 1-to-8 deserializer for phase detection. The skew calibration reported in a previous study [8] was performed with fine resolution using an analog control method. However, the analog skew calibration method increased its power consumption and area. The proposed auto-skew calibration provides a wider skew-calibration range and more precise resolution, while using low power and small area similar to the results of the published report [9] for application to die-to-die interfaces using through-silicon vias (TSVs). The skew-calibration range of 475 ps implemented in the proposed auto-skew calibration is also suitable for a data rate of 2.5 Gbps/lane, where the skew calibration of the MIPI D-PHY version 2.0 specification should be applied. The 90° phase shifter in the prior literature [11] can finely adjust the skew of the source synchronous clock, but increases the hardware. The proposed auto-skew calibration minimizes the hardware for skew calibration of the source synchronous clock by setting the clock delay to half of the scanned 1-UI data.
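The idea of placing the clock at half of a scanned 1-UI window can be illustrated with a minimal sketch. It is not the authors' circuit: the function name `center_of_passing_window`, the pass/fail list, and the choice of the longest passing run are assumptions made for illustration. Only the 7.7 ps delay step and the 5 Gbps/lane data rate are taken from the measured results.

```python
from typing import Sequence


def center_of_passing_window(passed: Sequence[bool]) -> int:
    """Return the delay code in the middle of the longest run of passing codes.

    passed[i] is True when delay code i sampled the known pattern correctly
    (a hypothetical stand-in for the multi-bit deserializer comparison).
    """
    best_start, best_len = 0, 0
    run_start = None
    # A trailing False acts as a sentinel that closes a run reaching the end.
    for i, ok in enumerate(list(passed) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_len == 0:
        raise ValueError("no passing delay code found in the scan")
    return best_start + (best_len - 1) // 2


STEP_PS = 7.7            # skew-control resolution reported for this work
UI_PS = 1000.0 / 5.0     # one unit interval at 5 Gbps/lane, in ps

# Hypothetical scan result: delay codes 10..35 sample the pattern correctly.
scan = [10 <= code <= 35 for code in range(62)]
clock_code = center_of_passing_window(scan)
window_ps = sum(scan) * STEP_PS   # width of the passing window, about one UI

print(clock_code, round(window_ps, 1), UI_PS)
```

In this made-up scan the passing window spans 26 codes, which at 7.7 ps per code is close to one UI at 5 Gbps/lane, and the selected code sits in the middle of that window.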
TABLE II
PERFORMANCE COMPARISON OF RECEIVER AND SKEW CALIBRATION FOR HIGH-SPEED INTERFACE

| Reference | Lee et al. [4] | Zheng et al. [8] | Ahn et al. [9] | Moon et al. [11] | This work |
|---|---|---|---|---|---|
| Application | MIPI D-PHY version 1.2 | Parallel interface | DRAM die-to-die interface | DRAM interface | MIPI D-PHY version 2.0 |
| Signaling type | Differential | Differential | Single-ended | Single-ended | Differential |
| Technology | 0.11 µm CMOS | 0.13 µm CMOS | 65 nm CMOS | 20 nm CMOS | 0.11 µm CMOS |
| Supply | 1.2 V | 1.2 V and 1.8 V | 1.2 V | 1.0 V | 1.2 V |
| Data rate | 3 Gbps/lane | 5 Gbps/lane | 3 Gbps/pin | 3.2 Gbps/pin | 5 Gbps/lane |
| Peak-to-peak time jitter of receiver [ps] | 35 @ 3 Gbps (FR-4 2 inch) | 68.5 @ 5 Gbps | 50 @ 3 Gbps (TSV) | 58.89 @ 3.2 Gbps (FR-4 - PLL jitter) | 50 @ 5 Gbps (FR-4 10 inch) |
| Skew calibration type (PD: phase detection) | data & clock delay control (external control) | DLL based data delay control (PD: 1 bit) | DLL based data delay control (PD: 1 bit) | DLL based clock delay control (PD: 1 bit) | data & clock delay control (PD: 8 bit or 32 bit) |
| Skew control range | 361.8 ps (1 UI) | 200 ps (1 UI) | 300 ps | - | 475 ps (2 UI) |
| Skew control resolution | 11.7 ps | fine (analog control) | 12 ps | 5 ps (estimated) | 7.7 ps |
| Calibration for source synchronous clock | clock delay external control | N/A | N/A | self calibration (90° phase shift) | self calibration (1-UI scan) |
| Area of total RX [mm²] (area/lane [mm²/lane]) | 0.285 (0.065) | 0.038 (0.019) | 1.28 (0.010) | 18 (0.818) | 0.319 (0.069) |
| Area of skew cal. [mm²] (area/lane [mm²/lane]) | - | 0.027 (0.013) | 0.64 (about 0.005) | 1.206 (about 0.055) | 0.026 (0.005) |
| Power of total RX (mW/Gbps/lane) | 7.2 | 2.25 | 2.36 | - | 5.6 |
| Power of skew cal. (mW/Gbps/lane) | - | 0.45 | 0.171 | - | 0.16 |
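As a rough cross-check of the figures reported for this work, the short calculation below relates the 475 ps skew-calibration range and 7.7 ps resolution to the unit interval, and compares the raw camera data rate with the 20 Gbps link. It is only a sketch under stated assumptions: "24 megapixels" is read as 24 × 10^6 pixel periods per frame (blanking included, as stated in the text), CSI-2 packet overhead is ignored, and the helper name `unit_interval_ps` is introduced here for illustration.

```python
def unit_interval_ps(rate_gbps: float) -> float:
    """Unit interval in picoseconds for a given per-lane data rate in Gbps."""
    return 1000.0 / rate_gbps


SKEW_RANGE_PS = 475.0    # skew-control range reported for this work
SKEW_STEP_PS = 7.7       # skew-control resolution reported for this work
LANES = 4                # data lanes
RATE_GBPS = 5.0          # per-lane data rate

ui_at_5g = unit_interval_ps(5.0)      # 200 ps
ui_at_2g5 = unit_interval_ps(2.5)     # 400 ps

range_ui_at_5g = SKEW_RANGE_PS / ui_at_5g      # a little over 2 UI
range_ui_at_2g5 = SKEW_RANGE_PS / ui_at_2g5    # a little over 1 UI
delay_steps = SKEW_RANGE_PS / SKEW_STEP_PS     # roughly 62 delay steps

link_gbps = LANES * RATE_GBPS                  # aggregate link bandwidth

# Assumed raw camera rate: pixel periods per frame x bits per pixel x frames/s.
camera_gbps = 24e6 * 10 * 60 / 1e9
camera_gbps_per_lane = camera_gbps / LANES

print(f"range: {range_ui_at_5g:.3f} UI @ 5 Gbps, {range_ui_at_2g5:.4f} UI @ 2.5 Gbps")
print(f"delay steps: {delay_steps:.1f}")
print(f"camera: {camera_gbps:.1f} Gbps of {link_gbps:.0f} Gbps, "
      f"{camera_gbps_per_lane:.1f} Gbps/lane")
```

Under these assumptions the calibration range covers more than one UI even at 2.5 Gbps/lane, and the camera stream needs about 14.4 Gbps, which fits within the 20 Gbps link but exceeds 2.5 Gbps per lane, the rate above which the specification requires skew calibration.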
IV. CONCLUSION

The 20-Gbps receiver bridge chip supporting the MIPI D-PHY version 2.0 specification and including a high-speed receiver and an auto-skew calibration circuit was implemented in a 0.11 μm CMOS process with a 1.2 V supply. It was used in the FPGA-based frame grabber to evaluate high-performance mobile camera modules used in mass-market electronics. The proposed auto-skew calibration compensates for the time skew of the data and clock through the skew-calibration mode according to the MIPI D-PHY version 2.0 specification. Additionally, the proposed high-speed receiver with CTLE compensates for insertion loss by ensuring a bandwidth above 2.5 GHz. Furthermore, the demonstration of the frame grabber using the proposed receiver bridge chip with a bandwidth of 20 Gbps revealed that the serial-to-deserialization interface between the MIPI CSI-2 of the camera module and the FPGA satisfied all protocols of the MIPI D-PHY.

ACKNOWLEDGMENT

The authors would like to thank DAQ SYSTEM Co. LTD., Korea, for helping demonstrate the FPGA-based frame grabber. The CAD tools were provided by IC Design Education Center, Korea.

REFERENCES

[1] MIPI Alliance Specification for D-PHY, version 2.0, Nov. 2015.
[2] E.-J. Kim, H.-Y. Park, C. K. Ahn, S.-I. Lim, and S. Kim, “Unified dual mode physical layer for mobile CMOS image sensor interface,” IEEE Trans. Consum. Electron., vol. 56, no. 3, pp. 1196–1203, Aug. 2010.
[3] G. Balamurugan, J. Kennedy, G. Banerjee, J. E. Jaussi, M. Mansuri, F. O’Mahony, B. Casper, and R. Mooney, “A scalable 5–15 Gbps, 14–75 mW low-power I/O transceiver in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 43, no. 4, pp. 1010–1019, Apr. 2008.
[4] P.-H. Lee, H.-Y. Lee, Y.-W. Kim, H.-Y. Hong, and Y.-C. Jang, “A 10-Gbps receiver bridge chip with deserializer for FPGA-based frame grabber supporting MIPI CSI-2,” IEEE Trans. Consum. Electron., vol. 63, no. 3, pp. 209–215, Aug. 2017.
[5] M. Bucher, R. T. Kollipara, B. Su, L. Gopalakrishnan, K. Prabhu, P. K. Venkatesan, K. Kaviani, B. Daly, B. W. F. Stonecypher, W. Dettloff, T. Stone, F. Heaton, Y. Lu, C. Madden, S. Bangalore, J. C. Eble, N. M. Nguten, and L. Luo, “A 6.4-Gb/s Near-Ground Single-Ended Transceiver for Dual-Rank DIMM Memory Interface Systems,” IEEE J. Solid-State Circuits, vol. 49, no. 1, pp. 127–139, Jan. 2014.
[6] E.-J. Kim, H.-Y. Park, C. K. Ahn, S.-I. Lim, and S. Kim, “Adaptive Skew Control of Data-Strobe Encoding for Mobile Display Serial Transceiver,” IEEE Trans. Consum. Electron., vol. 57, no. 1, pp. 14–18, Feb. 2011.
[7] E. Yeung and M. A. Horowitz, “A 2.4 Gb/s/pin Simultaneous Bidirectional Parallel Link with Per-Pin Skew Compensation,” IEEE J. Solid-State Circuits, vol. 35, no. 1, pp. 1619–1628, Nov. 2000.
[8] Y. Zheng, J. Liu, R. Payne, M. Morgan, and H. Lee, “A 5-Gb/s Automatic Sub-Bit Between-Pair Skew Compensator for Parallel Data Communications in 0.13-μm CMOS,” IEEE Trans. Very Large Scale Integr. (VLSI) Systems, vol. 21, no. 12, pp. 2274–2285, Dec. 2013.
[9] K. Ahn and C. Yoo, “Skew cancellation technique for >256-Gbyte/s high-bandwidth memory (HBM),” IET Electronics Letters, vol. 52, no. 13, pp. 1155–1157, Jun. 2016.
[10] J.-W. Lee, H.-J. Kim, C.-S. Jeong, J.-J. Lee, and C. Yoo, “Skew Compensation Technique for Source-Synchronous Parallel DRAM Interface,” IEEE Trans. Very Large Scale Integr. (VLSI) Systems, vol. 21, no. 11, pp. 2155–2159, Nov. 2013.
[11] J.-W. Moon, H.-S. Yoo, H. Choi, I.-W. Park, S.-Y. Kang, J.-B. Kim, H. Chung, K. Kim, D.-H. Lee, K.-J. Song, S.-H. Hyun, I. Song, Y.-S. Sohn, Y.-H. Cho, J.-H. Choi, K.-I. Park, and S.-J. Jang, “An Enhanced Built-off-Test Transceiver with Wide-range, Self-calibration Engine for 3.2 Gb/s/pin DDR4 SDRAM,” IEEE Asian Solid-State Circuits Conference, pp. 139–142, Nov. 2018.
[12] 5 Channel FPGA to MIPI D-PHY Bridge IC MC20902 datasheet, version 1.07, Aug. 2016.
[13] MIPI D-PHY Bandwidth Matrix Table User Guide, version 1.0, Jun. 2015.
[14] MIPI Alliance Standard for Camera Serial Interface 2 (CSI-2), version 1.2, Sep. 2014.
[15] B. Casper, A. Martin, J. E. Jaussi, J. Kennedy, and R. Mooney, “An 8-Gb/s Simultaneous Bidirectional Link With On-Die Waveform Capture,” IEEE J. Solid-State Circuits, vol. 38, no. 12, pp. 2111–2120, Dec. 2003.
[16] J. Lee, K. S. Kundert, and B. Razavi, “Analysis and Modeling of Bang-Bang Clock and Data Recovery Circuits,” IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1571–1580, Sep. 2004.
[17] Y. Lee and I.-C. Park, “Single-step glitch-free NAND-based digitally controlled delay lines using dual loops,” IET Electronics Letters, vol. 50, no. 13, pp. 930–932, Jun. 2014.
[18] J. Poulton, R. Palmer, A. M. Fuller, T. Greer, J. Eyles, W. J. Dally, and M. Horowitz, “A 14-mW 6.25-Gb/s Transceiver in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 12, pp. 2745–2757, Dec. 2007.

Pil-Ho Lee was born in Daegu, South Korea, in 1986. He received B.S., M.S., and Ph.D. degrees from the Department of Electronic Engineering, Kumoh National Institute of Technology, Gumi, South Korea, in 2012, 2014, and 2019, respectively. In 2019, he joined the Test & Package Center, Samsung Electronics Co., Ltd., Asan, South Korea. His current research interests include the design of high-speed interfaces such as transceiver and clock generator.

Young-Chan Jang (M’15) was born in Daegu, South Korea, in 1976. He received a B.S. degree from the Department of Electronic Engineering, Kyungpook National University, Daegu, in 1999, and M.S. and Ph.D. degrees in electronic engineering from the Pohang University of Science and Technology, Pohang, South Korea, in 2001 and 2005, respectively. From 2005 to 2009, he was a Senior Engineer with the Memory Division, Samsung Electronics, Hwasung, South Korea, focusing on high-speed interface circuit design and next-generation DRAM. In 2009, he joined the School of Electronic Engineering, Kumoh National Institute of Technology, Gumi, South Korea, as a Faculty Member, where he is currently a Professor. His current research interests include high-performance mixed-mode circuit design for very large scale integration systems such as high-performance signaling, clock generation, and analog-to-digital conversion.
