4 Modulation Schemes for Optical Wireless Communications


Learning Outcomes:
■ Why is intensity modulation compulsory in LED-based communication systems?
Which constraints need to be considered?
■ Which single-carrier intensity modulation schemes are hardware-friendly?
■ What is special about color-domain modulation?
■ Why is multi-carrier modulation attractive?
■ What are the characteristics of code-division multiplexing?
■ How is superposition modulation defined?
■ Is camera-based communication possible with an ordinary smartphone?

4.1 Intensity Modulation and Direct Detection (IM/DD)
In the electrical domain, the task of a digital modulator is to convert a data bit stream u
into an analog waveform s(t ), which is suitable for data transmission. In the case of opti-
cal data transmission, the modulator (MOD) is followed by an analog driver circuit, which
feeds the light source (LS). The light source converts the electrical signal into photons. At
the receiver side, the photons are converted back to the electrical domain by a photodetec-
tor (PD). Usually, the photocurrent is very weak, and therefore it is amplified and filtered
by a transimpedance amplifier (TIA). The TIA outputs the received signal $r(t) = v_{\mathrm{OUT}}(t)$.
Finally, the data symbols are recovered in a demodulator (DEM). The estimated data bit
sequence is denoted as û. Fig. 4.1 shows a block diagram of the physical (PHY) layer.
Chapters 8 and 9 are devoted to optical devices and analog circuits, respectively. In this
chapter, the focus is on digital modulation schemes. Many data transmission aspects are
widely applicable in all OWC applications. Additionally, modulation schemes employed in VLC should
consider the impact of the modulated light on the human eye. Flicker mitigation, dim-
ming control, color quality (including color temperature and high color rendering index)
etc. need to be considered in the design of VLC transmitters.

u → MOD → s(t) → Driver → LS → Channel → PD → TIA → r(t) → DEM → û

Figure 4.1 Block diagram of the physical layer.

Concerning the light source, it is important to distinguish between lasers and LEDs. As ex-
plained in Chapter 8, in lasers a standing wave is generated, hence laser light is coherent
for a certain time period. With respect to modulation, this property can be exploited in a
favorable fashion: the data can be distributed on two orthogonal channels, namely the in-
phase component and the quadrature component. This doubles the capacity [Han12]. In
practice, complex-valued waveforms can be realized in conjunction with a Mach-Zehnder
modulator. The Mach-Zehnder modulator is an interferometer with an electro-optical
phase shifter in one of the two arms. It is the optical counterpart of the quadrature
modulator in RF communications.
LEDs are different in the sense that the light emitted by this type of light source is al-
ways noncoherent. As a consequence, the phase cannot be used for data transmission.
In other words, only real-valued waveforms are suitable in connection with LEDs. This
fact limits the bandwidth efficiency (in bit/symbol or bit/s/Hz). Since the phase cannot
be exploited for signaling, one is restricted to intensity modulation (IM). IM is a
baseband technique modulating the brightness of a light source. This causes an additional
constraint: the brightness is always non-negative by nature. Hence, the transmit signal
must be real-valued and non-negative. The latter fact reduces the power efficiency (in dB).
At the receiver side, the intensity fluctuations are commonly detected by a photodetector.
In the remainder, we assume that (i) the active area of the photodetector, $A_R$, is much larger
than $\lambda^2$, and (ii) we do not try to recover the phase of the light wave (even if coherent laser
light is available). This detection principle is called direct detection (DD). DD is a
noncoherent detection strategy. This leads to a third problem: noncoherent detection is less
robust against noise than coherent detection. We focus our subsequent presentation on
modulation schemes suitable for IM/DD, but also consider techniques utilizing a camera
at the receiver side.
An impressive number of different modulation schemes exists that are suitable for IM/DD
[Isl16, Ran10]. Many of these schemes belong to the class of linear modulation schemes.
Linear modulation schemes can be written in the form [Nyq28]
$$ s(t) = \sum_{k} x[k] \, g_{\mathrm{Tx}}(t - kT), \tag{4.1} $$

where $s(t)$ is the transmit signal, $k$ the time index, $x[k]$ the $k$-th data symbol, $g_{\mathrm{Tx}}(t)$ the
baseband pulse, $T$ the symbol period, and $1/T$ the symbol rate. The sequence of data
symbols $x[k]$ is a function of the data bit sequence. If the duration of the pulse $g_{\mathrm{Tx}}(t)$ does
not exceed $T$, the parameter $T$ is called symbol duration. Since we restrict ourselves to
intensity modulation, $s(t)$, $x[k]$, and $g_{\mathrm{Tx}}(t)$ are real-valued. In the case of laser
transmission, however, these three terms may be complex-valued. The number of different data
symbols is called the cardinality of the symbol alphabet and is denoted as $Q$. Every
distinct data symbol can be addressed by $\log_2 Q$ data bits. In other words, $\log_2 Q$ bits can be
transmitted per data symbol. Therefore, high-order modulation schemes (Q > 2) are more
bandwidth efficient than binary modulation schemes (Q = 2). In the uncoded case, one
data bit corresponds to one info bit, whereas in the presence of channel coding one data
bit corresponds to one code bit. The allocation of the data symbols in the one-dimensional
(or two-dimensional) space is called symbol constellation.
Linear modulation schemes are completely characterized by mapping, labeling, and pulse
shaping:

■ Mapping refers to the allocation of the Q data symbols in the symbol constellation.
■ The assignment of $\log_2 Q$ data bits $u[k] := [u_1[k], u_2[k], \dots, u_{\log_2 Q}[k]]$ at time index k
onto the Q data symbols x[k] is called labeling.
■ The pulse shaping, g Tx (t ), finally converts the discrete-time data symbols into an
analog waveform.
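
The three steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the Gray labeling table, the unipolar four-level alphabet, and the function names are hypothetical choices made for the example, and the pulse is the rectangular pulse of duration T.

```python
import math

# Hypothetical Gray labeling for a unipolar 4-level alphabet (Q = 4):
# bit pair -> level index. The spacing is chosen so that E{x^2[k]} = 1.
GRAY_LABELING = {(0, 0): 0, (1, 0): 1, (1, 1): 2, (0, 1): 3}
SPACING = 1.0 / math.sqrt(sum(i * i for i in range(4)) / 4)


def bits_to_symbols(bits):
    """Labeling and mapping: group log2(Q) = 2 bits and look up the symbol."""
    pairs = zip(bits[0::2], bits[1::2])
    return [GRAY_LABELING[pair] * SPACING for pair in pairs]


def pulse_shape(symbols, samples_per_symbol=8):
    """Pulse shaping with a unit-amplitude rectangular pulse of duration T.

    Because the pulse does not exceed T, the sum in (4.1) has exactly one
    non-zero term at any time instant, so each symbol is simply held.
    """
    waveform = []
    for x in symbols:
        waveform.extend([x] * samples_per_symbol)
    return waveform


symbols = bits_to_symbols([0, 0, 1, 0, 1, 1, 0, 1])
waveform = pulse_shape(symbols)
```

The resulting waveform is real-valued and non-negative, as required for intensity modulation.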

In the remainder, special cases of practical interest will be discussed concerning mapping,
labeling, and pulse shaping, but also modulation schemes which cannot be described by
(4.1), known as nonlinear modulation schemes. In order to enable a fair comparison be-
tween different modulation schemes, power normalization is necessary. Most common is
an average power normalization:

$$ E\{x^2[k]\} = 1 \quad \text{and} \quad \frac{1}{T} \int g_{\mathrm{Tx}}^2(t) \, \mathrm{d}t = 1. \tag{4.2} $$

Sometimes, different modulation schemes are compared under a peak power constraint
instead, because light sources are peak power constrained.
If not explicitly mentioned otherwise, a rectangular pulse of duration T is assumed:

$$ g_{\mathrm{Tx}}(t) = \begin{cases} 1 & \text{for } 0 \le t \le T \\ 0 & \text{else.} \end{cases} \tag{4.3} $$

Transmitting a square-wave signal requires a large bandwidth and therefore is not spec-
trally efficient. In order to overcome this problem, one may use a raised-cosine pulse

$$ g_{\mathrm{Tx}}^{\mathrm{RC}}(t) = \underbrace{\frac{\sin(\pi t/T)}{\pi t/T}}_{\mathrm{sinc}(\pi t/T)} \cdot \frac{\cos(r \pi t/T)}{1 - (2rt/T)^2} \tag{4.4} $$

or a root-raised-cosine pulse

$$ g_{\mathrm{Tx}}^{\mathrm{RRC}}(t) = \frac{(4rt/T) \cos(\pi(1+r)t/T) + \sin(\pi(1-r)t/T)}{(\pi t/T)\,[1 - (4rt/T)^2]} \tag{4.5} $$

instead, where r is the so-called roll-off factor, which determines the excess bandwidth
compared to a sinc pulse (raised-cosine pulse with r = 0). The double-sided bandwidth
is B = (1 + r)/T for both types of pulses. Raised-cosine pulses (with wideband filtering at
the receiver side) and root-raised-cosine pulses (with root-raised-cosine filtering at the
receiver side since $g_{\mathrm{Tx}}^{\mathrm{RRC}}(t) * g_{\mathrm{Tx}}^{\mathrm{RRC}}(t) = g_{\mathrm{Tx}}^{\mathrm{RC}}(t)$, where the asterisk denotes the convolution
operation) are popular in non-dispersive channels as they do not cause any intersymbol
interference (ISI), just like rectangular pulses. However, this family of pulses creates another
problem: negative amplitudes occur in the time domain, which need to be avoided in the
IM/DD scenario of interest. Correspondingly, a DC bias must be added to the modulated
waveform. Unfortunately, this DC bias reduces the signal-to-noise ratio (SNR). The smaller
the value of r, the larger the necessary bias. In the worst-case scenario of sinc pulses, the
SNR is reduced by 0.83 dB [Nos16]. In dispersive channels, Gaussian-like pulses are a good
alternative because they are non-negative.
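
The two properties of the raised-cosine pulse discussed above (zero crossings at multiples of T, and negative excursions) can be checked numerically. The following is a minimal sketch of (4.4); the function name, the tolerances, and the sampling grid are assumptions made for illustration, and the removable singularities at t = 0 and t = ±T/(2r) are replaced by their limits.

```python
import math


def raised_cosine(t, T=1.0, r=0.5):
    """Raised-cosine pulse according to (4.4)."""
    x = t / T
    if abs(x) < 1e-12:
        return 1.0  # limit for t -> 0
    sinc = math.sin(math.pi * x) / (math.pi * x)
    denominator = 1.0 - (2.0 * r * x) ** 2
    if abs(denominator) < 1e-9:
        # limit of cos(r*pi*x) / (1 - (2*r*x)^2) at |x| = 1/(2r) is pi/4
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(r * math.pi * x) / denominator


T = 1.0
grid = [n * 0.01 for n in range(0, 801)]  # 0 <= t <= 8T
samples = [raised_cosine(t, T, r=0.5) for t in grid]
most_negative = min(samples)
```

The pulse vanishes at t = T, 2T, 3T, ..., so neighboring symbols do not interfere at the sampling instants, while `most_negative` is below zero, which is why a DC bias is needed for IM/DD.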
Recall that r (t ) denotes the TIA output signal. Throughout this chapter, an additive white
Gaussian noise channel is assumed in numerical results. At the receiver side, an ideal
matched filter (MF) is assumed, i.e.,

$$ g_{\mathrm{Rx}}(t) = g_{\mathrm{Tx}}(T - t). \tag{4.6} $$

After matched filtering

$$ y(t) = \int r(\tau) \, g_{\mathrm{Rx}}(t - \tau) \, \mathrm{d}\tau, \tag{4.7} $$

symbol-rate sampling is performed, i.e., one sample is taken per symbol period T. The
matched filter receiver maximizes the SNR at the output of the sampler. The k-th MF output
sample is denoted as y[k] according to the equivalent discrete-time channel model
introduced in the previous chapter. Lastly, data detection is conducted. A so-called
maximum-likelihood (ML) receiver estimates the most likely data symbol

$$ \hat{x}[k] = \arg\min_{\tilde{x}[k]} \left| y[k] - \sum_{l=0}^{L} h_l \, \tilde{x}[k-l] \right|^2, \tag{4.8} $$

where $\tilde{x}[k]$ are the Q hypotheses of the transmitted data symbol x[k]. The relationship
between the $\log_2 Q$ estimated data bits, $\hat{u}[k]$, and $\hat{x}[k]$ is ambiguity-free by undoing the
labeling.
In the numerical results throughout this chapter, unipolar transmission via an additive
white Gaussian noise (AWGN) channel is assumed: $y[k] = x[k] + n[k]$. Given $E\{x^2[k]\} = 1$,
the variance per noise sample is $\sigma_n^2 = 1/(2E_s/N_0)$.
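
For the AWGN case (L = 0, h_0 = 1), the ML rule (4.8) reduces to picking the hypothesis closest to y[k]. The following sketch illustrates this together with the noise variance stated above; the function names, the seed, and the number of simulated symbols are illustrative assumptions.

```python
import math
import random


def noise_std(es_n0_db):
    """Standard deviation per noise sample, sigma_n^2 = 1 / (2 Es/N0)."""
    es_n0 = 10 ** (es_n0_db / 10)
    return math.sqrt(1.0 / (2.0 * es_n0))


def ml_detect(y, alphabet):
    """Choose the hypothesis with the smallest squared Euclidean distance."""
    return min(alphabet, key=lambda x: (y - x) ** 2)


def symbol_error_rate(alphabet, es_n0_db, num_symbols=20000, seed=1):
    """Monte Carlo estimate for y[k] = x[k] + n[k] with equiprobable symbols."""
    rng = random.Random(seed)
    sigma = noise_std(es_n0_db)
    errors = 0
    for _ in range(num_symbols):
        x = rng.choice(alphabet)
        y = x + rng.gauss(0.0, sigma)
        if ml_detect(y, alphabet) != x:
            errors += 1
    return errors / num_symbols


ook_alphabet = [0.0, math.sqrt(2.0)]
ser_low_snr = symbol_error_rate(ook_alphabet, 0.0)
ser_high_snr = symbol_error_rate(ook_alphabet, 20.0)
```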

4.2 Constraints and Performance Criteria

When designing a modulation scheme suitable for optical wireless communications, a
variety of constraints need to be taken into account. The most basic one is that in
intensity modulation, the signal waveforms must be non-negative. In the case of LEDs,
signal waveforms additionally must be real-valued, whereas in laser-diode-based transmitters
complex-valued modulation schemes are applicable.
Many constraints are application-dependent. Li-Fi systems, for example, should satisfy
illumination requirements with first priority, whereas data transmission aspects are of sec-
ondary priority. Color quality and safety issues such as flicker avoidance are very impor-
tant in this context, perhaps with dimming support. In other use cases, like underwater
communications and optical backbone network systems, power efficiency is typically im-
portant. Visual aspects need not be considered at all in pure communication systems.
Although the optical spectrum is virtually unlimited compared to RF communications,
bandwidth efficiency (also called spectral efficiency) is of increasing importance in many
applications. This is particularly true when data rates beyond 1 Gbps are targeted.
Exploiting the color domain and the spatial domain simultaneously is the key to gigabit
services. Furthermore, multiuser communication aspects are becoming increasingly
important. This does not just affect the MAC layer, but also the PHY layer and hence the
modulation scheme as well.
Computational complexity is an important subject, particularly for mass-market use cases.
Numerical complexity is often dominant at the receiver side, but it is triggered by the choice
of the modulation scheme. Besides computational complexity, hardware complexity is also
of great concern. Hardware-friendly modulation schemes do not need a digital-to-analog
converter (DAC) and allow the use of highly efficient drivers.
Well-designed modulation schemes are tailored to the limitations of the light source(s).
The main limitations of LEDs are limited peak power (called peak power constraint),
limited bandwidth, and an inherently nonlinear input/output characteristic (in terms of
voltage-to-current (V2I) conversion and to some extent also current-to-optical power
(I2P) conversion). Two-level modulation schemes are robust against nonlinear effects. A
large modulation swing yields a high signal-to-noise ratio. Conversely, modulation schemes
with continuous-valued waveforms are frequently offset by a bias. This bias is helpful for
illumination purposes. A small modulation swing enhances switching speed, however at the
cost of signal-to-noise ratio.
LEDs frequently lack accurate characterizations because of their mass production. Some
modulation schemes are robust with respect to a wide range of different characteristics,
others are more sensitive.
Finally, the impact of the optical modulation scheme on localization aspects is scarcely ex-
plored. In the remainder, bit error rate and spectral efficiency are chosen to be the primary
performance criteria under investigation.

4.3 Single-Carrier Modulation (SCM)

The term “single-carrier” modulation is borrowed from carrier-modulated radio
systems. Since IM/DD transmission is performed at baseband, “single-carrier” modulation
is a misleading nomenclature, at least in a strict sense. Still, “single-carrier” modulation
is a popular terminology in light communication, in order to distinguish these techniques
from “multi-carrier” baseband techniques. “Carrierless modulation” is more appropriate,
but reserved for a specific technique introduced later. A possible workaround is to interpret
the peak wavelength as the modulated carrier.
Generally speaking, most SCM techniques are fairly easy to implement. Many SCM formats
are two-level modulation techniques, entirely avoiding problems associated with
nonlinearities. However, due to the bandwidth limitation of the light source(s) and in the
presence of multipath, the bit error rate performance deteriorates with increasing data rate unless
equalization is performed. Conceptually, equalization is simpler for multi-carrier
modulation techniques. Most SCM schemes discussed subsequently can also be used to
modulate the subcarriers of multi-carrier modulation techniques. The bandwidth efficiency
of single-carrier modulation schemes is $\log_2 Q$ bit/symbol. Together with a specific pulse
shape, the bandwidth efficiency can also be expressed in bit/s/Hz.

4.3.1 On-Off Keying (OOK)

Non-return-to-zero on-off keying (NRZ-OOK) is perhaps the most intuitive and simplest
modulation scheme suitable for light communications. Depending on the data bits, the LS
is either switched “on” for one symbol duration T , or “off” [Gag95]. Consequently, NRZ-
OOK has a binary symbol alphabet (Q = 2) with data symbols
$$ x[k] \in \{0, +\sqrt{2}\}. \tag{4.9} $$

The factor $\sqrt{2}$ is valid for uniformly distributed random data symbols. The NRZ-OOK
symbol constellation is shown in Fig. 4.2. (For those readers familiar with binary phase shift
keying (BPSK), the following equivalence may be helpful: NRZ-OOK can be interpreted as
BPSK with DC offset. The DC offset consumes 50 % of the transmit power, but does not
contribute to data detection. Hence, the power efficiency of NRZ-OOK is worse by 3 dB.)

[Figure: two constellation points on the real axis, labeled 0 at x = 0 and 1 at x = √2.]

Figure 4.2 NRZ-OOK symbol constellation. Data bit u[k] = 0 is mapped onto symbol 0 (light
source “off”), whereas data bit u[k] = 1 is mapped onto symbol √2 (light source “on”). The dotted
line marks the decision threshold.

Data detection can be performed by threshold detection. The threshold needs to be
adapted in the case of a time-varying channel. In Fig. 4.3, the bit error rate (BER) of
NRZ-OOK is plotted versus the average signal-to-noise ratio. The BER is equal to

$$ P_b = \frac{1}{2} \, \mathrm{erfc}\!\left( \sqrt{\frac{E_s}{2N_0}} \right), \tag{4.10} $$

where the complementary error function is defined as $\mathrm{erfc}(x) := \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-y^2} \, \mathrm{d}y$. Note that
$E_s/N_0 = E_b/N_0$ for binary modulation, where $E_s/N_0$ is the SNR per symbol and $E_b/N_0$ is
the SNR per bit, respectively, both in the electrical domain.
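
As a worked check of (4.10), the expression can be evaluated directly and compared with bipolar BPSK, whose well-known AWGN bit error rate is (1/2) erfc(sqrt(Es/N0)). The function names are illustrative; the comparison merely reproduces the 3 dB power penalty mentioned in the BPSK remark above.

```python
import math


def ook_ber(es_n0_db):
    """BER of NRZ-OOK according to (4.10)."""
    es_n0 = 10 ** (es_n0_db / 10)
    return 0.5 * math.erfc(math.sqrt(es_n0 / 2.0))


def bpsk_ber(es_n0_db):
    """BER of bipolar BPSK on the AWGN channel, shown for comparison only."""
    es_n0 = 10 ** (es_n0_db / 10)
    return 0.5 * math.erfc(math.sqrt(es_n0))


three_db = 10 * math.log10(2.0)
# NRZ-OOK needs about 3 dB more SNR than BPSK to reach the same BER.
penalty_check = [(ook_ber(snr + three_db), bpsk_ber(snr)) for snr in (0, 4, 8)]
```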
In return-to-zero on-off keying (RZ-OOK) the “on” pulse has a duration shorter than T,
i.e., it “returns to zero” within every symbol duration. RZ-OOK has a higher peak-to-average
power ratio and an increased bandwidth requirement.

Figure 4.3 BER of NRZ-OOK vs. SNR (horizontal axis: Es/N0 in dB, 0 to 14 dB).

4.3.2 Amplitude Shift Keying (ASK), PAM and QAM

In Q-ary unipolar amplitude shift keying (Q-ASK), information is mapped onto Q equally-
spaced amplitude levels

$$ x[k] \in \{\beta, \ \alpha + \beta, \ 2\alpha + \beta, \ 3\alpha + \beta, \ \dots, \ (Q-1)\alpha + \beta\}, \tag{4.11} $$

where α is the Euclidean distance (“spacing”) between adjacent symbols and β is a non-
negative bias term. The reason for the bias term is that in high-speed optical applications
it may be favorable not to switch-off the LS entirely. Q-ASK is a generalization of NRZ-OOK.

[Figure: two 4-ASK constellations on the real axis with Gray labels 00, 10, 11, 01. Left: symbols
at 0, √(2/7), 2√(2/7), 3√(2/7). Right: the first symbol is located at 1/√21.]

Figure 4.4 Unipolar 4-ASK symbol constellation with equal spacing. (Left-hand side: zero bias,
right-hand side: β = α/2 = 1/√21.) The decision thresholds are marked by dotted lines.

In the special case of β = 0, the equidistant spacing can be calculated as

$$ \alpha = \sqrt{\frac{Q}{1 + 4 + 9 + \dots + (Q-1)^2}}. \tag{4.12} $$

For 2-ASK, 4-ASK, 8-ASK, and 16-ASK, the spacing is $\sqrt{2}$, $\sqrt{2/7}$, $\sqrt{2/35}$, and $\sqrt{2/155}$,
respectively. In the special case of β = α/2, the spacing can be written as

$$ \alpha = \sqrt{\frac{Q}{Q/4 + (1 + 2 + 3 + \dots + Q - 1) + (1 + 4 + 9 + \dots + (Q-1)^2)}}. \tag{4.13} $$

For 2-ASK, 4-ASK, 8-ASK, and 16-ASK, the spacing is $2/\sqrt{5}$, $2/\sqrt{21}$, $2/\sqrt{85}$, and $2/\sqrt{341}$,
respectively. Fig. 4.4 illustrates the 4-ASK symbol constellation for the case of Gray labeling.
In Gray labeling, adjacent data symbols differ only in a single bit. In the uncoded case,
Gray labeling has a positive effect on the bit error performance, because the detection of
the neighboring symbol causes only a single info bit to be wrong. Upon comparing the last
two formulas we recognize that increasing the bias decreases the Euclidean distance. On
the one hand, this degrades the bit error performance, because the minimum Euclidean
distance determines the asymptotic error rate. On the other hand, a smaller modulation
swing is favorable in high-speed applications.
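
Both spacing formulas follow from the normalization E{x²[k]} = 1 applied to the levels (i + β/α)·α, i = 0, ..., Q − 1. The sketch below computes the spacing from this definition; the function name and the parameter `bias_ratio` (standing for β/α) are illustrative assumptions.

```python
import math


def ask_spacing(Q, bias_ratio=0.0):
    """Spacing alpha of unipolar Q-ASK such that E{x^2[k]} = 1.

    bias_ratio is beta/alpha: 0.0 gives the zero-bias case, 0.5 gives beta = alpha/2.
    """
    mean_square = sum((i + bias_ratio) ** 2 for i in range(Q)) / Q
    return 1.0 / math.sqrt(mean_square)


zero_bias = {Q: ask_spacing(Q) for Q in (2, 4, 8, 16)}
half_bias = {Q: ask_spacing(Q, 0.5) for Q in (2, 4, 8, 16)}
```

For every Q, the entry in `half_bias` is smaller than the one in `zero_bias`, which is the decrease of the Euclidean distance with increasing bias noted in the text.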

Figure 4.5 BER of ASK vs. SNR per bit (left; horizontal axis: Eb/N0 in dB) and vs. SNR per
symbol (right; horizontal axis: Es/N0 in dB). The vertical axes range from 10^0 down to 10^-6.

Given β = 0, the BER of 2-ASK, 4-ASK, and 8-ASK is plotted in Fig. 4.5 as a function of
the SNR per bit and per Q-ary symbol, respectively, where E s /N0 = (log2 Q) E b /N0 . At the
receiver side, a maximum-likelihood (ML) data detector is implemented. ML detection
implies that, given Q hypotheses, the data symbol is chosen which has the smallest squared
Euclidean distance with respect to the observation of the receiver. The BER performances
of 2-ASK and NRZ-OOK are identical. For low BERs, the loss of Q-ary ASK in terms of $E_s/N_0$
is $10 \log_{10}\big((\alpha_{2\text{-ASK}}/\alpha_{Q\text{-ASK}})^2\big)$ dB. For example, 4-ASK has an asymptotic gap of 8.45 dB,
8-ASK is worse by 15.4 dB, and 16-ASK suffers from a loss of 21.9 dB.
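
The quoted asymptotic gaps can be reproduced from the zero-bias spacing, α² = Q / (1 + 4 + ... + (Q − 1)²). The short sketch below does this arithmetic; the function names are illustrative.

```python
import math


def unipolar_ask_spacing(Q):
    """Zero-bias spacing of unipolar Q-ASK with E{x^2[k]} = 1."""
    return math.sqrt(Q / sum(i * i for i in range(Q)))


def asymptotic_loss_db(Q):
    """Loss of Q-ASK relative to 2-ASK in terms of Es/N0 at low BER."""
    ratio = unipolar_ask_spacing(2) / unipolar_ask_spacing(Q)
    return 10 * math.log10(ratio ** 2)


losses = {Q: asymptotic_loss_db(Q) for Q in (4, 8, 16)}
```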
As defined before, Q-ASK is a unipolar modulation scheme, characterized by discrete am-
plitude levels. A generalization is Q-ary bipolar pulse amplitude modulation. In Q-PAM,
information is mapped onto Q/2 equidistant amplitude levels and the phases 0 rad and
π rad, respectively:

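$$ x[k] \in \left\{ \pm 1/\sqrt{\alpha_Q}, \ \pm 3/\sqrt{\alpha_Q}, \ \pm 5/\sqrt{\alpha_Q}, \ \dots, \ \pm (Q-1)/\sqrt{\alpha_Q} \right\}, \tag{4.14} $$

where $\alpha_Q := (Q/2)^2 + \alpha_{Q/2}$ and $\alpha_2 := 1$ (e.g., $\alpha_2 = 1$, $\alpha_4 = 5$, $\alpha_8 = 21$, $\alpha_{16} = 85$, etc.). 2-PAM
and BPSK are identical modulation schemes.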
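
The recursion for α_Q and the resulting unit average power can be verified with a few lines of code. This is a sketch assuming Q is a power of two; the function names are illustrative.

```python
import math


def pam_alpha(Q):
    """Normalization constant alpha_Q := (Q/2)^2 + alpha_{Q/2}, with alpha_2 := 1."""
    if Q == 2:
        return 1
    return (Q // 2) ** 2 + pam_alpha(Q // 2)


def pam_alphabet(Q):
    """Bipolar Q-PAM alphabet: odd integers -(Q-1), ..., Q-1 scaled by 1/sqrt(alpha_Q)."""
    scale = math.sqrt(pam_alpha(Q))
    return [level / scale for level in range(-(Q - 1), Q, 2)]


alphabet_8pam = pam_alphabet(8)
average_power = sum(x * x for x in alphabet_8pam) / len(alphabet_8pam)
```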
A further generalization is Q-ary square quadrature amplitude modulation. Q-QAM is a
two-dimensional modulation scheme. Each quadrature component employs bipolar
$\sqrt{Q}$-ary PAM:

$$ \begin{aligned} \mathrm{Re}\{x[k]\} &\in \left\{ \pm 1/\sqrt{2\alpha_{\sqrt{Q}}}, \ \pm 3/\sqrt{2\alpha_{\sqrt{Q}}}, \ \dots, \ \pm (\sqrt{Q}-1)/\sqrt{2\alpha_{\sqrt{Q}}} \right\} \\ \mathrm{Im}\{x[k]\} &\in \left\{ \pm 1/\sqrt{2\alpha_{\sqrt{Q}}}, \ \pm 3/\sqrt{2\alpha_{\sqrt{Q}}}, \ \dots, \ \pm (\sqrt{Q}-1)/\sqrt{2\alpha_{\sqrt{Q}}} \right\}, \end{aligned} \tag{4.15} $$

where $\alpha_{\sqrt{Q}}$ is chosen according to PAM. The factor of two accounts for power normalization
in two signal dimensions. Square QAM has a favorable power/bandwidth efficiency.
The reason for introducing bipolar PAM and complex-valued QAM is not obvious in
conjunction with single-carrier intensity modulation schemes. However, if we move on to
multi-band and to multi-carrier schemes, PAM and QAM indeed become useful in IM/DD
systems.

4.3.3 Pulse Width Modulation (PWM)

Pulse width modulation (PWM) is a technique which converts a real-valued unipolar
signal $s(t)$ with $0 \le s(t) \le s_{\max}$ into a sequence of rectangular pulses. The amplitude and rate
of these pulses are constant, whereas their width is proportional to the instantaneous
amplitude of $s(t)$. If $s(t) = 0$, the duty cycle is 0 %. Vice versa, if $s(t) = s_{\max}$, the duty cycle is
100 %.
There are different possibilities to implement PWM modulators. A simple method is a com-
parator, e.g. an ideal operational amplifier. The signal s(t ) is connected to the positive input
of the differential amplifier, the negative input to a sawtooth signal with linear slope, con-
stant peak amplitude s max , and constant frequency. The modulated signal is available at
the output of the comparator.
In order to modulate digital data, the principle just described can be inverted. Given a bit
tuple of length b, $2^b$ different pulse widths (i.e., $2^b$ different duty cycles) can be addressed.
In other words, digital data can be converted into a PWM signal in a hardware-friendly
fashion. Many microcontrollers include PWM modules or timers in order to generate PWM
signals.
Demodulation can be done by means of a lowpass filter. In order to recover b info bits,
it is necessary to perform averaging over exactly one symbol duration, where the symbol
duration corresponds to one period of the sawtooth signal.
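
The digital PWM principle and its demodulation by averaging can be sketched as follows. The mapping of a b-bit tuple to the duty cycle value/(2^b − 1), so that both 0 % and 100 % are used, and the choice of one sample per duty-cycle step are assumptions made for this example; the text does not prescribe a particular mapping.

```python
def pwm_modulate(bit_tuple):
    """Map a b-bit tuple onto one PWM period with 2^b - 1 samples."""
    b = len(bit_tuple)
    value = int("".join(str(bit) for bit in bit_tuple), 2)
    period = 2 ** b - 1
    return [1] * value + [0] * (period - value)


def pwm_demodulate(samples, b):
    """Average over exactly one symbol duration and undo the mapping."""
    period = 2 ** b - 1
    duty_cycle = sum(samples) / len(samples)
    value = round(duty_cycle * period)
    return tuple(int(char) for char in format(value, f"0{b}b"))


period_101 = pwm_modulate((1, 0, 1))
recovered = pwm_demodulate(period_101, 3)
```

Averaging over more or fewer samples than one period would mix neighboring symbols, which is why the averaging window must match the symbol duration.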
PWM is rarely used for optical data transmission, but it is frequently used for dimming
support in illumination applications. Controlling the duty cycle is more hardware-friendly
than adjusting the forward current of LEDs and laser diodes directly.

4.3.4 Pulse Position Modulation (PPM)

In Q-ary pulse position modulation (Q-PPM), log2 Q data bits are encoded by sending a
single pulse in one out of Q time slots per symbol duration T [Gag95]. Consequently, the
active time slot is data-dependent. The corresponding baseband pulse g Tx (t ) is called chip.
Usually, all time slots have the same spacing Tc = T /Q. PPM is a nonlinear modulation
scheme, i.e., it cannot be represented by (4.1). A Q-ary PPM transmit signal has the form
$$ s(t) = \sum_{k} g_{\mathrm{Tx}}\big( t - kT - u[k] \, T_c \big), \tag{4.16} $$

where u[k] ∈ {0, 1, . . . ,Q −1}. If u[k] = 0, the baseband pulse is transmitted without delay, for
u[k] = 1 it is delayed by Tc , etc. Together with rectangular pulse shaping the amplitude of
each chip is $\sqrt{Q}$. Consequently, high-order Q-PPM has a large crest factor. For rectangular
pulses of duration Tc = T /Q, the Q possible waveforms are orthogonal. Fig. 4.6 depicts a
4-PPM waveform. The required bandwidth is proportional to Q. This is perhaps the main
drawback of Q-ary PPM, besides a possible peak power constraint.
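
A Q-PPM waveform according to (4.16) can be generated as sketched below. The chip amplitude √Q is an assumption that follows from the average power normalization (4.2) for a rectangular chip of duration T/Q; the function names and the number of samples per chip are illustrative.

```python
import math


def ppm_waveform(u_sequence, Q, samples_per_chip=4):
    """Rectangular Q-PPM: one active chip per symbol, delayed by u[k] * Tc."""
    amplitude = math.sqrt(Q)  # yields unit average power per symbol
    waveform = []
    for u in u_sequence:
        symbol = [0.0] * (Q * samples_per_chip)
        start = u * samples_per_chip
        symbol[start:start + samples_per_chip] = [amplitude] * samples_per_chip
        waveform.extend(symbol)
    return waveform


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


Q = 4
candidates = [ppm_waveform([u], Q) for u in range(Q)]
crest_factor = max(candidates[0]) / math.sqrt(
    inner_product(candidates[0], candidates[0]) / len(candidates[0])
)
```

The Q candidate waveforms are mutually orthogonal, and the crest factor grows with √Q, which illustrates the peak power issue mentioned above.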

[Figure: 4-PPM waveform over 0 ≤ t ≤ 4T for the bit pairs 00, 01, 10, 11.]

Figure 4.6 Example of a 4-PPM transmit signal employing Tc = T/4. In orthogonal signaling, the
labeling does not have an impact on the error rate.

2-PPM and NRZ-OOK have identical bit error rate performance, but 2-PPM occupies twice
as much bandwidth given the same type of baseband pulse. If the chips are orthogonal, the
power efficiency (in terms of E b /N0 ) of Q-ary PPM improves with increasing cardinality Q.
The larger Q, the larger is the amplitude $\sqrt{Q}$. Therefore, with increasing cardinality Q-PPM
becomes more immune against noise, cf. Fig. 4.7. This effect is further improved in practice,
because the peak pulsing current of LEDs and laser diodes is typically larger than the DC
forward current. For equal energy orthogonal signals, according to [Pro08] the BER on the
AWGN channel is
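$$ P_b = \frac{Q/2}{Q-1} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \Big[ 1 - \big( 1 - \mathrm{Q}(y) \big)^{Q-1} \Big] \cdot \exp\!\left( -\frac{1}{2} \left( y - \sqrt{\frac{2E_s}{N_0}} \right)^{2} \right) \mathrm{d}y, \tag{4.17} $$

where the Q-function is related to the complementary error function as $\mathrm{Q}(x) = \frac{1}{2} \, \mathrm{erfc}\big( x/\sqrt{2} \big)$
and $E_s/N_0 = (\log_2 Q) \, E_b/N_0$. The energies per symbol, $E_s$, and per chip, $E_c$, are the same.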
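
Expression (4.17) has no closed form for Q > 2, but it is easy to evaluate numerically. The following sketch uses a simple trapezoidal rule; the integration span of ±10 around the mean, the step count, and the function names are assumptions chosen for illustration.

```python
import math


def q_function(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ppm_ber(Q, es_n0_db, steps=4000, span=10.0):
    """Numerical evaluation of (4.17) for equal-energy orthogonal Q-PPM."""
    es_n0 = 10 ** (es_n0_db / 10)
    mean = math.sqrt(2.0 * es_n0)
    lower = mean - span
    h = 2.0 * span / steps
    total = 0.0
    for i in range(steps + 1):
        y = lower + i * h
        integrand = (1.0 - (1.0 - q_function(y)) ** (Q - 1)) * math.exp(-0.5 * (y - mean) ** 2)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * integrand
    symbol_error = total * h / math.sqrt(2.0 * math.pi)
    return (Q / 2) / (Q - 1) * symbol_error


eb_n0_db = 6.0
ber_vs_q = {Q: ppm_ber(Q, eb_n0_db + 10 * math.log10(math.log2(Q))) for Q in (2, 4, 16)}
ook_reference = 0.5 * math.erfc(math.sqrt(10 ** (eb_n0_db / 10) / 2.0))
```

For Q = 2 the result coincides with the NRZ-OOK expression (4.10), and at a fixed Eb/N0 of 6 dB the BER decreases as Q grows, in line with the statements above.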

Figure 4.7 BER of PPM vs. SNR per bit (left; horizontal axis: Eb/N0 in dB) and vs. SNR per
symbol (right; horizontal axis: Es/N0 in dB), with curves up to 256-PPM. The vertical axes range
from 10^0 down to 10^-6.

If the pulses are transmitted in regular intervals, PPM causes spectral lines. These spectral
lines may be useful for clock recovery, but are otherwise undesirable. With pseudo-random
chip durations Tc around the nominal value of T /Q, spectral lines can be avoided.
PPM can be generalized in different directions. Differential PPM (DPPM) has been pro-
posed in order to achieve power and/or bandwidth efficiency improvements [Shi99].
However, unequal symbol durations affect the illumination performance. A possible
workaround to this problem has been proposed in [Del10]. PPM can also be generalized in
the sense that multiple chips are activated simultaneously in order to increase the spectral
efficiency when the peak power is constrained. This technique is referred to as multipulse
PPM (MPPM) [Wil05b]. The advantages of Q-ary PAM (bandwidth efficiency) and PPM
(power efficiency) are combined in multiple pulse amplitude and position modulation
(MPAPM) [Zen15].

4.3.5 Variable Pulse Position Modulation (VPPM)

Q-ary variable pulse position modulation (Q-VPPM) is another generalization of Q-PPM.
Q-VPPM is a combination of Q-PPM with variable pulse widths (PWM). Hence, Q-VPPM
supports dimming. Fig. 4.8 illustrates 2-VPPM for different dimming levels, assuming that
the data sequence [001] is transmitted. For a dimming level of 50 %, 2-PPM and 2-VPPM
are identical. VPPM has been incorporated in the IEEE 802.15.7 VLC standard [IEEE802].

[Figure: 2-VPPM waveforms for the data sequence 0, 0, 1 over 0 ≤ t ≤ 3T.]

Figure 4.8 2-VPPM with different dimming levels.
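
The interplay of pulse position and pulse width in 2-VPPM can be sketched as follows. The convention used here (bit 0: pulse aligned to the start of the symbol, bit 1: pulse aligned to the end, pulse width equal to the dimming level times T) is an assumption made for this example, as are the function name and the sampling resolution.

```python
def vppm_symbol(bit, dimming, samples_per_symbol=10):
    """One 2-VPPM symbol: position carries the bit, width sets the brightness."""
    on = round(dimming * samples_per_symbol)
    off = samples_per_symbol - on
    if bit == 0:
        return [1] * on + [0] * off  # pulse at the beginning of the symbol
    return [0] * off + [1] * on      # pulse at the end of the symbol


data = [0, 0, 1]
waveform_20 = [s for bit in data for s in vppm_symbol(bit, 0.2)]
waveform_50 = [s for bit in data for s in vppm_symbol(bit, 0.5)]
brightness_20 = sum(waveform_20) / len(waveform_20)
```

The average brightness equals the dimming level regardless of the data, and at 50 % the symbols reduce to the two 2-PPM waveforms.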

An alternative method to provide dimming support is MPPM. In [Lee11], VPPM has been
compared with MPPM for this purpose. For MPPM, a higher bandwidth efficiency has been
reported compared to 2-VPPM.

4.3.6 Carrierless Amplitude and Phase Modulation (CAP)

Carrierless amplitude and phase modulation (CAP) is a substitute for quadrature
modulation. In the area of optical communications, CAP is currently becoming increasingly
popular, because it is an alternative to multi-carrier modulation. We will compare CAP with
quadrature modulation first, before a system design suitable for OWC is presented.
In the top part of Fig. 4.9 a carrier-modulated transmission scheme is illustrated, and in the
bottom part the corresponding carrierless variant. In carrier-modulated linear modulation
schemes, a time-invariant lowpass filter is used for pulse shaping. Its impulse response
$g_{\mathrm{Tx}}(t)$ is assumed to be real-valued and bandlimited. Let B/2 denote the single-sided
cut-off frequency. (Recall that B = (1 + r)/T for root-raised-cosine pulses.) After pulse shaping,
quadrature modulation is performed. The carrier frequency is denoted as $f_0$. The
smallest meaningful carrier frequency is $f_0 = B/2$. It is important to mention that the transmit
signal $s_{\mathrm{BP}}(t)$ is real-valued. A quadrature demodulator recovers the two quadrature
components. The optimal receive filter is a matched filter $g_{\mathrm{Rx}}(t) \sim g_{\mathrm{Tx}}^{*}(-t)$ followed by symbol-rate
sampling at rate 1/T with optimized sampling phase. Note that $(\cdot)^{*}$ denotes the complex
conjugate, i.e., $(a + jb)^{*} = a - jb$. In the noiseless case, the matched filter output samples
$y[k] := y_{\mathrm{Re}}[k] + j \, y_{\mathrm{Im}}[k]$ are identical with the data symbols $x[k] := x_{\mathrm{Re}}[k] + j \, x_{\mathrm{Im}}[k]$, if $g_{\mathrm{Tx}}(t)$
is a root-raised-cosine pulse and if clock synchronization is perfect. Imperfect clock
synchronization does not cause any interference between the quadrature components.
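[Figure: top, quadrature modulator with lowpass (LP) pulse-shaping filters gTx(t), carriers
√2 cos(2πf0t) and −√2 sin(2πf0t), lowpass receive filters gRx(t), and sampling with period T;
bottom, the same chain with bandpass (BP) filters √2 gTx(t) cos(·) and −√2 gTx(t) sin(·) at the
transmitter and √2 gRx(t) cos(·) and −√2 gRx(t) sin(·) at the receiver. Inputs are xRe[k] and
xIm[k], outputs are yRe[k] and yIm[k].]

Figure 4.9 Carrier-modulated transmission scheme (top) and carrierless modulation scheme
(bottom).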

In the CAP technique, lowpass (LP) filtering is substituted by bandpass (BP) filtering.
Quadrature modulation/demodulation becomes unnecessary. In order to deal with complex-valued
Q-QAM data symbols x[k], different time-invariant filters are implemented in the real and
imaginary branches, whose real-valued impulse responses $g_{\mathrm{Re}}(t)$ and $g_{\mathrm{Im}}(t)$ are orthogonal, i.e.,

$$ \frac{1}{T} \int g_{\mathrm{Re}}(t) \cdot g_{\mathrm{Im}}(t) \, \mathrm{d}t \overset{!}{=} 0. \tag{4.18} $$

This orthogonality constraint is fulfilled for impulse responses with equal amplitude
characteristic, which are phase-shifted by π/2. A preferable solution is

$$ g_{\mathrm{Re}}(t) = \sqrt{2} \, g_{\mathrm{Tx}}(t) \cos(2\pi f_0 t), \qquad g_{\mathrm{Im}}(t) = -\sqrt{2} \, g_{\mathrm{Tx}}(t) \sin(2\pi f_0 t), \tag{4.19} $$
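
The orthogonality condition (4.18) can be checked numerically for the filter pair (4.19). The sketch below assumes a root-raised-cosine pulse in the form of (4.5) with r = 0.5, truncated to ±8T, and f0 = B/2; the function names, the truncation span, and the step size are illustrative choices, and the removable singularities of the pulse are replaced by their limits.

```python
import math


def rrc_pulse(t, T=1.0, r=0.5):
    """Root-raised-cosine pulse in the form of (4.5); requires 0 < r <= 1."""
    x = t / T
    if abs(x) < 1e-9:
        return 1.0 - r + 4.0 * r / math.pi  # limit for t -> 0
    if abs(abs(x) - 1.0 / (4.0 * r)) < 1e-9:
        a = math.pi / (4.0 * r)  # limit for |t| -> T/(4r)
        return (r / math.sqrt(2.0)) * ((1 + 2 / math.pi) * math.sin(a)
                                       + (1 - 2 / math.pi) * math.cos(a))
    numerator = 4 * r * x * math.cos(math.pi * (1 + r) * x) + math.sin(math.pi * (1 - r) * x)
    denominator = math.pi * x * (1 - (4 * r * x) ** 2)
    return numerator / denominator


def cap_filters(t, T=1.0, r=0.5):
    """Filter pair of (4.19) with f0 = B/2 = (1 + r)/(2T)."""
    f0 = (1 + r) / (2 * T)
    g = rrc_pulse(t, T, r)
    g_re = math.sqrt(2.0) * g * math.cos(2 * math.pi * f0 * t)
    g_im = -math.sqrt(2.0) * g * math.sin(2 * math.pi * f0 * t)
    return g_re, g_im


def correlate(span=8.0, step=0.005, T=1.0, r=0.5):
    """Riemann sums of g_re*g_im, g_re^2 and g_im^2 over [-span, span], divided by T."""
    half = int(round(span / step))
    cross = energy_re = energy_im = 0.0
    for i in range(-half, half + 1):
        g_re, g_im = cap_filters(i * step, T, r)
        cross += g_re * g_im * step
        energy_re += g_re ** 2 * step
        energy_im += g_im ** 2 * step
    return cross / T, energy_re / T, energy_im / T


cross, energy_re, energy_im = correlate()
```

The cross term vanishes because the pulse is even, so the product of the two filters is an odd function of t, while both filters carry approximately the same energy.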

as depicted in the lower part of Fig. 4.9. Note that cross-talk between the quadrature
components vanishes only in the case of perfect clock synchronization, as opposed to
quadrature modulation. Furthermore, in CAP $f_0$ is a baseband frequency (e.g., $f_0 = B/2$).
Thus, CAP can be generated fully digitally.
This fundamental concept can easily be generalized towards multi-band transmission. Let
us assume M equidistant subbands and let f m be the carrier frequency of the m-th sub-
band, m ∈ {1, . . . , M }:
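$$ s_{\mathrm{CAP}}(t) = \sqrt{\frac{2}{M}} \sum_{m=1}^{M} \sum_{k} \Big( x_{\mathrm{Re},m}[k] \, g_{\mathrm{Tx}}(t - kT_u) \cos\!\big( 2\pi f_m (t - kT_u) \big) - x_{\mathrm{Im},m}[k] \, g_{\mathrm{Tx}}(t - kT_u) \sin\!\big( 2\pi f_m (t - kT_u) \big) \Big), \tag{4.20} $$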
where Tu := M T . If f m = (2m − 1)B /2, the frequency band is between 0 Hz and M B Hz,
see Fig. 4.10. For root-raised-cosine pulses, B = (1 + r )/Tu = (1 + r )/(T M ), i.e., the total
bandwidth is (1 + r )/T . This total bandwidth should be matched to the bandwidth of the
light source (and the coherence bandwidth of the physical channel in the presence of mul-
tipath; the coherence bandwidth considers time-variations [Pro08]). Bandwidth efficiency
is (log2 Q)/(1 + r ) bit/s/Hz if all subbands are modulated with the same scheme. Strictly
speaking, CAP is not carrierless, but a baseband I/Q mixing technique. Since B (and there-
fore the entire set of carrier frequencies f m ) is related to T without any ambiguity, no fre-
quency synchronization is necessary. This avoids additional receiver complexity (but also
phase jitter and frequency offset in coherent systems).
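
The subband layout and the bandwidth efficiency stated above amount to simple arithmetic, sketched below. The function names and the example parameters (M = 5, T = 1, r = 0.1) are illustrative assumptions.

```python
import math


def cap_subbands(M, T, r):
    """Subband bandwidth B = (1 + r)/(M T) and carriers f_m = (2m - 1) B / 2."""
    T_u = M * T
    B = (1 + r) / T_u
    carriers = [(2 * m - 1) * B / 2 for m in range(1, M + 1)]
    return B, carriers


def bandwidth_efficiency(Q, r):
    """(log2 Q)/(1 + r) bit/s/Hz if all subbands use the same Q-QAM scheme."""
    return math.log2(Q) / (1 + r)


B, carriers = cap_subbands(M=5, T=1.0, r=0.1)
total_bandwidth = 5 * B  # equals (1 + r)/T, independent of M
```

Since every carrier is a fixed multiple of B/2, the whole frequency grid is determined by the symbol period alone.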

[Figure: power spectral density Φss(f) with five adjacent subbands centered at f1, ..., f5.]

Figure 4.10 Power spectral density of CAP modulation (M = 5 for illustrative purpose).

In the area of VLC, CAP modulation has been investigated in several recent papers [Wu13,
Olm14, Hai15, Wan15b]. In order to adapt CAP to the frequency response of the LEDs
and the dispersive channel, the spectrum is divided into M equally spaced subbands. The
quadrature components of the Q-ary QAM symbols are passed through real and imaginary
transmit filters with impulse responses that form a Hilbert pair. Towards this goal, finite
impulse response (FIR) filtering is performed in the mentioned papers. The real and imag-
inary components are linearly superimposed prior to transmission according to (4.20). A
negative transmit signal can be avoided by adding a positive DC bias. A high-resolution
DAC is needed. Sufficiently long filter lengths are necessary in order to avoid cross-talk be-
tween adjacent subbands, i.e., to avoid adjacent channel interference (ACI). The required
sampling frequency and number of samples per symbol have been derived in [Olm14].
Although CAP is computationally complex, it offers high spectral efficiencies in bandlimited
IM/DD channels. Given a roll-off factor r = 0.1, bandwidth efficiencies of up to about
10 bit/s/Hz at 30 dB SNR have been predicted in [Hai15], if M is sufficiently large (M ≥ 20)
and if adaptive bit allocation and power loading are performed. (The optimum solution
is the so-called water-filling method, which is introduced in conjunction with orthogonal
frequency-division multiplexing in Section 4.5.1.) Consequently, CAP is an alternative to
multi-carrier modulation schemes. As opposed to OFDM and DMT (to be discussed in
Section 4.5), CAP is an orthogonal design with no spectral overlap between the subbands
(OFDM nomenclature: subcarriers). Therefore, CAP may be more robust with respect to
nonlinearities. This is a possible topic for future research. Finally, it should be mentioned
that CAP is very similar to wavelength-division multiplexing (WDM), the optical equiva-
lent of frequency-division multiplexing (FDM).

4.4 Color-Domain Modulation

The emitted light quality of phosphor-converted white LEDs is dictated by the average for-
ward current, rather than the instantaneous drive current, provided that an adequate ther-
mal management is used [Pop16]. Consequently, modulation schemes producing a data-
independent average transmit power, so-called DC-balanced modulation schemes, do not
significantly alter light quality metrics. Any modulation scheme discussed so far
fulfills the DC constraint, if employed together with sufficient scrambling (in order to ran-
domize the data stream) or line coding (in order to avoid long runs of zeros and ones).
Given a multi-channel luminaire, in the simplest case an RGB LED, color-domain modula-
tion schemes are perhaps a better solution. This family of modulation schemes is able to
exploit the additional degree of freedom of color space and to control color quality metrics
more directly. Besides these advantages, RGB LEDs offer a bandwidth of about 20 MHz per
color (i.e., approximately 60 MHz in total), which is approximately a thirty-fold improve-
ment compared to a single white LED.

4.4.1 Color Shift Keying (CSK)

In Q-ary color shift keying (CSK), data is mapped onto distinct color coordinates [IEEE802,
Mon14b]. Given Q different x y coordinates, log2 Q bits can be transmitted per time index k.
Q is not bounded by the number of light sources, M . Conventionally, a single tri-chromatic
LED set is assumed (M = 3). This tri-chromatic set, defined by the peak wavelengths of
M = 3 primary colors, forms a triangular gamut in the CIE 1931 xy chromaticity diagram.
The idea now is to define Q coordinates x y within the gamut, subject to the constraint that
the minimum Euclidean distance between all possible pairs of coordinates is as large as
possible. For the case of Q = 4 (4-CSK), the result is simple: three of the four coordinates
are the vertices of the triangle. The fourth coordinate is chosen to be the centroid of the
triangle, see the left-hand side of Fig. 4.11. Each of these Q = 4 coordinates represents a
data symbol. Since Gray labeling is impossible, labeling is almost arbitrary. Two bits can be
transmitted per symbol.
On the right-hand side of Fig. 4.11, an example of a 16-CSK constellation (Q = 16) is de-
picted. The main construction principle is to split the original triangle specified by the
three primaries into smaller triangles. The symbol coordinates are either determined by
the vertices of these smaller triangles, or their centroids. Four bits can be transmitted per
symbol.

Figure 4.11 Constellation diagram of 4-CSK (left) and 16-CSK (right) defined by IEEE 802.15.7.

In IEEE 802.15.7, the cases of Q = 4, Q = 8, and Q = 16 have been defined. Seven differ-
ent primaries are specified in the standard. Some combinations of primaries are useless,
because their Euclidean distance is too small. Other combinations of primaries are not
meaningful, since the corresponding spectral power distribution (SPD) is undesired. Out
of the $\binom{7}{3} = 35$ possible combinations, nine valid tri-chromatic sets (so-called color band combi-
nations) have been defined in the standard. The most appropriate color band combination
for the application under investigation is selected and implemented. Since the Euclidean
distances between the three selected primaries are not equal in the x y projection of
the CIE 1931 XYZ color space, the gamut is not symmetrical. Hence, the Q symbol coordi-
nates are not equidistant in practice either. Details will be presented in Chapter 6.

Figure 4.12 Block diagram of a CSK modulator. The three primaries are assumed to be red (R),
green (G), and blue (B).

A block diagram of a CSK modulator is laid out in Fig. 4.12. The mapping of the log2 Q data
bits onto the x y coordinates is done by a color encoder. This operation is performed in the
first step. The color encoder delivers the two CIE 1931 coordinates [x p , y p ]. In the second
step, chromaticities are converted into intensities. Given [x p , y p ] and the CIE 1931 coor-
dinates [x R , y R ], [x G , y G ], and [x B , y B ] of the selected primaries, by means of an intensity
modulator three non-negative intensities P R , P G , and P B are computed according to

$$x_p = P_R\, x_R + P_G\, x_G + P_B\, x_B \tag{4.21}$$

$$y_p = P_R\, y_R + P_G\, y_G + P_B\, y_B, \tag{4.22}$$

subject to the constraint $P_R + P_G + P_B = 1$. The intensities are independently converted into
the analog domain, before finally feeding the corresponding LED.
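
The conversion from chromaticity to intensities amounts to solving a 3 × 3 linear system formed by (4.21), (4.22), and the sum constraint. The following minimal sketch illustrates this step with Cramer's rule. The function names and the numerical primary coordinates are illustrative assumptions; they are not taken from the color band tables of IEEE 802.15.7.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def csk_intensities(target_xy, primaries_xy):
    """Solve (4.21), (4.22) and P_R + P_G + P_B = 1 for (P_R, P_G, P_B).

    target_xy    -- chromaticity [x_p, y_p] delivered by the color encoder
    primaries_xy -- three (x, y) pairs for the R, G, and B primaries
    """
    (x_r, y_r), (x_g, y_g), (x_b, y_b) = primaries_xy
    x_p, y_p = target_xy
    system = [[x_r, x_g, x_b],      # row for (4.21)
              [y_r, y_g, y_b],      # row for (4.22)
              [1.0, 1.0, 1.0]]      # row for the sum constraint
    rhs = [x_p, y_p, 1.0]
    d = det3(system)
    if abs(d) < 1e-12:
        raise ValueError("primaries are collinear, the gamut is degenerate")
    solution = []
    for col in range(3):            # Cramer's rule, one unknown per column
        m = [row[:] for row in system]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    if min(solution) < -1e-9:
        raise ValueError("target chromaticity lies outside the gamut")
    return tuple(solution)


# Hypothetical primaries (illustrative values only).
PRIMARIES = [(0.70, 0.30), (0.20, 0.70), (0.15, 0.05)]
centroid = (sum(x for x, _ in PRIMARIES) / 3, sum(y for _, y in PRIMARIES) / 3)
print(csk_intensities(centroid, PRIMARIES))
```

For the centroid symbol of 4-CSK, all three intensities come out equal, whereas a symbol placed on a vertex drives a single LED only.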
Detection can be performed either as light intensity detection or as chro-
maticity detection. Concerning intensity detection, the minimum Euclidean distance in
the signal space, responsible for the bit error performance, is identical for all nine color
band combinations [Sin14]. As a consequence, the error performance is the same for all
sets. A maximum-likelihood receiver selects the log2 Q data bits that best fit the re-
ceived pattern of intensities. However, if detection is performed in the chromaticity space,
the error performance for various color band combinations is different. Furthermore, in
chromaticity-based detection intensity demodulation must be performed before detec-
tion, which causes noise amplification. Therefore, for chromaticity-based detection, the
error performance is worse than for intensity detection.
Conventionally, CSK is based on a single tri-chromatic LED set. CIE 1931 based con-
stellation point optimizations have been studied in [Dro10] using billiard algorithms, in
[Mon14b] exploiting the interior point method, and in [Jia15] using the extrinsic informa-
tion transfer (EXIT) chart method. In [Sin14], an extension to a single quad-chromatic LED
set has been proposed. Due to increased Euclidean distances between the symbols, the
bit error performance improves. The main disadvantage is the higher
complexity at the transmitter side (in terms of an additional LED including driver circuit
plus DAC) as well as at the receiver side (in terms of an additional photodetector including
transimpedance amplifier and ADC). The work has been further extended in [Sin15] tak-
ing 256-CSK, 1024-CSK and 4096-CSK into account. A maximum bandwidth efficiency of
24 bit/s/Hz is reported. Generalized CSK (GCSK) that operates under varying target colors
independent of the number of LEDs has been disclosed in [Mur15].
Besides these optimizations and generalizations, several combinations involving CSK have
been developed. In [Lun14], the combination of CSK with constant-rate differential PPM
(DPPM) has been studied. Synchronization could be simplified while maintaining illu-
mination control. Similar concepts of merging CSK with PPM have been investigated in
[Del14, Per15].

4.4.2 Digital Color Shift Keying (DCSK)

In digital CSK (DCSK), multicolor LEDs are used where each LED element simply is oper-
ated in “on/off” mode [Mur16]. The information is encoded in the intensities of activated
colors. In contrast to conventional CSK, DCSK completely avoids DACs. Also, the driver
circuitry is much simpler. A possible degradation of color quality (in terms of color render-
ing) is the main drawback of DCSK. DCSK can be interpreted as an instance of superposition
modulation, cf. Section 4.7.
A block diagram of a DCSK transmission scheme is shown in Fig. 4.13. At the transmitter
side, NTx multi-color LEDs are used. Each multi-color LED can emit Ncol different colors,
where Ncol > 1. For example, in the case of RGB LEDs we have Ncol = 3. The total number
of transmit apertures is NT := NTx · Ncol .
At the receiver, NRx color sensors are applied, each resolving the same Ncol colors. The
total number of receive apertures is NR := NRx · Ncol . A single color sensor (NRx = 1) is
sufficient, but the error rate performance improves with multiple color sensors (NRx > 1).
Conceptually, the receiver is not different from a conventional CSK receiver.

Figure 4.13 Block diagram of a DCSK transmission scheme. The photodetectors should be
assisted by colored filters.

The main concept of DCSK is to switch the NT LED elements individually “on” or “off”, like
in OOK. In accordance with this goal, each LED element must have its own driver. The
hardware drivers are simple, however, since a current control is sufficient. A degradation
due to the nonlinear behavior of LEDs is avoided as far as possible. Furthermore, signal
processing is simple. Digital-to-analog conversion is not necessary. The intensity of a cer-
tain color is represented by the total number of “on” LED elements of that color. Therefore,
NTx + 1 different intensities can be achieved per color, if all LED elements of that color
are of the same type. Consequently, altogether $(N_{\mathrm{Tx}} + 1)^{N_{\mathrm{col}}}$ distinct constellation points
exist in the Ncol -dimensional color space. In order to guarantee flicker avoidance, most
of these constellation points should not be taken into account, however, because of data-
dependent intensities. As a possible workaround, in [Mur16] it is suggested to switch “on”
exactly NTx LED elements at a certain time. This strategy keeps a constant optical intensity.
The effective number of constellation points reduces to $\binom{N_{\mathrm{Tx}} + N_{\mathrm{col}} - 1}{N_{\mathrm{Tx}}}$. Concerning dimming
control, pulse width modulation (PWM) is appropriate.
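
The two counting results can be checked by brute-force enumeration. The sketch below lists every combination of per-color “on” counts and optionally keeps only those with exactly NTx active LED elements. The function name `dcsk_points` and the flag `constant_intensity` are illustrative choices, not notation from [Mur16].

```python
from itertools import product
from math import comb


def dcsk_points(n_tx, n_col, constant_intensity=True):
    """Enumerate DCSK constellation points as tuples of per-color 'on' counts.

    Each color can have 0..n_tx active LED elements. With constant_intensity,
    only points with exactly n_tx active elements in total are kept.
    """
    points = []
    for counts in product(range(n_tx + 1), repeat=n_col):
        if not constant_intensity or sum(counts) == n_tx:
            points.append(counts)
    return points


n_tx, n_col = 9, 3                       # nine RGB LEDs
all_points = dcsk_points(n_tx, n_col, constant_intensity=False)
flicker_free = dcsk_points(n_tx, n_col)
print(len(all_points), (n_tx + 1) ** n_col)
print(len(flicker_free), comb(n_tx + n_col - 1, n_tx))
```

For NTx = 9 RGB LEDs, the constant-intensity constraint leaves 55 of the 1000 points, which is still sufficient to host a 16-point CSK-like constellation.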
In order to verify the suitability of DCSK, in [Mur16] the goal has been to mimic conven-
tional CSK by DCSK, where conventional CSK is taken from the IEEE 802.15.7 VLC stan-
dard [IEEE802]. 4-CSK, 8-CSK and 16-CSK symbols, respectively, are represented by the
total amount of “on” intensities at a given time index. Besides conventional CSK, also (i)
CSK with a linear variable current driver including pre-distortion and (ii) CSK with linearly
controlled LED are considered for reference, because nonlinear effects degrade con-
ventional CSK. For NTx = 9 RGB LEDs, the BER performance of DCSK is similar to that of
the advanced CSK approaches (i) and (ii). Conventional CSK, however, has a worse perfor-
mance for all three symbol cardinalities under investigation.

4.4.3 Color Intensity Modulation (CIM)

Color intensity modulation (CIM) allows both the instantaneous transmission color and
intensity to be free from the target color and intensity [Ahn12]. Only the average color and
intensity shall meet the target. CIM is less constrained than CSK. In return, channel capac-
ity is slightly improved. On the other hand, intensity fluctuations are larger and a joint optimization
of the symbol constellation, of the symbol probabilities, and the labeling is demanding.

4.4.4 Metameric Modulation (MM)

In metameric modulation (MM), more than three LEDs are applied in order to optimize
color quality. Towards this goal, multiple tri-chromatic sets are defined [But12]. Each set
generates its own gamut. At a given time index k, just a single tri-chromatic set is activated,
i.e., three LEDs are “on”. Which set is actually activated is data-dependent. Hence, the key
concept of MM is to map the data symbols onto different tri-chromatic sets. When de-
signing the sets carefully, their spectral power distributions are metamerically equivalent.
In other words, the human eye is not able to distinguish between the chosen sets. A thor-
oughly designed digital receiver, however, is able to detect the activated set (and hence to
recover the data) without ambiguity.
Let us denote the number of different colors by M , where M > 3. Moreover, let us assume
without loss of generality that each primary set is generated by three light sources. Hence,
in the CIE 1931 xy chromaticity diagram the gamut of each set is triangular-shaped. Alto-
gether, $\binom{M}{3}$ possible primary sets exist. Let us consider Q out of the $\binom{M}{3}$ possible primary
sets, where Q is preferably a power of two.
In [But12], an example is given using M = 4 LEDs: red, green, cyan, and blue. Hence,
$\binom{M}{3} = 4$ primary sets exist. We consider just Q = 2 of these sets. As illustrated in Fig. 4.14,
the primaries of the first set are red, green, blue, and of the second set they are red, cyan,
blue. The gamuts of both sets are considerably overlapping. The perceived light color can
be made the same by set-wise controlling the intensities of the light sources.


Figure 4.14 CIE 1931 xy chromaticity diagram. The tri-chromatic sets have been proposed in
[But12] for metameric modulation. Although the blue source is active in both sets, its intensity is
different in each set. The same holds for red. Otherwise, different white points are achieved.

As proposed in [But12], only a single tri-chromatic set is “on” at a time. Hence, log2 Q bits
are transmitted per time index. For example, bit 0 is assigned to the first set and bit 1 to the
second set, respectively, in our illustrative example. The embedded modulation is invisible
to humans due to metamerism. The color rendering index would improve by taking the
remaining two primary sets into account as well. Try to identify them in Fig. 4.14.
The key advantage of MM compared to CSK is that the perceived light color is independent
of the data, for any possible data combination and at any time instant. The main disad-
vantage of MM is the increased hardware effort. Set-wise intensities must be adjusted pre-
cisely. The additional (cyan) LED needs an extra driver and control support. Furthermore,
the receiver must be able to detect the active tri-chromatic set. This adds extra computa-
tional complexity and increased optical effort (like filtering and an additional photodetec-
tor plus amplifier) at the receiver side.
An alternative to MM is generalized color modulation (GCM) presented in [Das13]. Color
is data-independent. GCM applies the CIELUV color space to make use of improved per-
ceptual uniformity.

4.4.5 Deep-Learning-Based Multicolor Transceiver Design

Deep learning (DL) is a popular subarea of artificial intelligence (AI) and neural net-
works (NN). DL provides computational models that are composed of multiple process-
ing layers to learn representations of data with multiple levels of abstraction [LeC15]. Ap-
plications include pattern recognition, object recognition, and object detection, among many
other fields. Although DL is trendy in the area of RF communications, the number of
journal publications in the area of optical wireless communications presently is small. In
[Lee18a], an end-to-end transceiver design for RGB LEDs is proposed, whereas in [Lee18b]
focus is on binary signaling.
DL is applicable to any equivalent discrete-time multiple-input multiple-output (MIMO)
channel model of the form

y = H · x + n. (4.23)

This generic model is relevant for various scenarios in OWC, including multi-color signal-
ing, LED arrays, pixelated light sources, and so forth. Further details on this channel model
will be discussed in the next chapter.
Without loss of generality, the RGB scenario presented in [Lee18a] will be studied next as a
possible MIMO application. This work is based on an autoencoder (AE). An AE is a feed-
forward neural network with a single input layer, a single or multiple hidden layers, and
a single output layer. The output layer has the same number of nodes as the input layer,
aiming to regenerate the corresponding input nodes. In the training phase, the AE network
is fed by training sequences subject to the constraint that the target outputs of the AE are
as close as possible to the associated inputs.
The VLC application under investigation is visualized on the left-hand side in Fig. 4.15. Al-
though modulator, physical channel, and demodulator can jointly be modeled by a single
AE network (between data vector u and respective estimates û), it is advisable to have sep-
arated networks representing transmitter and receiver sides, respectively, in order to cope
with individual constraints. One of these AE networks is shown on the right-hand side in
Fig. 4.15. Since the modulator is part of the training process, the overall BER performance
likely is better than for a non-optimized transmitter structure.

Figure 4.15 Block diagram of VLC scenario under consideration (left) and single-layer
autoencoder network (right).

For reasons of conciseness, focus is on a single hidden layer. A generalization is straightfor-
ward. Let us denote the number of nodes of the input/output layer by G, and the number
of nodes of the hidden layer by H. Furthermore, let $\mathbf{s}^{(t)} = [s_1^{(t)}, s_2^{(t)}, \dots, s_G^{(t)}]$ denote the t-th
input vector, 1 ≤ t ≤ T. This vector is mapped onto

$$\mathbf{h}^{(t)} := \phi_1\big(\mathbf{W}_1 \mathbf{s}^{(t)} + \mathbf{b}_1\big), \tag{4.24}$$

where $\mathbf{h}^{(t)} = [h_1^{(t)}, h_2^{(t)}, \dots, h_H^{(t)}]$. The t-th output vector is obtained as

$$\hat{\mathbf{s}}^{(t)} := \phi_2\big(\mathbf{W}_2 \mathbf{h}^{(t)} + \mathbf{b}_2\big), \tag{4.25}$$

where $\hat{\mathbf{s}}^{(t)} = [\hat{s}_1^{(t)}, \hat{s}_2^{(t)}, \dots, \hat{s}_G^{(t)}]$. φ1(.) and φ2(.) are activation functions, W1 and W2 are H × G
and G × H weight matrices, respectively, and b1 and b2 are bias terms. The objective is to mini-
mize a cost function between $\hat{\mathbf{s}}^{(t)}$ and $\mathbf{s}^{(t)}$ by optimizing weight matrices and bias terms. In
accordance with this goal, proper activation functions need to be selected. Common ex-
amples are linear activation $\phi_{1/2}(a_j) = a_j$ and sigmoid activation $\phi_{1/2}(a_j) = 1/(1 + e^{-a_j})$,
among others, where $a_j$ is the j-th element of vector a. The mean squared error and the
Kullback-Leibler divergence serve as possible cost functions. The number of iterations, T ,
should be sufficiently large.
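
The two layer mappings and the mean squared error cost can be sketched in a few lines. The example below performs a single forward pass through an untrained network with G = 4 and H = 2. All function names, the random initialization, and the one-hot input vector are illustrative assumptions, and the training loop itself (the optimization of weights and biases) is deliberately omitted.

```python
import math
import random


def sigmoid(a):
    """Element-wise sigmoid activation 1 / (1 + exp(-a_j))."""
    return [1.0 / (1.0 + math.exp(-v)) for v in a]


def linear(a):
    """Element-wise linear activation a_j."""
    return list(a)


def affine(weights, vector, bias):
    """Compute W * vector + b for a matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def forward(s, w1, b1, w2, b2, phi1=sigmoid, phi2=sigmoid):
    """Hidden layer h = phi1(W1 s + b1), output s_hat = phi2(W2 h + b2)."""
    h = phi1(affine(w1, s, b1))
    s_hat = phi2(affine(w2, h, b2))
    return h, s_hat


def mean_squared_error(s, s_hat):
    """Cost between the input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(s, s_hat)) / len(s)


G, H = 4, 2                                   # input/output and hidden nodes
rng = random.Random(0)                        # fixed seed for repeatability
W1 = [[rng.uniform(-1, 1) for _ in range(G)] for _ in range(H)]   # H x G
W2 = [[rng.uniform(-1, 1) for _ in range(H)] for _ in range(G)]   # G x H
b1, b2 = [0.0] * H, [0.0] * G

s = [1.0, 0.0, 0.0, 0.0]                      # one-hot input (assumption)
h, s_hat = forward(s, W1, b1, W2, b2)
print(h, s_hat, mean_squared_error(s, s_hat))
```

Training would repeatedly evaluate this cost over the T training vectors and adjust W1, W2, b1, and b2, for instance by gradient descent.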
The multi-color setup under investigation is applicable to all color-domain modulation
schemes, including CSK, DCSK, CIM, MM, and GCM. In this context, important con-
straints should be considered: peak-intensity constraint, color quality constraints, flicker
constraint, and dimming constraint. Towards this goal, it is advisable to employ several
layers per AE network and transmitter-side post-processing in order to fulfill the men-
tioned constraints. Also, a proper model for the channel matrix H needs to be defined. As
there is interference between adjacent colors, H is of Toeplitz-like structure.
In [Lee18a], the AE concept is matched to CIM. Three hidden layers are used, thereof two
at the transmitter side, together with post-processing to cope with the lighting constraints.
The symbol error rate is reported to outperform that of the corresponding benchmark sys-
tem. Concerning a detailed description of the AE design and the performance results, the
interested reader is referred to [Lee18a].

4.5 Multi-Carrier Modulation (MCM)

In multi-carrier modulation schemes, data is assigned to several subcarriers in parallel. In
the context of IM/DD, transmission is still performed at baseband.
As mentioned before, equalization is simpler for MCM techniques. Together with a cyclic
prefix, equalization can be performed by a so-called 1-tap equalizer. In the presence of
multipath, the energy efficiency can be optimized by adaptive bit allocation and power
loading. Low-frequency subcarriers can be avoided in order to mitigate an intentional DC
bias, unintentional DC wander, and/or low-frequency interference. In hybrid VLC/Wi-Fi,
hybrid VLC/LTE and hybrid VLC/PLC systems, MCM functionality can be re-used. Also,
multiuser communication is simplified by MCM.
Among the drawbacks of MCM is its sensitivity to the nonlinear characteristic of solid-state light sources.
For large signal variations, the emitted optical power is not exactly proportional to the for-
ward current. Predistortion and biasing are useful in order to linearize the channel. The
high peak-to-average power ratio (PAPR) of MCM creates negative effects (but also positive
ones, as discussed below). Moreover, the constraint of non-negative real-valued signaling
affects both power efficiency and bandwidth efficiency of MCM in a negative way. For the
same bandwidth efficiency, FFT processing must be conducted at twice the speed com-
pared to MCM in radio systems.
A popular real-valued MCM scheme is discrete multitone transmission (DMT). DMT is a
special case of orthogonal frequency-division multiplexing (OFDM). For didactic reasons,
we will explain OFDM first before discussing several DMT alternatives, although OFDM in its
original form is not suitable for IM/DD. A thorough overview of numerous versions of DMT
and related MCM schemes is provided subsequently.

4.5.1 Orthogonal Frequency-Division Multiplexing (OFDM)

OFDM is a linear multi-carrier modulation scheme. In multi-carrier systems, the data sym-
bols are transmitted on N subcarriers in parallel. In linear multi-carrier systems, each sub-
carrier may be modulated with an individual linear modulation scheme. Before transmis-
sion, all subcarrier signals are linearly superimposed. In complex baseband notation, an
OFDM signal can be expressed as [Sal67]

$$s(t) = \frac{1}{\sqrt{N}} \sum_{k} \sum_{n=0}^{N-1} x_n[k] \cdot g_n(t - kT_u), \tag{4.26}$$

where k is the time index, N is the number of information carrying subcarriers (usually N
is an even integer), x n [k] is the k-th data symbol of the n-th subcarrier (n ∈ {0, 1, . . . , N −1}),

$$g_n(t) = \begin{cases} \exp\big(j 2\pi (n/T_u)\, t\big) & \text{for } 0 \le t \le T_u \\ 0 & \text{else} \end{cases} \tag{4.27}$$

is the baseband pulse of the n-th subcarrier, Tu := N · T is the OFDM frame duration, and
1/T is the symbol rate. For highest possible bandwidth efficiency, the data symbols x n [k]
should be complex-valued. Alternatively, the baseband pulse of the n-th subcarrier can be
written as
$$g_n(t) = \exp\big(j 2\pi (n/T_u)\, t\big) \cdot \mathrm{rect}\!\left(\frac{t}{T_u}\right), \tag{4.28}$$

where rect(t /Tu ) is defined to be a causal unit-gain rectangular pulse of duration Tu . Due
to the normalization $1/\sqrt{N}$ in (4.26), the average transmit power is one, independent of the
number of subcarriers.
In the frequency domain, each subcarrier is a shifted sinc pulse. The sinc pulses are
equidistantly spaced in the frequency domain. The spectra of all N subcarriers overlap,
which is the key recipe towards spectral efficiency. The n-th subcarrier is centered at fre-
quency f n = n/Tu = n/(N T ). The first subcarrier (n = 0) is called DC subcarrier, the middle
subcarrier (n = N /2) is called Nyquist tone.
Although the equidistantly-spaced subcarriers with spacing 1/Tu are mutually overlapping,
they are orthogonal, since

$$\frac{1}{T_u} \int_0^{T_u} g_n(t) \cdot g_j^*(t)\, dt = \begin{cases} 1 & \text{for } n = j \\ 0 & \text{for } n \ne j \end{cases} \qquad n, j \in \{0, 1, \dots, N-1\}. \tag{4.29}$$
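
The orthogonality relation (4.29) is easy to verify numerically. The following sketch approximates the integral by a midpoint sum. The helper names, the choice Tu = 1, and the number of integration steps are assumptions made for illustration.

```python
import cmath


def subcarrier_pulse(n, t, t_u):
    """Baseband pulse g_n(t) of (4.27): complex exponential on [0, Tu], else 0."""
    if 0.0 <= t <= t_u:
        return cmath.exp(2j * cmath.pi * n * t / t_u)
    return 0j


def normalized_inner_product(n, j, t_u=1.0, steps=1000):
    """Midpoint-rule approximation of (1/Tu) * integral of g_n(t) g_j*(t) over [0, Tu]."""
    dt = t_u / steps
    acc = 0j
    for i in range(steps):
        t = (i + 0.5) * dt
        acc += subcarrier_pulse(n, t, t_u) * subcarrier_pulse(j, t, t_u).conjugate() * dt
    return acc / t_u


print(abs(normalized_inner_product(3, 3)))   # same subcarrier, close to 1
print(abs(normalized_inner_product(3, 5)))   # different subcarriers, close to 0
```

The result does not depend on the spectral overlap of the subcarriers: the product of two different pulses completes an integer number of periods within Tu, so its integral vanishes.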

Due to this orthogonality property, by means of matched filtering the data can be recov-
ered without any information loss compared to a single-carrier Nyquist system. According
to the theory of Fourier series,

$$x_n[k] = \frac{\sqrt{N}}{T_u} \int_{kT_u}^{(k+1)T_u} s(t) \cdot \exp(-j 2\pi n t/T_u)\, dt. \tag{4.30}$$

The matched-filter receiver is an integrate & dump receiver, applied to back-rotated sub-
carrier signals. The analog matched-filter output signal is
$$y_n(t) = r(t) * \frac{1}{T_u}\, g_n^*(T_u - t) \overset{!}{=} r(t) * \frac{1}{T_u}\, g_n^*(t) = \frac{1}{T_u} \int_{-\infty}^{\infty} r(\tau) \cdot g_n^*(t - \tau)\, d\tau. \tag{4.31}$$
After sampling once per OFDM frame duration,

$$y_n(t)\Big|_{t=(k+1)T_u} := y_n[k] = \frac{1}{T_u} \int_{kT_u}^{(k+1)T_u} r(\tau) \cdot \exp\big(-j 2\pi (n/T_u)\, \tau\big)\, d\tau \tag{4.32}$$

is obtained. Fig. 4.16 shows a block diagram of an OFDM transmission system with
matched-filter receiver. Due to orthogonality, there is no cross-talk in the frequency
domain between an arbitrary input carrier n and an arbitrary output carrier j (n, j ∈
{0, 1, . . . , N − 1}), if n ̸= j . An OFDM transmission system is like N parallel, independent
transmission systems. Each subcarrier forms a subchannel.
In the time domain, orthogonality is given as well. In the presence of dispersive channels,
a classical equalizer is avoidable, as discussed later.

Figure 4.16 Block diagram of an OFDM transmission system. The transmit pulses are
gn(t) = exp(j2πnt/Tu) · rect(t/Tu). The receiver consists of a bank of matched filters gn*(Tu − t),
sampled at kTu. Symbol duration is T, frame duration is Tu = N T.

The power spectral density (PSD) of an OFDM transmit signal depends on the number
of subcarriers. (The power spectral density (PSD) is not to be confused with the spectral
power distribution (SPD) of a light source.) The PSD is calculated as

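$$\Phi_{ss}(f) \sim \sum_{n=0}^{N-1} |G_{\mathrm{Tx}}(f - f_n)|^2, \quad \text{where } g_{\mathrm{Tx}}(t) := \mathrm{rect}(t/T_u) \;\circ\!\!-\!\!\bullet\; G_{\mathrm{Tx}}(f) = \frac{\sin(\pi f T_u)}{\pi f T_u}. \tag{4.33}$$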

If all subcarriers have the same average power and if the data symbols on all subcarriers
are uniformly random distributed with zero mean, one obtains
$$\Phi_{ss}(f) \sim \sum_{n=0}^{N-1} \frac{\sin^2\big(\pi (f - f_n) T_u\big)}{\big(\pi (f - f_n) T_u\big)^2} = \sum_{n=0}^{N-1} \frac{\sin^2\big(\pi N (fT - n/N)\big)}{\big(\pi N (fT - n/N)\big)^2}. \tag{4.34}$$

Fig. 4.17 shows the normalized PSD for N = 16, 64, 256, and 1024 subcarriers. With increas-
ing N , the PSD of an ideal low pass filter (sometimes called Nyquist system) with double-
sided bandwidth B = 1/T is approached. The side lobes and out-of-band radiation/illumi-
nation diminish with increasing N . For this reason, OFDM in conjunction with a suffi-
ciently large number of subcarriers (and high-order subcarrier modulation) is bandwidth
efficient. The total bandwidth should be matched to the bandwidth of the light source (and
the coherence bandwidth of the physical channel in the presence of multipath).
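
The behavior of (4.34) can be reproduced with a short numerical sketch. The function below evaluates the right-hand side of (4.34) as written, i.e., with subcarriers located at fT = n/N for n = 0, ..., N − 1, and without any further normalization or centering of the frequency axis. The function names are illustrative assumptions.

```python
import math


def sinc_squared(x):
    """(sin(pi x) / (pi x))^2 with the limit value 1 at x = 0."""
    if abs(x) < 1e-12:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2


def ofdm_psd(f_t, n_sub):
    """Right-hand side of (4.34) at normalized frequency f*T for N = n_sub subcarriers."""
    return sum(sinc_squared(n_sub * f_t - n) for n in range(n_sub))


for n_sub in (16, 64, 256, 1024):
    in_band = ofdm_psd(0.5 + 0.5 / n_sub, n_sub)    # midway between two subcarriers
    out_of_band = ofdm_psd(1.37, n_sub)             # arbitrary point outside the band
    print(n_sub, in_band, out_of_band)
```

Inside the band, the overlapping sinc-squared terms add up to an almost flat level, whereas the out-of-band value at a fixed normalized frequency shrinks as N grows.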
The breakthrough of OFDM has been triggered by the insight that an OFDM signal

$$s(t) = \frac{1}{\sqrt{N}} \sum_{k} \sum_{n=0}^{N-1} x_n[k] \exp\big(j 2\pi n t/T_u\big)\, \mathrm{rect}\!\left(\frac{t - kT_u}{T_u}\right) \tag{4.35}$$

Figure 4.17 Power spectral density of OFDM for different number N of subcarriers [Hoe13].

can be realized by means of an inverse discrete Fourier transform (IDFT) at the trans-
mitter side [Wei71]. In the first OFDM frame interval 0 ≤ t < Tu (i.e. for k = 0), we obtain

$$s(t) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n[0] \exp\big(j 2\pi n t/T_u\big). \tag{4.36}$$

Upon substitution t = mTu /N , we get

$$s_m[0] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n[0] \exp\big(j 2\pi n m/N\big), \quad m \in \{0, 1, \dots, N-1\}. \tag{4.37}$$

In the general case,

$$s_m[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n[k] \exp\big(j 2\pi n m/N\big), \quad m \in \{0, 1, \dots, N-1\}. \tag{4.38}$$

Due to periodicity exp( j 2π(N − n)m/N ) = exp(− j 2πnm/N ), one can distinguish between
positive and negative subcarriers. Equation (4.38) corresponds to an IDFT, except for the nor-
malization factor $1/\sqrt{N}$. Hence, at the receiver side the data symbols can be recovered by
a discrete Fourier transform (DFT):

$$x_n[k] = \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} s_m[k] \exp\big(-j 2\pi n m/N\big), \quad n \in \{0, 1, \dots, N-1\}. \tag{4.39}$$
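
The transform pair (4.38) and (4.39) can be tried out directly. The following sketch evaluates both sums literally, with quadratic complexity, for one OFDM frame. The function names and the 4-QAM test symbols are illustrative assumptions.

```python
import cmath
import math


def ofdm_modulate(x):
    """Time-domain samples s_m[k] from data symbols x_n[k] according to (4.38)."""
    n_sub = len(x)
    return [sum(x[n] * cmath.exp(2j * cmath.pi * n * m / n_sub) for n in range(n_sub))
            / math.sqrt(n_sub)
            for m in range(n_sub)]


def ofdm_demodulate(s):
    """Data symbols x_n[k] from time-domain samples s_m[k] according to (4.39)."""
    n_sub = len(s)
    return [sum(s[m] * cmath.exp(-2j * cmath.pi * n * m / n_sub) for m in range(n_sub))
            / math.sqrt(n_sub)
            for n in range(n_sub)]


# One frame of 4-QAM symbols on N = 8 subcarriers (arbitrary test data).
x = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
s = ofdm_modulate(x)
x_rec = ofdm_demodulate(s)
print(max(abs(a - b) for a, b in zip(x, x_rec)))
```

Because both directions use the factor $1/\sqrt{N}$, the pair is unitary: the energy of the N time-domain samples equals the energy of the N data symbols.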

If the number of information carrying subcarriers, N , is a power of two, modulation (4.38)
and demodulation (4.39) can efficiently be realized by means of an inverse fast Fourier
transform (IFFT) and a fast Fourier transform (FFT), respectively. This reduces the com-
putational complexity compared to a direct IDFT and DFT. Typically N is not
a power of two, however. Therefore, the information carrying subcarriers are extended
by empty subcarriers in a clever way so that the overall number of subcarriers, NFFT , is a
power of two. Given any arbitrary number of active subcarriers N , NFFT is often taken to
be the next power of two. IFFT and FFT are usually realized with the same number of FFT
points, NFFT . This corresponds to oversampling by the ratio NFFT /N in the time domain.
Oversampling provides a smoother transmit signal in the time domain. In the frequency
domain, oversampling simplifies an analog filter design in order to suppress out-of-band
radiation/illumination. As a result, in both domains it is meaningful to select NFFT > N .

Figure 4.18 Shuffling of frequency points before IFFT (N = 10, NFFT = 16) [Hoe13].

An extension with NFFT − N empty subcarriers corresponds to zero padding in the fre-
quency domain. Each data vector $\mathbf{x}[k] = [x_0[k], x_1[k], \dots, x_{N-1}[k]]$ of length N conceptually
is copied onto a data vector $\mathbf{x}'[k] = [x'_0[k], x'_1[k], \dots, x'_{N_{\mathrm{FFT}}-1}[k]]$ of length NFFT according to

$$x'_n[k] := \begin{cases} x_n[k] & \text{for } n \in \{0, \dots, N/2 - 1\} \\ 0 & \text{for } n \in \{N/2, \dots, N_{\mathrm{FFT}} - N/2 - 1\} \\ x_{n - N_{\mathrm{FFT}} + N}[k] & \text{for } n \in \{N_{\mathrm{FFT}} - N/2, \dots, N_{\mathrm{FFT}} - 1\}, \end{cases} \tag{4.40}$$

see Fig. 4.18. The NFFT − N zeros must be inserted in the middle for proper oversampling.
Note that NFFT − N does not need to be an even number. As a result, the k-th power-
normalized OFDM sample is

$$s_m[k] = \frac{N_{\mathrm{FFT}}}{\sqrt{N}}\, \mathrm{IFFT}\{x'_n[k]\}, \quad m, n \in \{0, 1, \dots, N_{\mathrm{FFT}} - 1\}, \tag{4.41}$$

where the IFFT is defined as

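$$\mathrm{IFFT}\{x'_n[k]\} = \frac{1}{N_{\mathrm{FFT}}} \sum_{n=0}^{N_{\mathrm{FFT}}-1} x'_n[k]\, e^{j 2\pi m n/N_{\mathrm{FFT}}}, \quad m \in \{0, 1, \dots, N_{\mathrm{FFT}} - 1\}. \tag{4.42}$$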

At the receiver side, correspondingly an FFT is used for demodulation,

$$y'_n[k] = \mathrm{FFT}\{r_m[k]\}, \quad m, n \in \{0, 1, \dots, N_{\mathrm{FFT}} - 1\}, \tag{4.43}$$

where the FFT is defined as

$$\mathrm{FFT}\{r_m[k]\} = \sum_{m=0}^{N_{\mathrm{FFT}}-1} r_m[k]\, e^{-j 2\pi m n/N_{\mathrm{FFT}}}, \quad n \in \{0, 1, \dots, N_{\mathrm{FFT}} - 1\}. \tag{4.44}$$

Finally, the original sequence of subcarriers is reconstructed:

$$y_n[k] := \begin{cases} y'_n[k] & \text{for } n \in \{0, \dots, N/2 - 1\} \\ y'_{n + N_{\mathrm{FFT}} - N}[k] & \text{for } n \in \{N/2, \dots, N - 1\}. \end{cases} \tag{4.45}$$
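
The index shuffling of (4.40) and its inverse (4.45) reduce to simple list operations. The sketch below uses the setting of Fig. 4.18 (N = 10, NFFT = 16). The function names are illustrative, and an even N is assumed, as in the text.

```python
def pad_subcarriers(x, n_fft):
    """Map N data symbols onto N_FFT frequency points according to (4.40).

    The first N/2 symbols stay at the lowest indices, the last N/2 symbols
    move to the highest indices, and N_FFT - N zeros fill the middle.
    """
    n_sub = len(x)
    if n_sub % 2 != 0 or n_fft < n_sub:
        raise ValueError("N must be even and must not exceed N_FFT")
    half = n_sub // 2
    return x[:half] + [0] * (n_fft - n_sub) + x[half:]


def unpad_subcarriers(y_prime, n_sub):
    """Recover the N active subcarriers from N_FFT points according to (4.45)."""
    half = n_sub // 2
    n_fft = len(y_prime)
    return y_prime[:half] + y_prime[n_fft - half:]


x = list(range(1, 11))                 # placeholder symbols x_0 ... x_9
x_prime = pad_subcarriers(x, 16)
print(x_prime)
print(unpad_subcarriers(x_prime, 10) == x)
```

Keeping the zeros in the middle leaves the active subcarriers at the lowest positive and negative frequencies, which is what makes the longer IFFT act as an interpolator.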

Oversampling does not alter spectral efficiency. Without oversampling, up to NFFT data
symbols can be transmitted in a double-sided bandwidth of 1/T , if NFFT is sufficiently
large. With oversampling, N data symbols can be transmitted in a double-sided bandwidth
of N /(NFFT T ). The effect of oversampling on an OFDM frame is featured in Fig. 4.19. With
oversampling, the time-domain signal becomes smoother due to interpolation. Given
N = 128 active subcarriers and NFFT = 256 frequency points, in this example NFFT −N = 128
zeros are inserted in subcarriers N /2, . . . , NFFT − N /2 − 1.

Figure 4.19 Effect of oversampling on an excerpt of an OFDM frame (N = 128, without
oversampling NFFT = 128 (black circles), with oversampling NFFT = 256 (white circles), 64-QAM).
The smooth straight and broken lines are obtained by 8-times oversampling. Axes: transmit
signal (real part and imaginary part) versus time index.

So far, it seems that the only benefits of OFDM are the facts that out-of-band radiation/illu-
mination is small and that pulse shaping can be performed by FFT processing. The power
efficiency of OFDM is determined by the subcarrier mapping and by the power allocation.
If the channel is non-dispersive (i.e., if the signal bandwidth is smaller than the bandwidth
of the light source and the coherence bandwidth of the physical channel), the average noise
power is the same for all subcarriers. Accordingly, for non-dispersive channels the total
transmit power should be uniformly distributed across all subcarriers, referred to as uni-
form power allocation. As a consequence, the same modulation scheme should be used
on all subchannels.

However if the channel is dispersive, i.e. frequency selective, multi-carrier modulation be-
comes beneficial, because it provides more degrees of freedom for system optimization. In
dispersive channels, the average noise power, Nn , is different from subcarrier to subcarrier.
This scenario is illustrated in Fig. 4.20 for N = 8 subcarriers, 0 ≤ n ≤ N − 1. Let us assume
that the subchannels are statistically independent (due to orthogonality) and that the N
noise powers are known at the transmitter side. Then, the optimal power allocation is de-
termined by the so-called water-filling method, also known as the water-pouring solution
[Han06]. In water filling, the signal powers S n are optimized according to
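\[
\begin{aligned}
S_n + N_n &= \Theta && \text{for } N_n < \Theta \\
S_n &= 0 && \text{for } N_n \ge \Theta.
\end{aligned} \tag{4.46}
\]
The “water level” Θ is chosen so that the signal powers sum to the total transmit power,
$\sum_{n=0}^{N-1} S_n = P$. It becomes evident from Fig. 4.20 that the
signal-to-noise ratio, $S_n/N_n$, is different from subcarrier to subcarrier.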


[Figure: signal powers S_n on top of the noise powers N_n versus subcarrier n = 0, ..., 7]
Figure 4.20 Water-filling method applied to OFDM.
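A possible way to compute the water level in (4.46) is to sort the noise powers and to test, for a decreasing number of active subcarriers, whether the resulting level lies above the largest active noise power. The sketch below follows this idea; the function name `water_filling`, the noise powers, and the total power P = 10 are hypothetical values chosen for illustration and are not taken from Fig. 4.20.

```python
def water_filling(noise, total_power):
    """Return (water level, signal powers S_n) according to (4.46)."""
    order = sorted(noise)
    # Try K active subcarriers, starting with all of them.
    for k in range(len(order), 0, -1):
        level = (total_power + sum(order[:k])) / k
        if level > order[k - 1]:   # level lies above the largest active noise power
            break
    powers = [max(level - nn, 0.0) for nn in noise]
    return level, powers

# Hypothetical noise powers N_n for N = 8 subcarriers and total power P = 10
noise = [2.0, 5.0, 3.0, 1.0, 9.0, 4.0, 4.0, 3.0]
theta, S = water_filling(noise, 10.0)
snr = [s / nn for s, nn in zip(S, noise)]   # differs from subcarrier to subcarrier
```

With these numbers the two noisiest subcarriers receive no power at all, while the least noisy subcarrier receives the largest share.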

Due to orthogonality, a different number of info bits per data symbol can be used on each
subcarrier. The number of info bits per data symbol should be matched to the corresponding
signal-to-noise ratio per subcarrier. If a subchannel is noisy or strongly attenuated
(like subcarrier 3 in Fig. 4.20), a binary modulation scheme may be applied, or
no information may be transmitted at all via this subchannel (like subcarrier 4 in Fig. 4.20).
Contrarily, on subchannels with a high signal-to-noise ratio (like subcarrier 1), a high-order
modulation scheme should be used. This is called bit-loading. Non-uniform power alloca-
tion in conjunction with bit-loading is only possible in multi-carrier systems. Both meth-
ods should be optimized jointly.
Another distinct advantage of OFDM is the handling of ISI. Any dispersive (i.e., frequency
selective) channel causes ISI. ISI destroys orthogonality. Without any means for ISI com-
pensation, OFDM would be even worse than single-carrier modulation for this reason. For-
tunately, it is fairly easy to extend the frame duration of OFDM, because according to (4.28)
OFDM employs rectangular pulse shaping. The artificial extension of the OFDM frame du-
ration Tu is called guard interval. Let us denote the length of the guard interval by ∆. Then,
the overall length of an OFDM frame is Ts := Tu + ∆. The subcarrier spacing remains
1/Tu . As depicted in Fig. 4.21, a guard interval can be realized in two different forms.
On the left-hand side of Fig. 4.21 the n-th baseband pulse $g_n(t)$, n ∈ {0, . . . , N − 1}, is
preceded by zeros. This method is called zero padding. On the right-hand side of Fig. 4.21
$g_n(t)$ is cyclically expanded. This method is called cyclic extension or cyclic prefix. In
both versions, the bandwidth efficiency is reduced by the factor $T_u/T_s = 1/(1 + \Delta/T_u)$, because
the guard interval does not contribute to information transmission. If the impulse response of the
physical channel including light source does not exceed ∆, ISI is avoided completely when
the integrate & dump operation is limited to the interval ∆ ≤ t ≤ Ts of duration Tu . In
the case of cyclic extension, the orthogonality is completely maintained if the channel
does not change within the OFDM frame duration. The power loss is $10 \log_{10}(T_s/T_u)$ dB.
Zero padding destroys orthogonality, but no transmit power is wasted in the guard interval.
Fig. 4.22 shows a block diagram of an OFDM transmitter with cyclic extension. The
mapping of N information-carrying subcarriers onto NFFT subcarriers is called subcarrier
mapping; the remaining NFFT − N subcarriers are set to zero.

[Figure: two panels showing $g_1(t)$ and $g_2(t)$ versus $t/T_u$ from 0 to 1.25, amplitude range −1 to 1]

Figure 4.21 Guard interval applying zero padding (left) and cyclic extension (right)
for the example of ∆ = Tu /4. For clarity, only the subcarriers with frequency f 1 = 1/Tu and
f 2 = 2/Tu are considered.
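As a short worked example, take the guard interval ∆ = Tu/4 used in Fig. 4.21. The useful fraction of the frame and the power loss of the cyclic extension then follow directly from Ts = Tu + ∆:

```latex
\[
\frac{T_u}{T_s} = \frac{T_u}{T_u + T_u/4} = 0.8, \qquad
10 \log_{10}\!\left(\frac{T_s}{T_u}\right) = 10 \log_{10}(1.25) \approx 0.97~\text{dB}.
\]
```

In other words, one fifth of each frame carries no information, and with cyclic extension slightly less than 1 dB of transmit power is spent on the guard interval.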
[Block diagram; legible labels: “N subcarriers”, “NFFT point”]

Figure 4.22 Block diagram of an OFDM transmitter with cyclic extension.

The system-theoretical significance of the cyclic extension is that the linear convolution
between the channel impulse response h(t ) and the transmit signal s(t ), r (t ) = s(t ) ∗ h(t ) +
n(t ), transforms into a circular convolution ⊛. If the circular convolution holds in time do-
main, the convolution theorem of the DFT is applicable, i.e., the DFTs are multiplicative:

\[
\underbrace{\mathrm{DFT}\{s_n[k] \circledast h_n[k] + w_n[k]\}}_{y_n[k]} = \underbrace{\mathrm{DFT}\{s_n[k]\}}_{x_n[k]} \cdot \underbrace{\mathrm{DFT}\{h_n[k]\}}_{H_n[k]} + \underbrace{\mathrm{DFT}\{w_n[k]\}}_{W_n[k]}. \tag{4.47}
\]

Correspondingly, each subcarrier n has an individual weighting factor Hn [k] ∈ C, where
n ∈ {0, . . . , N − 1}. In reality, the weighting factors are correlated in the time domain (i.e.,
with respect to k) and in the frequency domain (i.e., with respect to n). Given a known data
symbol x n [k] (“pilot symbol”), a simple channel estimator calculates

y n [k]/x n [k] = Hn [k] + Wn [k]/x n [k]. (4.48)

Vice versa, given the weighting factor Hn [k], the data symbol x n [k] can be estimated as

y n [k]/Hn [k] = x n [k] + Wn [k]/Hn [k]. (4.49)

According to the last formula, the influence of the channel is easy to compensate at the
receiver side. Several authors refer to this as 1-tap equalization. The set of weighting factors
corresponds to the sampled transfer function of the dispersive channel. Therefore, an N point
IDFT transforms the weighting factors Hn [k], n ∈ {0, . . . , N − 1}, into the instantaneous im-
pulse response for a fixed time index k.
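The interplay of cyclic extension, circular convolution, and 1-tap equalization according to (4.47) and (4.49) can be checked numerically. The sketch below is a noise-free toy example under stated assumptions: the three-tap impulse response `h`, the prefix length of two samples, and the 4-QAM symbols are hypothetical, and a naive DFT is used instead of an FFT.

```python
import cmath

def dft(v, sign=-1):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(v)
    return [sum(v[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def idft(v):
    return [c / len(v) for c in dft(v, sign=+1)]

x = [1+1j, -1+1j, 1-1j, -1-1j, 1+1j, 1-1j, -1+1j, -1-1j]  # data symbols x_n
h = [1.0, 0.5, 0.25]   # hypothetical channel impulse response (3 taps)
cp = 2                 # cyclic prefix length >= len(h) - 1

s = idft(x)            # time-domain OFDM frame
tx = s[-cp:] + s       # cyclic extension: copy the last cp samples to the front

# Linear convolution with the channel (noise-free), truncated to the frame length
rx = [sum(h[l] * tx[i - l] for l in range(len(h)) if i - l >= 0) for i in range(len(tx))]

r = rx[cp:]                                  # discard the guard interval
H = dft(h + [0.0] * (len(x) - len(h)))       # weighting factors H_n
y = dft(r)                                   # y_n = H_n * x_n
x_hat = [yn / Hn for yn, Hn in zip(y, H)]    # 1-tap equalization, cf. (4.49)
```

Because the prefix is at least as long as the channel memory, the linear convolution acts like a circular one on the retained samples, and a single complex division per subcarrier recovers the data symbols.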

Altogether, the advantages and disadvantages of OFDM compared to a single-carrier
modulation system can be formulated as follows:
■ Advantages of OFDM:
– By means of a guard interval (zero padding or cyclic extension), ISI is avoid-
able without classical equalizer in the presence of a dispersive channel.
OFDM modulation/demodulation transforms a frequency selective channel
into N parallel non-dispersive channels. Therefore, 1-tap equalization is sufficient.
– If the number N of subcarriers is sufficiently large, out-of-band
radiation/illumination is small. Then, the bandwidth efficiency of OFDM is
$\frac{N}{N+\Delta} \log_2 Q$ bit/s/Hz (with the guard interval ∆ counted in samples). Without
guard interval (∆ = 0) the double-sided bandwidth approaches 1/T . In other words,
OFDM approaches a Nyquist system, which can still be realized in practice. With guard
interval, however, the bandwidth efficiency reduces by the factor $1/(1 + \Delta/T_u)$. Then,
OFDM is comparable to a single-carrier modulation scheme with roll-off r = ∆/Tu .
– OFDM offers the possibility of adaptive bit allocation and power loading
according to the water-filling method. Low-frequency subcarriers can be
avoided in order to mitigate an intentional DC bias, unintentional DC wander,
and/or low-frequency interference. For best bandwidth efficiency, the data
symbols should be complex-valued.
– Modulation and demodulation can be implemented by means of an
IFFT/FFT. Computational complexity is proportional to NFFT log(NFFT ).
– OFDM is flexibly re-configurable and suitable for multiuser communication.
Different users can be assigned different sets of subcarriers, called orthogo-
nal frequency-division multiple access (OFDMA).
– An all-optical FFT implementation has been proposed in [Hil10].
■ Disadvantages of OFDM:
– Due to the linear superposition of N statistically independent subcarriers,
the quadrature components of the baseband signals s(t ) are nearly Gaus-
sian distributed. The ratio between the peak power of the transmit signal
and the average signal power, called peak-to-average power ratio (PAPR),
is data-dependent and typically quite large. This has pros and cons.
On the one hand, a high PAPR is very useful in optical communications, as
proven in Chapter 3. Signal waveforms with high PAPR outperform wave-
forms with small crest factor with respect to the electrical received power
given the same average optical received power.
On the other hand, hardware limitations offset this beneficial effect.
Solid-state lighting devices are nonlinear in the presence of large signal vari-
ations. The nonlinearity causes: (i) clipping noise, (ii) an unequal spacing be-
tween intensity levels (making symbol decisions more susceptible to noise),
and (iii) a loss of orthogonality. This creates out-of-band emission and degrades
the bit error performance. Furthermore, LEDs and laser diodes are
peak-intensity constrained. Also, the relative radiant power of an LED de-
clines when plotted versus the forward current. Last but not least, the effi-
ciency of the driver hardware drops with increasing current range. Therefore,
OFDM is not as hardware friendly as other modulation schemes.
Orthogonality is destroyed if the output current of the photodetector is not
exactly proportional to the forward current of the solid-state light source. The
solid-state light source regularly is the bottleneck. Different techniques to
compensate for the induced nonlinearity distortions are presented and an-
alyzed in [Mes12], including FFT preprocessing, iterative signal clipping,
and channel coding. Also, predistortion is helpful in order to linearize the
dynamic range of the light source by attempting to invert the nonlinearity.
A problem with non-adaptive predistortion is that the nonlinear behavior of
optical sources is subject to change by several factors, one of which is the
temperature of the transmitter. Dynamic feedback is needed to modify the
model of the instantaneous nonlinear transfer function of the emitter [Nos16].
This makes the transmitter design more complex. Biasing is a low-cost al-
ternative to predistortion. In the case of biasing, for example by means of a
bias-T (to be introduced in Chapter 9), the light source is operated around a
certain operating point. Consequently, the useful modulation swing is smaller than
for two-level modulation schemes. This has a positive effect on speed, but
a negative impact on the signal-to-noise ratio. PAPR reduction techniques
supplement predistortion and biasing. Hadamard matrices or DFT processing
can be used as precoders in MCM systems to decrease the PAPR
[Xia12, Wu14]. Among several other strategies, pilot-assisted PAPR
reduction is also possible [Pop14].
– Orthogonality also gets lost when the channel impulse response exceeds
the length of the cyclic prefix. In order to tackle severe delay spread, the
cyclic prefix (and therefore the OFDM frame duration) should be increased.
If the impulse response still exceeds the cyclic prefix, it can be shortened
by means of an adaptive channel-shortening receive filter. However, this
complicates receiver design. In contrast to wireless radio, Doppler spread is
no obstacle in optical IM/DD transmission.
Coherent optical OFDM (which is applicable only in connection with laser
diodes) additionally suffers from frequency offset and phase jitter.
Electrical-to-optical up-conversion and optical-to-electrical down-conversion require
lasers exhibiting a very narrow linewidth.
– If the number N of subcarriers is not sufficiently large, out-of-band radia-
tion/illumination occurs, unless filtering is implemented. Auxiliary filtering is
simplified, if upper and lower subcarriers are deactivated.
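The PAPR mentioned in the list above is straightforward to evaluate for a given frame. The following sketch computes it for critically sampled complex-valued baseband frames; the frame length N = 64, the random 4-QAM data, and the function names are hypothetical choices. Note that the PAPR of the continuous-time signal is in general somewhat larger than the value obtained from critically sampled frames.

```python
import cmath
import math
import random

def ofdm_frame(x):
    """Naive N-point inverse DFT of the data symbols x_n."""
    n = len(x)
    return [sum(x[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

def papr_db(s):
    """Peak-to-average power ratio of the samples s in dB."""
    powers = [abs(v) ** 2 for v in s]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

N = 64
# Worst case: identical symbols on all subcarriers add up coherently at m = 0.
worst = papr_db(ofdm_frame([1 + 0j] * N))

# A frame with random 4-QAM symbols (fixed seed for repeatability)
random.seed(0)
qam = [complex(random.choice((-1, 1)), random.choice((-1, 1))) for _ in range(N)]
typical = papr_db(ofdm_frame(qam))
```

The worst case equals 10 log10(N) dB, i.e., about 18 dB for N = 64, whereas a frame with random data stays below this bound.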

OFDM is popular in many wireless and wireline transmission systems, including coherent
optical systems [Arm09]. The combination of OFDM with channel coding is called coded
OFDM (COFDM). Alternatively, prior to OFDM modulation one may conduct spread-
spectrum modulation (CDMA). Afterwards, the chips are modulated onto the subcarriers.
This technique is known as multi-carrier CDMA [Han06].

4.5.2 Unipolar OFDM Versions: DMT, DCO-OFDM, PAM-DMT,


There are two problems of OFDM in conjunction with intensity modulation: the baseband
signal is complex-valued and bipolar. Hence, OFDM is not suitable for IM/DD. However,
several modifications exist making multi-carrier modulation an interesting option for VLC
and FSO communications. Five modifications will be introduced next. These versions
make use of symmetry properties.
Discrete multitone transmission (DMT) is a baseband version of OFDM. DMT is applied
in cable-based data transmission systems like DSL, but also in fiber-based and fiber-less
optical communications. The key idea is simple. Starting off from

\[
s_m[k]\big|_{\mathrm{OFDM}} = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n[k] \exp\big(j 2\pi n m/N\big), \quad E\{|x_n[k]|^2\} = 1, \quad m \in \{0, 1, \dots, N-1\}, \tag{4.50}
\]

the data symbols of the negative subcarriers are substituted by the complex conjugate data
symbols of the positive subcarriers (i.e., x N −n [k] = x n∗ [k] ∀ n ∈ {1, 2, . . . , N /2 − 1}). The DC
subcarrier (n = 0) and the Nyquist tone (n = N /2) are not used. Due to the Hermitian
symmetry x N −n [k] = x n∗ [k], the transmit signal is real-valued for all time indices k, even if
the data symbols are complex-valued, because

\[
x_n[k] \exp\big(j 2\pi n m/N\big) + x_n^*[k] \exp\big(-j 2\pi n m/N\big) = 2\,\mathrm{Re}\big\{x_n[k] \exp\big(j 2\pi n m/N\big)\big\} \tag{4.51}
\]

and therefore

\[
s_m[k]\big|_{\mathrm{DMT}} = \frac{2}{\sqrt{N-2}} \sum_{n=1}^{N/2-1} \mathrm{Re}\big\{x_n[k] \exp\big(j 2\pi n m/N\big)\big\} \in \mathbb{R}, \quad m \in \{0, 1, \dots, N-1\}. \tag{4.52}
\]

Hermitian symmetry makes sense if N ≥ 4. In (4.51) the periodicity of the complex phasor,
exp( j 2π(N − n)m/N ) = exp(− j 2πnm/N ), is exploited. The term N − 2 in the denomi-
nator of the normalization factor takes into account that two (out of the N ) subcarriers are
always deactivated. In the absence of the DC subcarrier, the transmit signal is DC free. This
simplifies hardware effort, but does not avoid negative amplitudes. At the receiver side, the
data symbols can be recovered by means of a DFT:
\[
x_n[k] = \frac{\sqrt{N-2}}{N} \sum_{m=0}^{N-1} s_m[k]\big|_{\mathrm{DMT}} \exp\big(-j 2\pi n m/N\big), \quad n \in \{1, 2, \dots, N/2-1\}. \tag{4.53}
\]

Since all negative subcarriers are redundant and two subcarriers are vacant, the bandwidth
efficiency of DMT is reduced by about a factor of two compared to OFDM. In order to trans-
mit N /2 − 1 data symbols, DMT applies an N point DFT/FFT. In other words, in order to
transmit N − 2 data symbols, a 2N point DFT/FFT is necessary. The Hermitian symme-
try constraint x N −n [k] = x n∗ [k] therefore is a bottleneck both with respect to bandwidth
efficiency and DFT/FFT complexity, but avoids the extra complexity of a quadrature modulator.
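The effect of the Hermitian symmetry constraint can be verified with a small numerical experiment. The sketch below builds a DMT frame for N = 8 with the normalization 1/√(N − 2), checks that the imaginary part vanishes, and recovers the data symbols with an N point DFT. The function names and the three 4-QAM symbols are hypothetical, and a naive DFT replaces the FFT to keep the example dependency-free.

```python
import cmath
import math

def dmt_modulate(data, n):
    """Map N/2 - 1 data symbols onto n subcarriers with Hermitian symmetry."""
    X = [0j] * n                      # DC (index 0) and Nyquist tone (index n/2) stay empty
    for i, xn in enumerate(data, start=1):
        X[i] = xn
        X[n - i] = xn.conjugate()     # x_{N-n} = x_n^*
    scale = 1 / math.sqrt(n - 2)      # N - 2 active subcarriers
    return [scale * sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

def dmt_demodulate(s, n):
    """Recover the data symbols on subcarriers 1, ..., N/2 - 1 by means of a DFT."""
    scale = math.sqrt(n - 2) / n
    return [scale * sum(s[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(1, n // 2)]

N = 8
data = [1+1j, -1+1j, 1-1j]            # N/2 - 1 = 3 complex-valued data symbols
s = dmt_modulate(data, N)
s_real = [v.real for v in s]          # imaginary parts vanish up to rounding errors
data_hat = dmt_demodulate(s_real, N)
```

The frame is real-valued and DC free, but it still contains negative samples, which is why a further step is needed for intensity modulation.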
In IM/DD systems, a second constraint exists: the transmit signal must be non-negative at
all time indices k. Several solutions exist. The simplest workaround would be to add a pos-
itive bias term onto the time-domain output samples of a DMT modulator. This technique
is dubbed DC-biased optical OFDM (DCO-OFDM) [Car96]:
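\[
s_m[k]\big|_{\mathrm{DCO\text{-}OFDM}} = \frac{1}{\sqrt{1+\beta^2}}\, \mathrm{clip}\Big(\beta + s_m[k]\big|_{\mathrm{DMT}}\Big), \quad m \in \{0, 1, \dots, N-1\}, \tag{4.54}
\]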

where clip(x) := x for x ≥ 0 and zero else. Effectively, the bias term corresponds to a real-
valued DC component. The DC offset term β ∈ R+ must be large enough so that the proba-
bility of negative amplitudes is negligible. An optimization of the bias term has been inves-
tigated in [Dim13, Zha14, Lin16]. If the offset term is too small, excessive clipping would
occur, causing an error floor. Vice versa, if the bias is too large, the data communication
part becomes power inefficient because the power spent for the bias term is not available
for data detection. The normalization factor $1/\sqrt{1 + \beta^2}$ in (4.54) is not exact if clipping is
frequent. When choosing $\beta = \max\big(0, -\min(s_0[k], s_1[k], \dots, s_{N-1}[k])\big)$, clipping is avoided
completely. The DC offset term is ignored at the receiver side, e.g. by means of AC coupling.
Whenever illumination is essential, the power inefficiency may be justified for some
VLC applications. Still, when overall energy efficiency is required, an alternative solution
is necessary. Another disadvantage of DCO-OFDM is the fact that the optimum bias β de-
pends on the modulation scheme. Particularly for adaptive modulation schemes with dif-
ferent cardinalities per subcarrier an optimization is difficult. In computing the optimized
subcarrier power allocation for DCO-OFDM, the water-filling equations (4.46) cannot be
used directly, since the clipping noise on each subcarrier depends on the power of all the
subcarriers. A proper solution is presented in [Bar12].
As an alternative to adding the (possibly adaptive) bias β during signal processing (“software
bias”), a real-valued bipolar DMT signal may be fed to a bias-T. In the latter case, the
DC offset is typically fixed and therefore not optimized with respect to data detection. The
big advantage of the latter case, however, is that the “hardware bias” supports illumination
nearly without loss of power efficiency.
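The bias-and-clip operation of (4.54) and the clipping-free choice of β can be illustrated as follows. This is a minimal sketch: the bipolar, DC-free sample values in `s_dmt` are hypothetical, the receiver is idealized (noise-free, DC removal by subtracting the mean as a stand-in for AC coupling), and all function names are illustrative.

```python
import math

def dco_ofdm(s_dmt, beta):
    """Add the bias beta, clip negative samples, and normalize, cf. (4.54)."""
    scale = 1 / math.sqrt(1 + beta ** 2)
    return [scale * max(beta + sm, 0.0) for sm in s_dmt]

def clipping_free_bias(s_dmt):
    """Smallest bias that avoids clipping for this particular frame."""
    return max(0.0, -min(s_dmt))

def bias_loss_db(beta):
    """SNR loss due to the DC bias, 10 log10(1 + beta^2) dB."""
    return 10 * math.log10(1 + beta ** 2)

# Hypothetical real-valued, DC-free DMT samples
s_dmt = [0.9, -1.4, 0.3, 1.1, -0.2, -0.7, 0.0, 0.0]

beta = clipping_free_bias(s_dmt)          # 1.4 for this frame
tx = dco_ofdm(s_dmt, beta)                # non-negative transmit samples

# Idealized receiver: remove the DC component and undo the normalization
mean = sum(tx) / len(tx)
rx = [math.sqrt(1 + beta ** 2) * (v - mean) for v in tx]

# A smaller bias clips some samples and therefore causes clipping noise
n_clipped = sum(1 for v in dco_ofdm(s_dmt, 0.5) if v == 0.0)
```

For β = 2.0, the bias loss evaluates to about 7 dB, which illustrates why the bias should not be chosen larger than necessary.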
In Fig. 4.23, the BER of DCO-OFDM on the AWGN channel is depicted as a function of the
bias β. A fixed bias has been optimized for each SNR value. At a BER of $10^{-6}$, the optimum β values
are about 2.0, 3.0, 3.75 for 4-QAM, 16-QAM, and 64-QAM, respectively. In these numerical
results, N /2 − 1 = 127 out of N = 256 subcarriers are effectively modulated with complex-
valued data symbols. A cyclic prefix is not implemented here, because there is no benefit
on the AWGN channel. The SNR loss due to the DC bias is $10 \log_{10}(1 + \beta^2)$ dB. Additionally,
clipping noise degrades the BER performance (about 1 dB at a BER of $10^{-6}$). The total loss is
the horizontal gap between the curve of interest and the corresponding dotted line. The
bandwidth efficiency of DCO-OFDM is $\frac{N/2-1}{N+\Delta} \log_2 Q$ bit/s/Hz. Consequently, the relationship
between $E_s/N_0$ and $E_b/N_0$ is $E_s/N_0 = \frac{N/2-1}{N+\Delta} \log_2 Q \cdot E_b/N_0$, where $E_s$ is the energy per
sample in time domain.

[Figure: BER from 10^0 down to 10^-6 versus Eb/N0 in dB (left panel) and versus Es/N0 in dB (right panel)]

Figure 4.23 BER of DCO-OFDM vs. SNR per bit (left) and vs. SNR per symbol (right). The
dotted lines hold for β = 0 and suppressed clipping.

Compared to DCO-OFDM, a more elegant solution is to leave the real part of the data sym-
bols empty and to use bipolar PAM (i.e., a real-valued modulation scheme) in the imagi-
nary part: x n [k] := x n,Re [k]+ j x n,Im [k] = j x n,Im [k]. Taking the DMT constraints (Hermitian
symmetry, empty DC subcarrier and Nyquist tone) into account yields
\[
s_m[k]\big|_{\mathrm{DMT}} = -\frac{2}{\sqrt{N-2}} \sum_{n=1}^{N/2-1} x_{n,\mathrm{Im}}[k] \sin(2\pi n m/N), \quad m \in \{0, 1, \dots, N-1\}, \tag{4.55}
\]
where E {(x n,Im [k])2 } = 1. The term N − 2 in the denominator of the normalization factor
takes into account that signal values s 0 [k] and s N /2 [k] are always equal to zero. Since the
sine function is odd, positive and negative signal values occur pairwise. Therefore, negative
signal samples can be clipped without causing any information loss:
\[
s_m[k]\big|_{\mathrm{PAM\text{-}DMT}} = \frac{2\sqrt{2}}{\sqrt{N-2}}\, \mathrm{clip}\Bigg(-\sum_{n=1}^{N/2-1} x_{n,\mathrm{Im}}[k] \sin(2\pi n m/N)\Bigg), \quad m \in \{0, 1, \dots, N-1\}. \tag{4.56}
\]
This version is dubbed PAM-DMT [Lee09]. The term $\sqrt{2}$ in the numerator of the normalization
factor compensates for the power loss due to clipping negative amplitudes. The noise
process, which is caused by clipping, is orthogonal to the transmit signal – it affects only
the real part after FFT processing. Hence, the clipping-noise process can easily and completely
be rejected. The bandwidth efficiency of PAM-DMT is $\frac{N/2-1}{N+\Delta} \log_2 Q$ bit/s/Hz.
Compared to OFDM in conjunction with quadrature amplitude modulation (QAM-OFDM), the
bandwidth efficiency effectively is reduced by a factor of four. This is due to Hermitian
symmetry and real-valued modulation. Assuming cardinality Q in PAM, cardinality $Q^2$ can
be applied in QAM for obtaining a similar power efficiency. Compared to DCO-OFDM, the
spectral efficiency gap is still a factor of two. Concerning power efficiency, however, PAM-
DMT does not suffer from a DC bias.
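That the clipping noise of PAM-DMT falls onto the real part only can be checked numerically. The sketch below generates a PAM-DMT frame for N = 8, clips it, and recovers the data from the imaginary part of the DFT output. The 2-PAM symbols and the function names are hypothetical; the receive scaling compensates for the factor 1/2 by which clipping attenuates the wanted component.

```python
import cmath
import math

def pam_dmt_modulate(x_im, n):
    """Clipped PAM-DMT frame with the normalization 2*sqrt(2)/sqrt(N-2)."""
    scale = 2 * math.sqrt(2) / math.sqrt(n - 2)
    frame = []
    for m in range(n):
        bipolar = -sum(x * math.sin(2 * math.pi * k * m / n)
                       for k, x in enumerate(x_im, start=1))
        frame.append(scale * max(bipolar, 0.0))   # clip negative samples
    return frame

def pam_dmt_demodulate(s, n):
    """Read the data from the imaginary part; the real part holds the clipping noise."""
    scale = math.sqrt(2) * math.sqrt(n - 2) / n
    est, clipping_noise = [], []
    for k in range(1, n // 2):
        Y = sum(s[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        est.append(scale * Y.imag)
        clipping_noise.append(Y.real)
    return est, clipping_noise

N = 8
x_im = [1.0, -1.0, 1.0]          # bipolar 2-PAM on subcarriers 1, 2, 3
s = pam_dmt_modulate(x_im, N)
x_hat, noise_re = pam_dmt_demodulate(s, N)
```

In the absence of channel noise, `x_hat` reproduces the transmitted PAM symbols exactly, although half of the bipolar waveform has been removed.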
In Fig. 4.24, the BER of PAM-DMT on the AWGN channel is plotted for 2-PAM, 4-PAM, 8-PAM,
and 16-PAM as a function of $E_b/N_0$ and $E_s/N_0$, respectively. $E_s$ is the energy per sample
in time domain. The relationship between $E_s/N_0$ and $E_b/N_0$ is
$E_s/N_0 = \frac{N/2-1}{N+\Delta} \log_2 Q \cdot E_b/N_0$.
In these numerical results, N /2 − 1 = 127 out of N = 256 subcarriers are modulated
with real-valued bipolar data symbols. A cyclic prefix is not implemented (∆ = 0). E b /N0
is independent of N , but E s /N0 depends on N if N is small. In the limit of N = 4, a single
modulated subcarrier is mapped on four time samples. If N is sufficiently large, about N /2
modulated subcarriers are mapped on N time samples. The BER of PAM-DMT vs. E s /N0
can be upper bounded as
\[
P_b \le \frac{1}{2}\, \mathrm{erfc}\left(\sqrt{\frac{1}{\alpha_Q} \frac{E_s}{N_0}}\right), \tag{4.57}
\]
where $\alpha_Q = (Q/2)^2 + \alpha_{Q/2}$ has been defined in (4.14). For Q = 2 this bound is exact. The
BER performance of 2-PAM-DMT is the same as NRZ-OOK and 2-ASK in terms of E b /N0 .
For higher orders, Q-ary PAM-DMT outperforms Q-ary ASK given the same cardinality.

[Figure: BER from 10^0 down to 10^-6 versus Eb/N0 in dB (left panel) and versus Es/N0 in dB (right panel)]

Figure 4.24 BER of PAM-DMT vs. SNR per bit (left) and vs. SNR per symbol (right).

Another elegant solution based on Hermitian symmetry is to leave all even subcarriers (in
frequency domain) empty. As a consequence, the negative part of the transmit signal (in
time domain) is redundant, and hence can be clipped without loss of information even in
the absence of a bias term [Arm06]. This version is called asymmetrically clipped optical
OFDM (ACO-OFDM). Odd subcarriers are loaded with complex-valued data. The bandwidth
efficiency of ACO-OFDM is $\frac{N/4}{N+\Delta} \log_2 Q$ bit/s/Hz. Compared to QAM-OFDM, ACO-OFDM
suffers from a 3 dB power loss (because the negative signal is clipped), and a factor
of four in spectral efficiency (since the even subcarriers are not used, and due to the Her-
mitian symmetry constraint).
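The redundancy of the negative signal part in ACO-OFDM can be demonstrated with a short experiment. In the sketch below only the odd subcarriers are loaded (with Hermitian symmetry), the negative samples are clipped, and the receiver reads the odd subcarriers after a DFT. The 1/N normalization, the frame size N = 16, the 4-QAM symbols, and the function names are assumptions made for this illustration and are not taken from the text.

```python
import cmath

def aco_modulate(data, n):
    """Load N/4 data symbols on the odd subcarriers and clip negative samples."""
    X = [0j] * n
    for i, d in enumerate(data):
        k = 2 * i + 1                 # odd subcarrier index
        X[k] = d
        X[n - k] = d.conjugate()      # Hermitian symmetry
    bipolar = [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)).real / n
               for m in range(n)]
    return bipolar, [max(v, 0.0) for v in bipolar]

def aco_demodulate(s, n):
    """Clipping halves the amplitude on the odd subcarriers, hence the factor 2."""
    Y = [sum(s[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
         for k in range(n)]
    return [2 * Y[k] for k in range(1, n // 2, 2)]

N = 16
data = [1+1j, -1+1j, 1-1j, -1-1j]     # N/4 = 4 complex-valued data symbols
bipolar, s = aco_modulate(data, N)
data_hat = aco_demodulate(s, N)
```

The bipolar frame satisfies s[m + N/2] = −s[m], so every clipped sample has a surviving counterpart of opposite sign in the other half of the frame. The clipping distortion therefore appears on the even subcarriers only.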
In Fig. 4.25, the BER of ACO-OFDM on the AWGN channel is shown for 4-QAM, 16-QAM,
and 64-QAM as a function of E b /N0 and E s /N0 , respectively. E s is the energy per sample in
time domain. The relationship between $E_s/N_0$ and $E_b/N_0$ is $E_s/N_0 = \frac{N/4}{N+\Delta} \log_2 Q \cdot E_b/N_0$. In
these numerical results, N /4 = 64 out of N = 256 subcarriers are modulated with complex-
valued data symbols. A cyclic prefix is not implemented, as earlier. The BER of ACO-OFDM
can be upper bounded as
\[
P_b \le \frac{1}{2}\, \mathrm{erfc}\left(\sqrt{\frac{1}{\alpha_{\sqrt{Q}}} \frac{E_s}{N_0}}\right). \tag{4.58}
\]

For Q = 4 this bound is exact. The BER performance of 4-ACO-OFDM is the same as NRZ-
OOK and 2-ASK in terms of E b /N0 .
[Figure: BER from 10^0 down to 10^-6 versus Eb/N0 in dB (left panel) and versus Es/N0 in dB (right panel)]

Figure 4.25 BER of ACO-OFDM vs. SNR per bit (left) and vs. SNR per symbol (right).

Flipped OFDM (Flip-OFDM), proposed in [Fer12], is another solution to the same problem.
Flip-OFDM is based on two consecutive OFDM frames. Firstly, Hermitian symmetry
is applied to produce a real-valued output signal in the time domain. This is the first
OFDM frame. The second OFDM frame is a copied version of the first OFDM frame, but
the signs of all samples of the second frame are inverted in the time domain. Finally, all
negative samples of both frames are clipped in the time domain. Flip-OFDM has the same
power and bandwidth efficiency as ACO-OFDM. Unipolar OFDM (U-OFDM) is identical
to Flip-OFDM and has been published independently in [Tso12]. In order to perform
1-tap equalization, the channel needs to be stable over two OFDM frames.
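The frame construction of Flip-OFDM/U-OFDM is simple enough to be written down directly. In the sketch below, the bipolar input samples are hypothetical, and the receiver-side recombination by subtracting the second frame from the first is an assumption of this example, since the receiver processing is not spelled out above.

```python
def flip_ofdm(bipolar):
    """Return two consecutive unipolar frames built from one real-valued frame."""
    first = [max(v, 0.0) for v in bipolar]     # clipped original frame
    second = [max(-v, 0.0) for v in bipolar]   # clipped sign-inverted copy
    return first + second

def flip_ofdm_recombine(frames):
    """Assumed receiver step: subtract the second frame from the first."""
    n = len(frames) // 2
    return [p - q for p, q in zip(frames[:n], frames[n:])]

# Hypothetical real-valued bipolar samples (output of a DMT modulator)
bipolar = [0.9, -1.4, 0.3, 1.1, -0.2, -0.7, 0.0, 0.0]
tx = flip_ofdm(bipolar)
rx = flip_ofdm_recombine(tx)
```

Two frames are needed per bipolar frame, which is consistent with the halved bandwidth efficiency stated above.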

                          Frequency domain (subcarrier n)                                  Time domain
Scheme            Part    0 (DC)  1      2      3      4 (NT)  5       6       7
DCO-OFDM          Re      β       x1,Re  x2,Re  x3,Re  0       x3,Re   x2,Re   x1,Re       Add DC offset
                  Im      0       x1,Im  x2,Im  x3,Im  0       −x3,Im  −x2,Im  −x1,Im      Clip negative samples
PAM-DMT           Re      0       0      0      0      0       0       0       0           Clip negative samples
                  Im      0       x1,Im  x2,Im  x3,Im  0       −x3,Im  −x2,Im  −x1,Im
ACO-OFDM          Re      0       x1,Re  0      x3,Re  0       x3,Re   0       x1,Re       Clip negative samples
                  Im      0       x1,Im  0      x3,Im  0       −x3,Im  0       −x1,Im
Flip-OFDM/U-OFDM  Re      0       x1,Re  x2,Re  x3,Re  0       x3,Re   x2,Re   x1,Re       Copy 1st frame into 2nd frame
                  Im      0       x1,Im  x2,Im  x3,Im  0       −x3,Im  −x2,Im  −x1,Im      Invert signs of 2nd frame
                                                                                           Clip negative samples

Figure 4.26 Comparison of inherent unipolar OFDM techniques (N = 8 subcarriers, DC: Direct
current subcarrier, NT: Nyquist tone).
PAM-DMT, ACO-OFDM, and Flip-OFDM/U-OFDM belong to the class of inherent unipolar
OFDM techniques [Isl16]. All these schemes have a reduced bandwidth efficiency
compared to DCO-OFDM caused by an additional symmetry constraint besides Hermitian
symmetry, see Fig. 4.26. A performance comparison of DCO-OFDM, PAM-DMT, and
ACO-OFDM is given in [Sch11, Dis13], for example. Given almost the same spectral
efficiency, it is fair to compare Q/2-ary DCO-QAM with Q-ary PAM-DMT and with $Q^2$-ary
ACO-OFDM, respectively. Considering Q = 8 for example, 8-PAM-DMT has a BER
performance similar to that of 64-QAM-ACO on the AWGN channel. 4-QAM-DCO-OFDM is the winner
in this comparison.




[Block diagram; legible labels: “N point”, “NFFT point”]

Figure 4.27 Block diagram of a unipolar OFDM transmitter employing Hermitian symmetry.

Fig. 4.27 extends the OFDM block diagram introduced in Fig. 4.22. The generic recipe to-
wards unipolar transmission is Hermitian symmetry. Additionally, an N point DFT is intro-
duced in Fig. 4.27. This optional DFT improves the PAPR. In the wireless radio community,
the technique is known as single-carrier FDMA (SC-FDMA). SC-FDMA is applied on the
LTE uplink. In the optics community, the technique is called DFT-spread OFDM [Wu14]
or optical single-carrier FDMA (OSC-FDMA) [Mos15]. Not shown in Fig. 4.27 is a predis-
tortion unit to be implemented after parallel/serial (P/S) conversion. Particularly in hybrid
systems combining OWC with OFDM-based radio communications, DMT is an interesting
candidate from a compatibility point of view.

4.5.3 Spectrally-Enhanced Unipolar OFDM: SEE-OFDM,


The goal of spectrally-enhanced unipolar OFDM is to eliminate the bandwidth efficiency
loss of the inherent unipolar OFDM techniques. As mentioned in the previous subsection,
PAM-DMT, ACO-OFDM, and Flip-OFDM/U-OFDM suffer from a factor of two in terms of
spectral efficiency compared to DCO-OFDM. With some spectrally-enhanced OFDM tech-
niques this gap can be closed completely, with others not entirely (due to latency, complex-
ity, and memory constraints). The key idea is to exploit the symmetry constraint, which
makes PAM-DMT/ACO-OFDM/Flip-OFDM/U-OFDM unipolar, either in time domain or
in frequency domain, see [Elg14, Wan15a, Isl15c, Low16, Isl15a, Isl15b].
Several techniques have been proposed to narrow the spectral efficiency gap between
ACO-OFDM and DCO-OFDM. Among these techniques are spectrally and energy effi-
cient OFDM (SEE-OFDM) [Elg14], layered ACO-OFDM (LACO-OFDM) [Wan15a], and
enhanced ACO-OFDM (eACO-OFDM) [Isl15c]. A comparison of these methods has been
presented in [Low16]. For high-order modulation schemes, layered ACO-OFDM performs
best among these alternatives.
Similar to these techniques tailored to ACO-OFDM, in [Isl15a] a method has been sug-
gested to close the spectral gap between U-OFDM and DCO-OFDM. This enhanced U-
OFDM (eU-OFDM) technique has been named generalized enhanced unipolar OFDM.
Along the same lines, in [Isl15b] enhanced PAM-DMT (ePAM-DMT) has been presented.
This improvement fills the spectral gap between PAM-DMT and DCO-OFDM.

4.5.4 Hybrid Schemes: SO-OFDM, RPO-OFDM, ADO-OFDM,


OFDM has been matched to the needs of VLC in many articles. In this section, an overview
of selected hybrid schemes is given [Isl16].
In [Mos15], spatial optical OFDM (SO-OFDM) has been published. The key idea is to trans-
mit groups of OFDM subcarriers via different LEDs. With an increasing number of LEDs,
the PAPR reduces. In the limit when the number of LEDs is equal to the number of sub-
carriers, the PAPR reaches its minimum value of 3 dB, because each subcarrier emits a sine
wave. SO-OFDM is reported to be more robust to LED nonlinearities and outperforms
DCO-OFDM with respect to bit error performance. Additionally, low-PAPR optical single-
carrier FDMA (OSC-FDMA) has been developed in [Mos15], where different collections of
LEDs act as virtual users in a multiple-access scheme.
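The minimum PAPR of 3 dB quoted above for SO-OFDM follows from the peak and average power of a single sine wave. For an LED that emits only one subcarrier, modelled here as $s(t) = A \sin(2\pi f t)$ with an arbitrary amplitude A (bias ignored), one obtains:

```latex
\[
\mathrm{PAPR} = \frac{\max_t s^2(t)}{\overline{s^2(t)}} = \frac{A^2}{A^2/2} = 2
\quad\Longrightarrow\quad 10 \log_{10}(2) \approx 3.01~\text{dB}.
\]
```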
In reverse polarity optical OFDM (RPO-OFDM), unipolar OFDM is combined with PWM
[Elg13]. This permits finer dimming control in VLC applications. RPO-OFDM
is a viable solution, despite the fact that the duty cycle needs to be known at the
receiver side.
The combination of ACO-OFDM on odd subcarriers and DCO-OFDM on even subcarri-
ers has been called asymmetrically DC-biased optical OFDM (ADO-OFDM) [Dis13]. The
clipping noise of the ACO-OFDM subsignal affects only the even subcarriers. The receiver
tries to cancel this component before estimating the DCO-OFDM subsignal. Given an optimized
power allocation, ADO-OFDM has a better power efficiency than ACO-OFDM and DCO-OFDM.
Hybrid asymmetrically clipped optical OFDM (HACO-OFDM) uses ACO-OFDM on the
odd subcarriers and PAM-DMT on the even subcarriers to improve the bandwidth effi-
ciencies of ACO-OFDM and PAM-DMT, respectively [Ran14]. The clipping performed in
ACO-OFDM distorts only the even subcarriers. Like in ADO-OFDM, the ACO-OFDM sub-
signal is detected first at the receiver side. Afterwards, the PAM signal is detected. Power
allocation is useful to make the BER of both schemes the same [Wan14b].
In polar OFDM (P-OFDM), the complex-valued OFDM output is converted into polar co-
ordinates [Elg15]. The radial and angular coordinates are sent in the first and second halves
of one OFDM frame (in time domain). Hermitian symmetry is avoided. If only the even
subcarriers are modulated with Q-QAM symbols, the first half of the complex-valued time-
domain signal is identical with the second half. Consequently, it is sufficient to transmit
the first half only. Therefore, the spectral efficiency is identical with DCO-OFDM. It has
been reported that P-OFDM has a better BER performance than ACO-OFDM.
In [Wu15], asymmetrically and symmetrically clipping optical OFDM (ASCO-OFDM) has
been suggested. In this modulator, ACO-OFDM is combined with symmetrical clipping
optical OFDM (SCO-OFDM). ACO-OFDM uses the odd subcarriers, SCO-OFDM the even
subcarriers. The bandwidth efficiency of ASCO-OFDM is 75 % of that of DCO-OFDM.

4.5.5 Carrierless OFDM (cOFDM)

In the IM/DD literature, it seems to be common understanding that complex-valued
OFDM needs to be applied either in connection with Hermitian symmetry in order to
obtain a real-valued transmit signal, or in conjunction with a quadrature modulator (i.e.,
an I/Q mixer) operating at radio frequency (RF). In this section, we compare the Her-
mitian symmetry solution with I/Q mixing. Although from an analytic perspective I/Q
mixing inherently produces Hermitian symmetry, there are differences with respect to
computational complexity. The computational differences motivate a closer investigation.
Quadrature modulation requires two balanced DACs plus a quadrature modulator. Furthermore,
this solution is subject to a frequency offset if conducted in the analog domain
(apart from the fact that radio frequencies are not suitable in the application of interest). A
frequency offset destroys orthogonality.

[Figure: three spectra versus frequency f: baseband spectrum S(f) between −1/(2T) and 1/(2T) (top), bandpass spectrum S_BP(f) around ±f0 for f0 > 1/(2T) (middle) and for f0 = 1/(2T) (bottom)]

Figure 4.28 Top part: Baseband signal. Middle part: Bandpass signal after I/Q mixing with
f 0 > 1/(2T ). Bottom part: Bandpass signal after I/Q mixing with f 0 = 1/(2T ).

Consider a bandlimited complex-valued baseband signal with spectrum S( f ) (if the signal
is deterministic) or power spectral density Φss ( f ) (if the signal is stochastic), see the top
part of Fig. 4.28. It is well known that for real-valued signals the constraint S ∗ (− f ) = S( f )
and accordingly Φ∗ss (− f ) = Φss ( f ) applies. This is the so-called Hermitian symmetry. I/Q
mixing with arbitrary (but sufficiently large) carrier frequency f 0 inherently produces Her-
mitian symmetry, see the middle part of Fig. 4.28. Note that real-valued signals always have
positive and negative frequencies.
Given the fact that the double-sided bandwidth of fully-loaded OFDM extends from
−1/(2T) to +1/(2T) if the number of subcarriers is sufficiently large so that out-of-band
illumination is vanishingly small, cf. Fig. 4.17, the smallest possible carrier frequency that
avoids aliasing is f0 = 1/(2T). This situation is depicted in the bottom part of Fig. 4.28. If
the known symbol duration T is adjusted precisely, neither frequency distortion nor phase
jitter occurs at the transmitter side or at the receiver side. We dub this purely digital
solution carrierless OFDM (cOFDM). This nomenclature is in analogy to carrierless amplitude
and phase (CAP) modulation, cf. Section 4.3.6, but should not be mixed up with coded
OFDM.
We start off from a conventional, complex-valued OFDM baseband signal in the digital do-
main, followed by a digital I/Q mixer operating at baseband frequency f 0 = 1/(2T ), assum-
ing that the number of subcarriers is sufficiently large. Otherwise aliasing occurs. A simi-
lar concept has recently been revealed in [Wan16] for IM/DD use cases, but the baseband
signal has been digitally up-converted to an RF carrier in order to generate a real-valued
transmit signal. The RF carrier frequency f 0 has not been specified in [Wan16]. Besides
the preferential design of f 0 = 1/(2T ), any larger carrier frequency is allowed. This would
relax the aliasing problem, but would add to computational complexity and would shift the
spectrum out of the range of interest. For this reason we fix f 0 = 1/(2T ) in the numerical
results. The influence of the number of subcarriers is studied subsequently. All cOFDM
versions are suitable with or without cyclic prefix.

Figure 4.29 Block diagram of carrierless OFDM (cOFDM) transmitter in Version 1
(N0 := NFFT − N − 1).

Three different versions are investigated. A block diagram of the most intuitive solution,
called Version 1, is shown in Fig. 4.29. Consider N + 1 complex-valued data symbols, typ-
ically Q-ary QAM symbols. For convenience, N is assumed to be an even number (N /2
subcarriers are centered around the DC carrier), although this is not a fundamental restric-
tion. The length of the cyclic prefix is denoted as ∆. First, an OFDM signal is generated
by means of an NFFT -point IFFT, where NFFT > N + 1. The remaining NFFT − N − 1 sub-
carriers are filled with zeros, known as zero padding in frequency domain. Due to sub-
sequent I/Q conversion, no Hermitian symmetry constraint is necessary. At this point it
is irrelevant whether the OFDM outputs are unipolar or bipolar. For proper operation of
the digital quadrature modulator, the OFDM signal must be oversampled. Towards this
goal, in Version 1 the complex-valued IFFT output signals are fed into two parallel digital
interpolators. Let the oversampling factor be denoted as J . Given f 0 = 1/(2T ), two-times
oversampling is sufficient (J = 2). In the numerical results, the interpolators are based on
an FIR filter with root-raised-cosine characteristics. Given a roll-off factor r , the FIR fil-
ter is frequency-flat over a double-sided bandwidth of (1 − r )/T . It is worth mentioning
that oversampling does not affect the spectral efficiency. But, in order to realize the digi-
tal interpolators (and the lowpass filters in the corresponding quadrature demodulator at
the receiver side), (N + 1)/NFFT ≤ 1 − r holds. Consequently, the bandwidth efficiency is
$\frac{N+1}{2(N_{\mathrm{FFT}}+\Delta)} \log_2 Q$ bit/s/Hz. (The factor of two in the denominator accounts for the negative
frequencies.) From this point of view, r should be as small as possible. In the numerical
results, r = 0.1. Afterwards, the oversampled signal is up-converted by a digital I/Q mixer.
Note that cos(2π f 0 t ) = cos(πn/J ) if t = nT /J and f 0 = 1/(2T ). Complexity-wise it is inter-
esting to see that cos(πn/2) ∈ {0, +1, −1}, where n is the time index. The same applies to
sin(πn/2). In other words, no floating-point multiplications need to be performed if J = 2.
The multiplication with the orthogonal sequences [+1, 0, −1, 0, +1, 0, . . . ] and accordingly
[0, −1, 0, +1, 0, −1, . . . ] followed by linear superposition can be interpreted as code-division
multiplexing. The quadrature modulator delivers a real-valued waveform, as desired. If no
additional constraint is considered in OFDM processing, the output waveform is bipolar.
By means of bipolar-to-unipolar (bip2uni) conversion, e.g. DC biasing (like in DCO-OFDM)
and/or clipping (like in PAM-DMT and ACO-OFDM), a unipolar signal is finally obtained.
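The multiplication-free mixing for J = 2 can be illustrated by a minimal sketch. The function names `iq_mix_j2` and `iq_mix_reference` are hypothetical, the amplitude scaling of the carrier sequences is omitted, and the input is assumed to be an already oversampled complex-valued sequence.

```python
import numpy as np

def iq_mix_j2(x):
    """I/Q mixing at f0 = 1/(2T) for J = 2 using only selection and sign flips.

    Implements s[n] = Re{x[n]} cos(pi n/2) - Im{x[n]} sin(pi n/2)
    without any floating-point multiplication (scaling omitted, an assumption).
    """
    x = np.asarray(x, dtype=complex)
    s = np.empty(len(x))
    s[0::4] = x.real[0::4]      # cos = +1, sin =  0
    s[1::4] = -x.imag[1::4]     # cos =  0, sin = +1
    s[2::4] = -x.real[2::4]     # cos = -1, sin =  0
    s[3::4] = x.imag[3::4]      # cos =  0, sin = -1
    return s

def iq_mix_reference(x, J=2):
    """Reference mixer with explicit carrier sequences cos(pi n/J), sin(pi n/J)."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x))
    return x.real * np.cos(np.pi * n / J) - x.imag * np.sin(np.pi * n / J)

rng = np.random.default_rng(0)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
print(np.allclose(iq_mix_j2(x), iq_mix_reference(x)))
```

Even-indexed output samples carry the in-phase component and odd-indexed samples the quadrature component, which is the code-division interpretation mentioned above.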

Figure 4.30 Block diagram of carrierless OFDM (cOFDM) transmitter in Version 2
(N1 := J · NFFT − N − 1).

Starting off with Version 1, the simplified solution depicted in Fig. 4.30 is derived, called
Version 2. In this solution, the NFFT -point IFFT is replaced by a J · NFFT -point IFFT. The
N + 1 data symbols are arranged as in Version 1, but (J − 1) NFFT additional zeros are in-
serted in the frequency domain. Therefore, J -times oversampling is inherently conducted
by IFFT processing. The interpolators introduced in Version 1 are obsolete. It is worth
mentioning that oversampling neither decreases spectral efficiency nor increases computational
complexity, since IFFT processing can be conducted by the Goertzel algorithm in
the presence of many zeros in the frequency domain. The remaining operations are the
same as in Version 1.
Motivated by Version 2, a further complexity reduction is suggested in Fig. 4.31, called Ver-
sion 3. Here, the main idea is to perform the frequency shift in the (zero-padded) frequency
domain before IFFT processing takes place. The subcarriers are cyclically shifted by NFFT/2
positions to the right, which corresponds to a frequency shift of 1/(2T). As a result, the digital
I/Q mixer is obsolete. Having said this, the constraint (N + 1)/NFFT ≤ 1 − r does not limit
spectral efficiency any more. Bandwidth efficiency is $\frac{N/2-1}{N_{\mathrm{FFT}}+\Delta} \log_2 Q$ bit/s/Hz. The subcarrier
allocation is illustrated in Fig. 4.32 for a specific example. The frequency shift does not
cause any bandwidth extension, as can be seen in Fig. 4.32 when comparing Version 3 with
Version 1. Version 3 is faster than Version 2, mainly because the imaginary component does
not need to be computed. The BER performance of Version 2 and Version 3 is identical.

Figure 4.31 Block diagram of carrierless OFDM (cOFDM) transmitter in Version 3
(N2 := NFFT/2 − N/2, N3 := J · NFFT − (NFFT/2 + N/2 + 1)).
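That Version 2 and Version 3 generate the same waveform can be checked numerically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, power normalization and the cyclic prefix are omitted, N is even, and bipolar-to-unipolar conversion is left out.

```python
import numpy as np

def centered_spectrum(symbols, nfft, J=2):
    """Place N + 1 symbols around DC in a zero-padded vector of length J * nfft."""
    symbols = np.asarray(symbols, dtype=complex)
    N = len(symbols) - 1                       # N is assumed to be even
    M = J * nfft
    X = np.zeros(M, dtype=complex)
    X[: N // 2 + 1] = symbols[: N // 2 + 1]    # DC and positive-frequency bins
    X[M - N // 2 :] = symbols[N // 2 + 1 :]    # negative-frequency bins
    return X

def cofdm_v2(symbols, nfft, J=2):
    """Version 2: oversampling IFFT followed by a digital I/Q mixer."""
    x = np.fft.ifft(centered_spectrum(symbols, nfft, J))
    n = np.arange(len(x))
    return x.real * np.cos(np.pi * n / J) - x.imag * np.sin(np.pi * n / J)

def cofdm_v3(symbols, nfft, J=2):
    """Version 3: cyclic shift by nfft/2 bins, then only the real IFFT output."""
    X = np.roll(centered_spectrum(symbols, nfft, J), nfft // 2)
    return np.fft.ifft(X).real

# Parameters of Fig. 4.32: N + 1 = 13 QPSK symbols, NFFT = 16, J = 2
rng = np.random.default_rng(0)
qpsk = (rng.choice([-1, 1], 13) + 1j * rng.choice([-1, 1], 13)) / np.sqrt(2)
print(np.allclose(cofdm_v2(qpsk, 16), cofdm_v3(qpsk, 16)))
```

The shift by NFFT/2 bins in a J · NFFT-point IFFT multiplies the time-domain signal by exp(jπn/J), so taking the real part reproduces the output of the I/Q mixer.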

Figure 4.32 Subcarrier allocation of all cOFDM versions under investigation given N + 1 = 13,
NFFT = 16, and J = 2. Data symbols are marked by filled circles and horizontal bars, zeros by
empty circles. After I/Q mixing in V1 and V2, the spectrum of V3 is obtained. The classical
Hermitian symmetry solution (V4) is shown as reference.
Panels: Version 1 (before I/Q mixing), subcarrier index n = 0, ..., 15; Version 2 (before I/Q
mixing), n = 0, ..., 31; Version 3 (I/Q mixing is included), n = 0, ..., 31; Version 4 (Hermitian
symmetry, mirrored subcarriers marked by (.)∗), n = 0, ..., 31. For Versions 3 and 4, the
frequency axis is marked with the bias, 1/(2T) and 1/T, and the occupied bandwidth is
B = (N + 1)/NFFT · 1/T.
Finally, in Fig. 4.32 the classical Hermitian symmetry solution is visualized, referred to as
Version 4. The transmit signal generated by Version 4 is identical with the transmit sig-
nal produced by Version 3, if proper power normalization is done. In both cases, only the
real-valued IFFT output signal needs to be computed. Still, computational complexity is
different. In V3 about 50 % of the IFFT input symbols are zero. The Goertzel algorithm is
tailored to this situation.
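The relation between Version 3 and Version 4 can be verified with a short sketch. The function names are hypothetical, and the subcarrier allocation follows Fig. 4.31 and Fig. 4.32 under the assumption that all occupied bins lie strictly between bin 0 and bin J · NFFT/2. In this unnormalized form, the Hermitian-symmetric IFFT output equals twice the real part of the Version 3 IFFT output, which illustrates why a power normalization is needed.

```python
import numpy as np

def v3_spectrum(symbols, nfft, J=2):
    """Version 3 allocation: N2 = nfft/2 - N/2 zeros, N + 1 symbols, then zeros."""
    symbols = np.asarray(symbols, dtype=complex)
    N = len(symbols) - 1                       # N is assumed to be even
    first = nfft // 2 - N // 2                 # index of the first occupied bin
    X = np.zeros(J * nfft, dtype=complex)
    X[first : first + N + 1] = symbols
    return X

def v4_spectrum(symbols, nfft, J=2):
    """Version 4: same allocation plus complex-conjugate mirror (Hermitian symmetry)."""
    X = v3_spectrum(symbols, nfft, J)
    M = len(X)
    k = np.arange(1, M // 2)                   # strictly positive-frequency bins
    X[M - k] = np.conj(X[k])                   # mirrored bins carry the conjugates
    return X

rng = np.random.default_rng(0)
qpsk = (rng.choice([-1, 1], 13) + 1j * rng.choice([-1, 1], 13)) / np.sqrt(2)
s3 = np.fft.ifft(v3_spectrum(qpsk, 16)).real   # only the real part is kept in V3
y4 = np.fft.ifft(v4_spectrum(qpsk, 16))        # real-valued by construction in V4
print(np.allclose(y4.imag, 0.0), np.allclose(y4.real, 2.0 * s3))
```

In this sketch, 13 of the 32 IFFT inputs are nonzero for Version 3 and 26 of 32 for Version 4, which is the source of the complexity difference mentioned above.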
In order to provide a fair BER comparison, the same conventional receiver is applied when
comparing the three versions. The receiver consists of a digital quadrature demodulator
matched to the digital quadrature modulator taken in Versions 1 and 2. The quadrature
components are filtered by two identically constructed square-root Nyquist filters, whose
parameters are matched to the interpolators, i.e., the input values are oversampled by a fac-
tor of J and symbol-rate decimation is performed at the outputs. Bandwidth and roll-off
factor are identical at transmitter and receiver side. Interpolators and receiver filters are re-
alized with (5 + 1 + 5)J + 1 coefficients each (5 precursors, 1 main pulse, 5 postcursors). The
quadrature demodulator is followed by NFFT -point FFT processing. As transmission via
the AWGN channel is considered, no cyclic prefix (∆ = 0) is implemented. All active sub-
carriers are QPSK modulated with equal power allocation. Bipolar-to-unipolar conversion
is achieved by adding a fixed DC bias β = 2 followed by zero-level clipping. The clipping
causes some out-of-band radiation. In all numerical results BER is plotted versus E b /N0 in
the electrical domain.
Fig. 4.33 verifies that up to N + 1 = NFFT (1 − r) subcarriers can be supported if NFFT is sufficiently
large. Given r = 0.1 and NFFT = 256, the bound is N + 1 = 229 subcarriers (the largest odd number not exceeding 0.9 · 256 = 230.4, since N is even). If fewer
subcarriers are active, the BER performance is the same. In the opposite case, the BER per-
formance degrades. Degradation is less for Versions 2 and 3. In the case of Version 3, the
BER performance can be improved by replacing the conventional receiver (that has been
chosen for a fair comparison) by an FFT receiver matched to the corresponding IFFT trans-
mitter. In this case, there is no performance degradation as long as N < NFFT , because there
is no external digital filtering involved. The same applies to Version 4.
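The subcarrier bound used above is simple arithmetic and can be reproduced as follows. The function name is hypothetical, and the rounding to an odd number reflects the assumption that N is even.

```python
import math

def max_active_subcarriers(nfft, rolloff):
    """Largest odd N + 1 that satisfies (N + 1) / nfft <= 1 - rolloff."""
    limit = math.floor(nfft * (1.0 - rolloff))
    return limit if limit % 2 == 1 else limit - 1

print(max_active_subcarriers(256, 0.1))   # 229, the bound quoted for Fig. 4.33
```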

Figure 4.33 BER vs. SNR per bit for cOFDM Version 1 (left) and Version 2 (right) given
NFFT = 256 and J = 2 as a function of N + 1 subcarriers. Straight lines are for unipolar outputs
(β = 2), dotted lines represent bipolar outputs. For the conventional receiver under investigation,
the BER performance of V3 is the same as for V2. For an optimum receiver, the BER
performance of V3 does not degrade as long as N < NFFT.
Axes: BER (10^0 down to 10^−6) versus Eb/N0 in dB (0 to 22). Curves: N + 1 = 227, 229, 231, 233.
Annotation in both panels: 10 log10(1 + β²) dB.
But how about the minimum number NFFT of (I)FFT points in V1 and V2? This question is
answered in Fig. 4.34, where the BER is depicted as a function of NFFT . The ratio N /NFFT
is kept constant in this set of simulations. For NFFT ≥ 64, no performance impairment
is observed. This time, degradation is less for Version 1. In summary, given J = 2 and
f 0 = 1/(2T ), cOFDM versions V1 and V2 are suitable for NFFT ≥ 64 (I)FFT points, in which
case up to N + 1 = NFFT (1 − r) subcarriers can be modulated with complex-valued data symbols.
For V3, the limit is N + 1 = NFFT. There are no constraints on how these active subcarriers
are loaded. The winner is Version 3, which delivers the same transmit signal as the classical
Hermitian symmetry solution (V4), but at lower complexity. Although focus has been on
IM/DD in this contribution, the concept is suitable for coherent systems as well. A gener-
alization to orthogonal frequency-division multiple access (OFDMA) is straightforward.

Figure 4.34 BER vs. SNR per bit for cOFDM Version 1 (left) and Version 2 (right) given J = 2 as
a function of NFFT and N + 1. Straight lines are for unipolar outputs (β = 2), dotted lines represent
bipolar outputs.
Axes: BER (10^0 down to 10^−6) versus Eb/N0 in dB (0 to 22). Curves: (NFFT, N + 1) = (8, 7),
(16, 13), (32, 25), (64, 49), (128, 97), (256, 193).

4.5.6 Non-DFT-Based Multi-Carrier Modulation: DHT, WPDM, HCM

Most research on MCM schemes is based on DFT processing. However, other transforms
have been studied as well. In this subsection, we give an overview on non-DFT-based MCM
techniques that have been investigated in conjunction with IM/DD. The main motivation
is improved robustness when taking nonlinear distortions and/or delay spread into ac-
count, although significant bandwidth and/or power efficiency improvements compared
to DFT-based MCM techniques cannot be expected. Another motivation is computational
complexity.
In [Mor10, Zho14] (and even before in the area of DSL), the IDFT introduced in (4.38) has
been replaced by a discrete Hartley transform (DHT) for IM/DD applications:

$$ s_m[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n[k] \big( \cos(2\pi nm/N) + \sin(2\pi nm/N) \big), \quad m \in \{0, 1, \dots, N-1\}. \tag{4.59} $$

The same operation is used at the receiver side in order to recover the data, since IDHT
and DHT are identical (“self-inverse property”). Fast versions of the DHT (called FHT)
have about the same computational complexity as the FFT, hence there is no complexity
difference. The main difference compared to DFT processing is that DHT processing deliv-
ers real-valued samples s m [k], as long as the data symbols x n [k] are real-valued (i.e., one-
dimensional), like in Q-ary ASK modulation. Consequently, no Hermitian symmetry is re-
quired. This fact, however, does not improve spectral efficiency compared to DCO-OFDM
and ACO-OFDM, because in the latter case two-dimensional signaling is possible. Accord-
ing to (4.59), the DHT waveform is bipolar. This problem can be either solved by a positive
bias (such as in DCO-OFDM) or a symmetry constraint (like in ACO-OFDM) [Mor10]. Also,
a cyclic extension can be used. As a matter of fact, on the AWGN channel spectral and
power efficiencies are proven to be identical for DFT-based MCM and DHT-based MCM.
This result can be explained by the fact that in both cases the kernel of the transform is
sinusoidal. Consequently, on dispersive or on nonlinear channels the same conclusion is
to be expected.
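The self-inverse property and the real-valued output of the DHT in (4.59) can be checked with a minimal sketch. Computing the DHT from an FFT is a convenience chosen here for illustration (a dedicated FHT would be used in practice), and the function name `dht` is hypothetical.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform with 1/sqrt(N) scaling, cf. (4.59).

    With F = FFT(x), Re{F} is the cosine sum and -Im{F} the sine sum,
    so the cos + sin kernel is obtained as Re{F} - Im{F}.
    """
    x = np.asarray(x, dtype=float)
    F = np.fft.fft(x)
    return (F.real - F.imag) / np.sqrt(len(x))

# Real-valued 4-ASK symbols on N = 8 subcarriers
rng = np.random.default_rng(0)
x = rng.choice([-3.0, -1.0, 1.0, 3.0], size=8)
s = dht(x)                       # real-valued transmit samples
print(np.allclose(dht(s), x))    # applying the same transform recovers the data
```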
The situation changes with other kernels, however. In [Hua15], finite-length wavelet packet
functions are chosen as orthogonal basis functions. Similar to an OFDM signal in complex
baseband notation where we have (cf. 4.26)

$$ s(t) = \frac{1}{\sqrt{N}} \sum_{k} \sum_{n=0}^{N-1} x_n[k] \cdot g_n(t - kT_u), \quad T_u = NT, \tag{4.60} $$

in wavelet packet division multiplexing (WPDM) the baseband signal can be written in
the form

$$ s(t) = \sum_{k} \sum_{l,m} x_{l,m}[k] \cdot \varphi_{l,m}(t - kT_l), \quad T_l = 2^l T, \tag{4.61} $$

where l and m are the tree level and tree position of symbol x l ,m [k], respectively. The
wavelet packet functions φl ,m (t ) define the orthogonal basis. In IM/DD applications, it
is preferable to define the WPDM basis in the real-valued domain. The modulation can
be performed via the inverse discrete wavelet packet transform (IDWPT) using quadrature
mirror filters [Hua15], similar to the IFFT in OFDM. It is reported that WPDM outperforms
OFDM in terms of out-of-band illumination, PAPR, robustness to LED nonlinearity, and
channel dispersion. This fact is interesting for VLC, where high optical powers are desired.
In several contributions, DFT processing is replaced by the discrete cosine transform
(DCT). For the same spectral efficiency, on the AWGN channel the same power efficiency
and bit error rate is reported, at a lower computational complexity, however. Therefore,
this version is dubbed fast OFDM (FOFDM) [Zho15]. FOFDM can be implemented in connection
with a DC bias (DCO-FOFDM), or in conjunction with ACO-OFDM (called ACO-FOFDM)
[Zho15], among other variations. The one-dimensional inverse DCT (IDCT),
applied at the transmitter side, is commonly defined as

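$$ s_m[k] = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} W_n \, x_n[k] \cos\left( \frac{\pi (2m+1) n}{2N} \right), \quad m \in \{0, 1, \dots, N-1\}, \tag{4.62} $$

where
$$ W_n := \begin{cases} 1/\sqrt{2} & \text{for } n = 0 \\ 1 & \text{for } n \in \{1, 2, \dots, N-1\}. \end{cases} $$
The data symbols x_n[k] are real-valued. The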
subcarriers are mutually orthogonal. A possible generalization in the sense of faster-than-
Nyquist signaling is

$$ s_m[k] = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} W_n \, x_n[k] \cos\left( \gamma \, \frac{\pi (2m+1) n}{2N} \right), \quad m \in \{0, 1, \dots, N-1\}, \tag{4.63} $$

where γ < 1 is the bandwidth compression factor. The double-sided bandwidth is γ/T ,
i.e. smaller than 1/T . The price to pay is a loss of orthogonality. Consequently, the sub-
carrier data cannot be detected independently, but some form of interference cancellation
is necessary. This does not just add to receiver complexity, but also complicates the ap-
plication of high-order modulation schemes. The same concept is applicable to OFDM
as well, known as optical spectrally efficient frequency division multiplexing (O-SEFDM).
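The loss of orthogonality caused by bandwidth compression can be made visible with a small numerical sketch. The kernel of (4.62) and (4.63) is written as an N × N matrix, where the name `idct_matrix` and the values N = 8 and γ = 0.8 are assumptions chosen for illustration.

```python
import numpy as np

def idct_matrix(N, gamma=1.0):
    """Matrix C with s = C @ x according to (4.62) (gamma = 1) or (4.63) (gamma < 1)."""
    m = np.arange(N)[:, None]                    # time index (rows)
    n = np.arange(N)[None, :]                    # subcarrier index (columns)
    W = np.where(n == 0, 1.0 / np.sqrt(2.0), 1.0)
    return np.sqrt(2.0 / N) * W * np.cos(gamma * np.pi * (2 * m + 1) * n / (2 * N))

C = idct_matrix(8)               # orthogonal subcarriers
Cg = idct_matrix(8, gamma=0.8)   # compressed subcarrier spacing

# Gram matrices: identity for gamma = 1, off-diagonal interference for gamma < 1
print(np.allclose(C.T @ C, np.eye(8)))
print(np.round(Cg.T @ Cg, 2))
```

For γ = 1 the data are recovered by the transposed matrix alone, whereas for γ < 1 the off-diagonal entries of the Gram matrix represent the inter-carrier interference that the receiver has to cancel.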
In [Nos16], Hadamard coded modulation (HCM) has been proposed. HCM makes use of
a binary Hadamard matrix to modulate the data. At time index k, consider a scaled data
sequence x[k] = [0, x 1 [k], . . . , x N −1 [k]]T of length N . The elements of vector x[k] are Q-ary
PAM symbols, where x_n[k] ∈ {0, 1/(Q − 1), 2/(Q − 1), . . . , 1} for n ∈ {0, 1, . . . , N − 1}. The k-th
HCM signal s[k] = [s 0 [k], s 1 [k], . . . , s N −1 [k]]T is generated as

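$$ \mathbf{s}[k] = \mathbf{H}_N \, \mathbf{x}[k] + \bar{\mathbf{H}}_N \, \bar{\mathbf{x}}[k], \tag{4.64} $$

where $\mathbf{H}_N$ is a binary Hadamard matrix of order N and $\bar{\mathbf{H}}_N := \mathbf{1} - \mathbf{H}_N$ its complement. Similarly,
$\bar{\mathbf{x}}[k] := \mathbf{1} - \mathbf{x}[k]$ is the complement of vector x[k]. The computational complexity of the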
fast Walsh-Hadamard transform (FWHT) is on the order of N log N , similar to the FFT. Two
completely different transmitter structures have been disclosed in [Nos16]. The straight-
forward one employs a DAC which is fed by the HCM signal. Given an LED array structure
of arbitrary dimension, this is the only possible solution if each component of the LED ar-
ray cannot be modulated separately. Vice versa, if all components of an N × (Q − 1) LED
array can be modulated independently, a solution is presented which drives each LED ei-
ther in “on” or in “off” mode, given a duty cycle of 50 %. This second transmitter structure
completely avoids nonlinear effects – a unique advantage compared to DFT-based MCM.
However, the maximum possible average power is limited by half of the peak optical power.
An alternative variant of HCM, dubbed DC-reduced HCM (DCR-HCM), has been proposed
to reduce the power consumption by sending (s[k]−min s[k]) instead of s[k]. DCR-HCM is
applicable with the first transmitter structure only. When compared with DCO-OFDM and
ACO-OFDM considering delay spread and nonlinear effects, at higher illumination levels
according to [Nos16] HCM achieves higher performance gains. However, the performance
improvement over RPO-OFDM is minor.
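Equation (4.64) can be tried out with a minimal sketch. Several points are assumptions made for illustration: the binary Hadamard matrix is obtained from a Sylvester-type ±1 Hadamard matrix by replacing −1 with 0, N is a power of two, and the decoder is derived here directly from (4.64) rather than taken from [Nos16]. A plain matrix product is used instead of the FWHT.

```python
import numpy as np

def binary_hadamard(N):
    """Binary (0/1) Sylvester-type Hadamard matrix; N must be a power of two."""
    W = np.array([[1]])
    while W.shape[0] < N:
        W = np.kron(np.array([[1, 1], [1, -1]]), W)
    return (W + 1) // 2

def hcm_modulate(x):
    """HCM signal s = H x + (1 - H)(1 - x) according to (4.64)."""
    x = np.asarray(x, dtype=float)
    H = binary_hadamard(len(x))
    return H @ x + (1 - H) @ (1.0 - x)

def hcm_demodulate(s):
    """Decoder derived from (4.64): x_n = (W^T s)_n / N + 1/2 for n >= 1, x_0 = 0."""
    s = np.asarray(s, dtype=float)
    N = len(s)
    W = 2 * binary_hadamard(N) - 1      # back to the +1/-1 Hadamard matrix
    x = W.T @ s / N + 0.5
    x[0] = 0.0                          # the first element carries no data
    return x

# N = 8, 4-ary PAM levels {0, 1/3, 2/3, 1}, first element fixed to zero
x = np.array([0.0, 1 / 3, 2 / 3, 1.0, 0.0, 1.0, 1 / 3, 0.0])
s = hcm_modulate(x)
print(s.min() >= 0.0, np.allclose(hcm_demodulate(s), x))
```

The modulated vector is non-negative without any additional bias, which is the property exploited for intensity modulation.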

4.6 Code-Division Multiplexing (CDM)

The classical task of multiplexing schemes is to combine multiple data streams of a sin-
gle user before modulation takes place. In code-division multiplexing (CDM), the data
streams are multiplied by layer-specific spreading sequences, where a layer is a synonym
for a data sequence. In this monograph, we interpret CDM as a modulation scheme, rather
than a multiplexing scheme. Emphasis is on optical CDM (OCDM), i.e., all signals are real-
valued and non-negative.
In baseband notation, the CDM transmit signal can be represented in the form

$$ s(t) = \sum_{k} \sum_{n=0}^{N-1} x_n[k] \cdot g_n(t - kT), \tag{4.65} $$
where k is the time index (related to one symbol period), N is the number of superimposed
layers, x n [k] is the k-th data symbol of the n-th layer (n ∈ {0, . . . , N − 1}), g n (t ) is the base-
band pulse of the n-th layer, and T is the symbol period. In binary OCDM, the data symbols
x n [k] are either 0 or 1. The baseband pulses can be written as [Sal89b]

$$ g_n(t) = \sum_{k=0}^{K-1} b_{n,k} \, g_{Tx}(t - kT_c), \tag{4.66} $$

where K = T /Tc is the spreading factor, Tc is the chip period, and bn is called spreading
sequence or signature sequence of the n-th layer. The elements b n,k of the spreading se-
quence bn (k ∈ {0, . . . , K − 1}) are known as chips. For simplicity, subsequently we assume
that the spreading sequence consists of K chips per symbol duration. If the spreading sequence
were longer, we would take consecutive chunks of length K out of the long
sequence. The spreading factor K determines the bandwidth extension. The layer-specific
spreading sequences bn are data-independent. In OCDM, g T x (t ) usually is a rectangular
pulse of duration Tc . The amplitude of the chips is either 0 or α, where α determines the
intensity. The order of the modulation scheme is N /K . In Fig. 4.35, two near-orthogonal
signature sequences g 0 (t ) and g 1 (t ) taken from [Sal89a] are shown. These sequences are of
length K = 32 and have a Hamming weight W = 4. The ratio W /K is equal to the duty cycle.
Hence, OCDM is suitable for dimming. In accordance with this goal, sets of sequences of
length K with Hamming weight W can be designed, where W /K is the dimming parameter.


Figure 4.35 Baseband pulses of two near-orthogonal signature sequences (K = 32, W = 4,
b0 = [1, 0, 0, 0|0, 0, 0, 0|0, 1, 0, 0|1, 0, 0, 0|0, 0, 0, 0|0, 0, 0, 0|0, 0, 0, 1|0, 0, 0, 0],
b1 = [1, 0, 0, 0|1, 0, 0, 0|0, 0, 0, 1|0, 0, 0, 0|0, 0, 0, 0|0, 0, 0, 0|0, 0, 0, 0|0, 0, 1, 0]).

Fig. 4.36 depicts an example of the transmit signal s(t ) consisting of N = 2 layers utilizing
the two baseband pulses illustrated in Fig. 4.35. If x n [k] = 0, no chips are transmitted. If
x n [k] = 1, the signature sequence is superimposed onto the remaining layers.
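The two-layer example can be reproduced at chip rate with a short sketch. The signature sequences are those of Fig. 4.35 and the data are those of Fig. 4.36, whereas the function names, the rectangular chip pulse sampled once per chip, and α = 1 are assumptions made for illustration.

```python
import numpy as np

K = 32
B0_ONES = (0, 9, 12, 27)     # chip positions of the ones in b0, cf. Fig. 4.35
B1_ONES = (0, 4, 11, 30)     # chip positions of the ones in b1, cf. Fig. 4.35

def signature(ones, K=32):
    """Binary signature sequence of length K with ones at the given chip positions."""
    b = np.zeros(K, dtype=int)
    b[list(ones)] = 1
    return b

def ocdm_chips(data, signatures, alpha=1.0):
    """Chip-rate OCDM signal: symbol k contributes alpha * sum_n x_n[k] * b_n."""
    data = np.asarray(data)              # shape (N layers, number of symbols)
    B = np.asarray(signatures)           # shape (N layers, K chips)
    return alpha * (data.T @ B).reshape(-1)

b0, b1 = signature(B0_ONES), signature(B1_ONES)
s = ocdm_chips([[1, 1], [1, 0]], [b0, b1])    # x0 = [1, 1], x1 = [1, 0]
print(len(s), s[0], s[:K].sum(), s[K:].sum())
print("duty cycle per layer:", b0.sum() / K)
```

In the first symbol period both layers are active, so their coinciding chip at position 0 adds up to an intensity of 2α. In the second period only layer 0 is transmitted.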
Figure 4.36 Example of an OCDM transmit signal (N = 2, K = 32, W = 4, x0 = [1, 1], x1 = [1, 0]).

In the presence of orthogonal baseband pulses,

$$ \frac{1}{T} \int_0^T g_n(t) \cdot g_m(t) \, dt = \begin{cases} W/K & \text{for } n = m \\ 0 & \text{else} \end{cases} \quad \text{for } n, m \in \{0, \dots, N-1\}, \tag{4.67} $$

the data streams can be separated at the receiver side without ambiguity and without performance
loss compared to a single layer by means of N parallel matched filters. An example
for truly orthogonal sequences is PPM:

$$ \begin{aligned} \mathbf{b}_0 &= [1, 0, 0, \dots, 0] \\ \mathbf{b}_1 &= [0, 1, 0, \dots, 0] \\ &\;\;\vdots \\ \mathbf{b}_{K-1} &= [0, 0, 0, \dots, 1]. \end{aligned} \tag{4.68} $$
In the presence of non-orthogonal baseband pulses, power efficiency can be improved by
employing a multi-layer detector after the matched filter bank.
In practice, however, orthogonality is not the only essential criterion. Near orthogonality
corresponds to good autocorrelation properties. Good cross-correlation properties are es-
sential as well. For this reason, in [Chu89] so-called optical orthogonal codes (better: op-
tical quasi-orthogonal codes) are defined as a family of binary sequences with near-perfect
autocorrelation and cross-correlation properties. A (K ,W, λa , λc ) optical orthogonal code
C of length K and Hamming weight W is defined as follows:

■ Autocorrelation property: For any codeword bn = [b_{n,0}, b_{n,1}, . . . , b_{n,K−1}] ∈ C,
0 ≤ n ≤ N − 1, the inequality $\sum_{k=0}^{K-1} b_{n,k} \cdot b_{n,k \oplus k'} \le \lambda_a$ holds for any integer k′ with k′ mod K ≠ 0.
■ Cross-correlation property: For any pair of codewords bn, bm ∈ C, n ≠ m, the
inequality $\sum_{k=0}^{K-1} b_{n,k} \cdot b_{m,k \oplus k'} \le \lambda_c$ holds for any integer k′.

Here, ⊕ denotes the modulo-K addition. When λa = λc := λ, the notation of C can be sim-
plified as (K ,W, λ). For example, the two signature sequences shown in Fig. 4.35 constitute
a (32, 4, 1) code.
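The (32, 4, 1) property of the two signature sequences of Fig. 4.35 can be verified by evaluating the periodic correlations directly. This is a minimal sketch; the function name `periodic_correlation` is hypothetical.

```python
import numpy as np

def periodic_correlation(a, b):
    """c[k'] = sum_k a[k] * b[(k + k') mod K] for all cyclic shifts k' = 0, ..., K - 1."""
    K = len(a)
    return np.array([int(np.dot(a, np.roll(b, -shift))) for shift in range(K)])

b0 = np.zeros(32, dtype=int)
b0[[0, 9, 12, 27]] = 1            # positions of the ones in b0, cf. Fig. 4.35
b1 = np.zeros(32, dtype=int)
b1[[0, 4, 11, 30]] = 1            # positions of the ones in b1, cf. Fig. 4.35

auto0 = periodic_correlation(b0, b0)
auto1 = periodic_correlation(b1, b1)
cross = periodic_correlation(b0, b1)

print("weight W:", auto0[0], auto1[0])                       # zero-shift peak
print("lambda_a:", max(auto0[1:].max(), auto1[1:].max()))    # off-peak maximum
print("lambda_c:", cross.max())                              # maximum over all shifts
```

Both off-peak autocorrelations and the cross-correlation are bounded by one, in agreement with the (32, 4, 1) notation.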
An interesting property of CDM is the fact that the N layers do not need to be superimposed
before transmission. Instead, N layers may be allocated to N distinct light sources. In this
case, superposition inherently takes place at the photodetector(s). This variant can be im-
plemented in a hardware-friendly fashion, particularly when all layers are binary. Further
details on hardware-friendly implementations will be presented in Chapter 7 in the context
of software-defined radio.
Furthermore, the N data sequences may be generated by different users. In this case, we
talk about code-division multiple access (CDMA) rather than code-division multiplexing
[Sal12]. CDMA is an alternative to other multiuser channel access techniques like time-
division multiple access (TDMA) or wavelength-division multiple access (WDMA). The
users may be synchronized in the time domain (synchronous CDMA) or not (asynchronous
CDMA).

4.7 Superposition Modulation (SM)

Superposition modulation (SM) is a family of pulsed modulation schemes matched to
digitally-controlled LED arrays. Each individual LED is operated in “on/off” mode. The
information is encoded in the sum of intensities. This summation inherently takes place
at the photodetector(s) without extra complexity, known as spatial summing architecture
(SSA) [Mos15]. In contrast to modulation schemes delivering a continuous-valued wave-
form (like MCM schemes), the signal space after superposition is quantized and hence not
fully exploited. But there are good reasons to superimpose two-level waveforms. As men-
tioned before, the main limitations of LEDs are limited peak power, limited bandwidth, and
their nonlinear characteristic. Two-level current sources prevent losses due to nonlinear
effects. The driver circuitry is simple, yet efficient, because current control management is
sufficient. Many hardware platforms offer a digital output interface, avoiding a DAC at the
transmitter side.
The general framework of SM is the superposition of two-level waveforms. There are sev-
eral examples published in literature which are special cases of SM. To date, traditional
examples include optical code-division multiplexing (OCDM) [Sal89b] and multipulse
PPM (MPPM) [Wil05b, Lee11]. Spatial modulation based on PPM has been proposed in
[Pop12]. A more recent development is the discrete power level stepping concept (DPLS
concept) disclosed in [Fat13]. The transmitter consists of several “on/off”-switchable emit-
ter groups. Each emitter group is controlled individually and radiates two-level optical in-
tensities. As intensities constructively add up (“additive mixing”), the total intensity is the
sum of the radiated intensities of all activated emitter groups. Therefore, the proposed
transmitter solution can generate several discrete intensity levels which can be used for
optical wireless signal transmission. Specifically, pulse amplitude modulation (PAM) can
be implemented that way [Li13]. In [Qia15], this discrete power level stepping concept
is called digitally controlled transmission and is applied to a micro-LED array. Given
N1 × N2 array elements, log2 (N1 · N2 ) bits can be transmitted per time index. A straight-
forward method is suggested in order to generate these bits. In a first step, a DCO-OFDM
signal employing 16-QAM, 64-QAM or 256-QAM is generated. In a second step, the DCO-OFDM
waveform is (7-bit) quantized and fed to a (2^7 = 128) micro-LED array. However,
this method is neither general nor matched to the rise/fall times of the LEDs. Based on a
micro-LED array, in [Zha13] an optical MIMO system is suggested. Gbps data rates have
been obtained under lab conditions by means of spatial multiplexing. Digital color shift
keying (DCSK) [Mur16] is another special case of SM, cf. Section 4.4.2. The second transmit
structure discussed in [Nos16] in conjunction with Hadamard coded modulation (HCM)
also feeds individual LEDs, but the average optical power is limited and the array dimen-
sion, N1 × N2 , must be matched to the order of the modulation scheme.
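How on/off-switched emitters produce discrete intensity levels can be sketched as follows. The thermometer-style mapping (level ℓ activates the first ℓ emitters) is an assumption chosen for illustration and is not taken from [Fat13] or [Qia15]. The summation at the photodetector is modeled as an ideal sum of equal intensities.

```python
import numpy as np

def thermometer_drive(levels, M):
    """Binary drive signals: row = time index, column = emitter (1 = on, 0 = off)."""
    levels = np.asarray(levels)
    return (np.arange(M)[None, :] < levels[:, None]).astype(int)

M = 3                                   # three emitters give M + 1 = 4 intensity levels
pam_levels = np.array([0, 3, 1, 2])     # 4-PAM symbols expressed as intensity steps
drive = thermometer_drive(pam_levels, M)
received = drive.sum(axis=1)            # additive mixing at the photodetector
print(drive)
print(received)
```

Each emitter only sees two-level drive signals, while the superimposed intensity reproduces the multilevel PAM sequence.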
SM is superior compared to classical two-level modulation techniques including OOK,
PPM and PWM from a bandwidth efficiency perspective, because classical two-level modu-
lation techniques are designed for a single light source. Systems with multiple light sources
enable multilevel signaling, which can be employed to use the available bandwidth more
efficiently [Bia15, Nos16]. Compared to OFDM/DMT techniques, SM benefits from the fact
that the intensity is proportional to the number of active LEDs, although each LED is a nonlinear
device. Due to this nonlinearity, predistortion or biasing is required in OFDM/DMT.
Recently, constrained superposition intensity modulation (CSIM) has been proposed
[For18]. A key feature is that “on” and “off” times are adapted to the rise and fall times of
the light sources. Hence, given an arbitrary solid-state light source and hence arbitrary
bandwidth limitation, the modulation scheme is matched to this imperfection. For exam-
ple, GaN-based LEDs have a fairly large depletion capacitance, which lengthens the fall
time. In CSIM, the data rate is boosted by time-shifting the individual waveforms. The
array dimension is arbitrary. Mature concepts from magnetic storage devices are borrowed
in order to encode the data streams subject to a minimization of the average number of
switching operations per information bit. A minimization of the number of switching
operations per information bit increases the overall power efficiency including the driver
circuit. In the remainder of this section, we present the CSIM concept according to [For18].
Focus will be on the time domain, but a generalization to frequency, spatial and/or color
domains is possible.


Figure 4.37 Comparison of analog-type transmitter hardware (left) and binary-switched
transmitter hardware (right).

ping concept disclosed in [Fat13]. A comparison of analog-type transmitter hardware and
binary-switched transmitter hardware is depicted in Fig. 4.37. The analog-type hardware
includes a DAC, a driver, and a bias-T. These devices are described in Chapter 9. In order
to prevent nonlinear distortions especially caused by nonlinear light sources, fairly small
intensity variations around the DC bias specified by the bias-T are allowed. This has a posi-
tive effect on speed, but a negative consequence on SNR. A high-speed DAC imposes a vital
contribution to the overall cost.
In contrast, DACs are completely omitted in the discrete power level stepping concept. The
driver inputs are binary. Among the advantages are low hardware complexity and a driver
efficiency close to 100 %, but only as long as the light sources are in steady state. State
changes have a negative impact on power dissipation. In [For18], a graph-based concept
tailored to CSIM is proposed to decrease the number of switching operations per bit and
therefore to cut back switching losses. Constrained switching has the supplementary benefit
of matching the switching speed to the dynamics of the light sources. In simple words, the
main idea of CSIM is to modulate the light sources jointly in a sophisticated procedure.




Figure 4.38 Example for (2|3) constrained superposition intensity modulation. Both square
waveforms s1(t), s2(t) fulfill the minimum “on” time constraint d1 = 3 and “off” time constraint
d0 = 2. The intensities s1(t), s2(t) are normalized only for illustrative purposes. The
superimposed signal s(t) is shown in the top part. The time axis t/T ranges from 0 to 16.

Let us consider the simplest nontrivial case featuring M = 2 light sources.
Suppose that each light source must be “on” for at least d 1 = 3 time indices and “off” for at
least d 0 = 2 time slots, respectively. Constraints like this depend on key parameters such
as rise time, fall time and heat flow, both for light sources and switches. Fig. 4.38, redrawn
from [For18], illustrates an intuitive example of (d 0 |d 1 )M = (2|3)2 CSIM. Parameter T in
this figure is called slot duration. Switching is only possible at integer multiples of the slot
duration. For classical modulation schemes, the slot duration is the same as the symbol
duration. CSIM is different in this sense, because of the (d 0 |d 1 )M constraints. Note that in
CSIM all input sequences are asynchronous in order to increase the degrees of freedom of
waveform optimization. Thus, the superimposed transmit signal has a higher variability,
although the intervals between transitions of the square waveforms are maximized. The
state transitions can be represented by a graph. This simplifies sequence optimization and
capacity analysis. Given M light sources, by means of linear superposition up to M + 1
intensity steps are resolvable. If the light sources stem from a single illumination fixture
and if each individual light source emits about the same peak intensity, the M + 1 intensity
steps are roughly equidistant.
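
The run-length constraints and the linear superposition described above can be illustrated with a short Python sketch. The two binary slot sequences are made up for illustration (they are not the waveforms of Fig. 4.38), and the helper names `run_lengths` and `satisfies_constraints` are hypothetical. Runs touching the boundaries of the observation window are not checked, since they may continue outside the window.

```python
from itertools import groupby

def run_lengths(seq):
    """Return (value, length) pairs for consecutive runs in a binary slot sequence."""
    return [(v, len(list(g))) for v, g in groupby(seq)]

def satisfies_constraints(seq, d0, d1):
    """Check minimum "off" (d0) and "on" (d1) run lengths.

    The first and last run are skipped because they may be truncated
    by the observation window (an assumption of this sketch).
    """
    inner = run_lengths(seq)[1:-1]
    return all(n >= (d1 if v == 1 else d0) for v, n in inner)

# Illustrative sequences, one entry per slot of duration T (assumed values).
s1 = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
s2 = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]

# Linear superposition of M = 2 equal-intensity sources.
s = [a + b for a, b in zip(s1, s2)]

ok = satisfies_constraints(s1, 2, 3) and satisfies_constraints(s2, 2, 3)
levels = sorted(set(s))
print(ok, levels)
```

With M = 2 sources of equal peak intensity, the superimposed sequence takes at most M + 1 = 3 distinct levels, although each individual sequence switches only rarely.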
4.8 Camera-Based Communication

Finally, image-sensor-based communication aspects are introduced in this section.
Traditionally, OWC is based on non-imaging optical systems utilizing
LED or laser light sources, where the information is embedded in data sequences.
Alternatively, still or moving pictures can be used as data sources, displayed by a screen,
a display, or an LED array. Rather than exploiting a conventional photodetector at the re-
ceiver side, the image can be captured by means of a 2D image sensor, such as a CCD or
CMOS array discussed in Chapter 8. Afterwards, the data can be recovered by means of im-
age processing [Hra06]. Nowadays, this data transmission technique is called optical cam-
era communication (OCC) [Sah15, Cah16, Bou16, Tel17, Ngu17b] or image sensor com-
munication (ISC). The image sensor can be modeled as a 2D photodetector array. More
than 10 million pixels are common in smartphones, providing a high spatial resolution at
low cost. Although still or moving pictures are commonly used in OCC, in some applica-
tions just a single LED serves as a data source. Despite these variations at the transmitter
side, all OCC schemes have in common that a 2D image sensor (usually a camera) serves
as photodetector at the receiver side. Among the advantages of this concept is the ease of
market entry: smartphones are ubiquitous.
We distinguish between screen-to-camera, display-to-camera, and LED-to-camera links.
Screen-to-camera communication offers, perhaps, the widest range of use cases in the field
of camera-based communication. Possible applications are the VLC services already dis-
cussed in Chapter 1, plus visible light positioning (VLP) to be introduced in Chapter 11.
Interesting examples of screen-to-camera links are phone-to-phone connections for mes-
sage and file transfer via short distances, and in-flight infotainment. Additional use cases
benefit from displays. For example, digital signature verification is easy. Also, modern
household appliances are equipped with a display, enabling contactless monitoring. LED-to-camera
communication is frequently discussed in the context of car-to-X communication
[Cah16]. Furthermore, low-cost IoT applications can be supported by this technique.
Camera-based communication is very different compared to conventional IM/DD sig-
naling from a channel modeling point of view [Ngu17b]. Communication is almost
interference-free. Therefore, the signal-to-interference-plus-noise ratio is typically high.
However, any kind of movement may have a significant impact on data recovery, due
to changes of relative distance and orientation. Image sensors behave differently in in-
door and outdoor environments with respect to ambient light. Sensors produced for the
mass-market employ low-cost optics that cause a variety of imperfections including color
inaccuracies. Characteristics differ remarkably among consumer devices.
Particularly at low picture refresh rates, flicker effects will be observed unless
flicker is taken into account in the modulation design. For these reasons, the design of
robust, problem-oriented modulation and channel coding schemes is important. Next,
four classes of OCC techniques are presented. Their main characteristics are summarized
in Table 4.1 [Ngu17b].

Table 4.1 Characteristics of OCC techniques under investigation.

              Global Shutter    Rolling Shutter     RoI Signaling          Hybrid OC/PD
Transmitter   Screen Tx         LED Tx              Car/traffic light LED  Car/traffic light LED
Receiver      Camera Rx         Rolling shutter Rx  RoI camera Rx          Hybrid image sensor
Data rate     1 kbps - 1 Mbps   1 kbps              10 kbps                100 Mbps
Distance      Several meters    Several meters      Hundred meters         Hundred meters

4.8.1 Global-Shutter Sampling

In the classical OCC setup, the transmitter generates a series of data-dependent pixelated
images [Hra06]. Potential images representing data are barcodes. All barcodes are machine
readable. Some represent just raw data, others even permit error correction and/or data
encryption. A popular 2D barcode is the QR code [QR]. The QR code (QR stands for quick
response) was introduced in 1994 and was soon adopted by the Japanese car
industry in order to label automotive parts. QR codes provide a significantly improved
storage capacity compared to 1D barcodes. The original QR code is organized as a square
matrix. Three of the four corners carry synchronization patterns. These “eyes” are used
for alignment and positioning. The raw or encoded data is represented by black and white
squares distributed in the remaining part of the matrix. Each square contains one data
symbol. In the remainder, one matrix (i.e., one two-dimensional QR codeword) is referred
to as a frame. The number of squares per matrix is scalable, see Fig. 4.39. On the left-hand
side, an ASCII text comprising 12 characters is QR encoded, whereas on the right-hand
side an ASCII text of length 989 is encoded. QR codes are so popular because they can be
scanned by any smartphone equipped with a QR code reader app. A huge variety of QR
code readers is available on the web. Customarily, QR codes encode information like URL
links, geo coordinates, and text messages.

Figure 4.39 QR code of short text message (left) and long text message (right).

In the meantime, the conventional QR code has been extended in several directions.
3D QR codes employ colored squares. The more colors are distinguishable, the more bits
can be stored per matrix. Alternatively, B&W squares with different intensities may be used.
Preferably, the intensities should be Gray coded. 4D QR codes additionally are time vary-
ing. The data content is changing from frame to frame, whereas the sync patterns are fixed
in order to ease synchronization. 4D QR codes are also called animated QR codes. This
type of code is particularly suitable for image-sensor-based data transmission.
In practice, image processing has to be performed at the receiver side in order to compensate
for distortions before the data can be recovered. Concerning misalignment, the received
image needs to be scaled and/or equalized if the distance and/or orientation between
transmitter and receiver is uncertain. Additionally, the received image typically is
blurred, i.e., there is cross-talk between neighboring pixels. There are several reasons for
blurring. The image planes at transmitter and receiver side usually are neither parallel nor
slanted towards their centers, or the optics is out of focus. Furthermore, in mass products,
neither the image nor the image sensor is perfect. For example, the source may not be
able to display colors well. A gradual reduction of the brightness towards the edges of the
received image is called vignetting. Also, the Moiré effect may distort the received image.
Rolling shutters pose another challenge due to possible timing inaccuracies. All problems
addressed here can be mitigated by image processing. Towards this goal, the synchronization
patterns embedded in the QR code are helpful. To start with, a global shutter is assumed
unless mentioned otherwise. Signal design for rolling shutters is treated separately.
Data recovery is a pattern recognition task. This is simplest for B&W QR codes. Upon
successful compensation of distortions, the black fields need to be identified. The achiev-
able data rate depends on the resolution, the cardinality of the symbol alphabet, and on
the frame rate. The original QR code (Model 1) and improvements thereof (Model 2) are
defined in 40 different versions. The minimum size, 21 × 21 matrix elements, is offered by
Version 1. 152 bits can be represented by one frame at the lowest error protection level L.
Version 40 is of size 177 × 177. At error protection level L, up to 23648 bits can be stored in
a single matrix [QR].
In the case of colored codes and codes employing different intensities, precise pattern
recognition is more difficult. However, more bits can be stored per frame and therefore
higher data rates can be achieved in OCC. Hence, there is a trade-off between detectability
and storage capability/data rate.
Both resolution (in terms of pixels per row and column and the number of bits per pixel)
and frame rate (determined by the refresh rate of the display and the capture rate of the
camera) are equipment-dependent. Currently, low-frame-rate smartphone cameras typi-
cally support about 30 fps (frames per second), whereas high-frame-rate vehicular cameras
used for autonomous driving may have a rate on the order of 1000 fps and beyond. In Ta-
ble 4.2, some examples are given for still pictures and moving images. Color pictures are as-
sumed to be encoded with 3 bits/pixel. With 3 bits, the three RGB primaries, the three YCM
primaries, as well as black and white can be addressed. (Alternatively, squares with eight
different intensities could be used instead of eight colors, which yields the same amount of
information.) For moving images, a display refresh rate of 15 fps is assumed. At 15 fps, the
capture-rate criterion is satisfied if the capture rate of the camera is 30 fps or more. (The
capture rate of the camera should be at least two times the refresh rate of the display for
correct sampling of consecutive frames in time, unless there is a synchronization unit acti-
vating the camera shutter. Furthermore, according to the sampling criterion each pixel of
the image shown on the display should be sampled by two or more pixels in the camera.)
The maximum achievable data rates reported in Table 4.2 are based on the lowest error
protection level L. For moving color images, the
achievable data rate exceeds 1 Mbps at QR code Version 40. In the in-flight experiments
conducted in [Fat14], however, it has been indicated that Version 16 seems to be a prac-
tical limit even for high-end smartphone optics. In the meantime, numerous teams have
obtained data rates of several hundred kbps for distances below 1 m. Most use four colors.
If high-speed, high-quality cameras were used instead, much higher throughput could
be realized, since the achievable data rate is proportional to the frame rate.

Table 4.2 Maximum achievable data rates for camera-based data transmission utilizing QR codes.

                      V1               V16               V40
Still B&W picture     152 bits/frame   4712 bits/frame   23648 bits/frame
Still color picture   456 bits/frame   14136 bits/frame  70944 bits/frame
Moving B&W images     2.28 kbps        70.68 kbps        354.72 kbps
Moving color images   6.84 kbps        212.04 kbps       1.064 Mbps
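
The entries of Table 4.2 follow from simple arithmetic, which the following Python sketch reproduces under the assumptions stated in the text: error protection level L, 3 bits per square for color pictures, and a display refresh rate of 15 fps. The function names are hypothetical and chosen for illustration only.

```python
# Bits per frame for B&W QR codes at error protection level L (from Table 4.2).
BW_BITS_PER_FRAME = {"V1": 152, "V16": 4712, "V40": 23648}

def bits_per_frame(version, bits_per_symbol=1):
    """Payload of one frame; each square carries bits_per_symbol bits."""
    return BW_BITS_PER_FRAME[version] * bits_per_symbol

def data_rate_bps(version, bits_per_symbol, refresh_fps):
    """Data rate for moving images: bits per frame times display refresh rate."""
    return bits_per_frame(version, bits_per_symbol) * refresh_fps

def capture_rate_ok(capture_fps, refresh_fps):
    """Capture-rate criterion without shutter synchronization:
    the camera captures at least two frames per displayed frame."""
    return capture_fps >= 2 * refresh_fps

for version in BW_BITS_PER_FRAME:
    bw = data_rate_bps(version, 1, 15)
    color = data_rate_bps(version, 3, 15)
    print(f"{version}: B&W {bw / 1e3:.2f} kbps, color {color / 1e3:.2f} kbps")
```

Since the rate scales linearly with the refresh rate, the same functions can be used to estimate what a faster display and camera pair would achieve, provided the capture-rate criterion still holds.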

Without loss of generality, a classical QR code has been assumed so far. Several attempts
have been published to improve the error correction capabilities of QR codes, see for example
[Fat14]. Additionally, dedicated visual encoding designs have been proposed for the specific
purpose of optical camera communication. The most popular ones perhaps are PixNet
[Per10], COBRA [Hao12], and RDCode [Wan14a]. PixNet is based on OFDM. An advantage
of OFDM is that degradation caused by blurring and vignetting can be limited by a cyclic
prefix [Mon14a], similar to ISI avoidance in RF communications. COBRA uses a novel 2D
barcode that has been optimized for real-time streaming of data. RDCode is a robust dy-
namic barcode which enables a packet-frame-block structure. Based on the layered struc-
ture, different error correction schemes are designed at three levels: intra-blocks, inter-
blocks and inter-frames, in order to recover lost blocks and frames. SoftLight provides an
efficient rateless coding scheme for the task of error protection [Du17]. This channel cod-
ing scheme is compatible with any visual coding scheme.

4.8.2 Rolling-Shutter Sampling

So far, a global shutter has been assumed, i.e., the whole frame is captured simultaneously.
For ease of implementation, however, many cameras are equipped with a rolling
shutter instead. Rolling-shutter-based cameras conduct a row-wise (or column-wise) ex-
posure process when taking pictures. In OCC, this effect permits data rates that exceed the
frame rate of the camera.
The principle of a rolling shutter is depicted in Fig. 4.40. Consider a binary light source with
two possible states, “on” or “off”. When a light source flickers at a frequency on the same
order of magnitude as the inverse of the shutter speed, layers of dark and bright stripes will
be recorded [Dan12]. Consequently, the original data can be extracted from these dark and
bright stripes by image processing [Luo15].
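
The stripe formation can be mimicked with a toy model. The Python sketch below assumes that each sensor row is sampled instantaneously at its own readout time (real sensors integrate over a finite exposure time), and all parameter values are hypothetical. Times are given in integer microseconds to avoid rounding issues.

```python
def rolling_shutter_rows(led_states, state_duration_us, num_rows, row_time_us):
    """Return the LED state recorded by each sensor row of one frame.

    Row k is assumed to be sampled instantaneously at time k * row_time_us.
    """
    rows = []
    for k in range(num_rows):
        idx = (k * row_time_us) // state_duration_us
        rows.append(led_states[idx % len(led_states)])
    return rows

# Assumed example: six LED states per frame, each lasting 500 us,
# captured by a sensor with 60 rows and 50 us between row readouts.
led_states = [1, 0, 0, 1, 1, 0]
rows = rolling_shutter_rows(led_states, 500, 60, 50)

# Each LED state covers 10 rows, producing bright and dark stripes.
# Sampling the center row of each stripe recovers the LED states.
recovered = rows[5::10]
print(recovered)
```

In this idealized setting, several LED states are recovered from a single frame, which is why the data rate can exceed the frame rate.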

Figure 4.40 Rolling-shutter principle given a binary light source. Time axis is from left to right.
The top part shows the data-dependent state of the light source. The bottom part illustrates the
temporal development of the captured image.

While a smartphone camera is limited to a capture rate of approximately 30 fps, the rolling
shutter effect allows capturing multiple information bits (LED states) inside every frame,
which boosts the data rate. (For example, in our simple sketch six bits are captured per
frame.) In accordance with this goal, different modulation schemes have been invented.
In [Dan12], OOK in conjunction with Manchester coding is suggested. Manchester coding
maps data bit 0 onto the pattern “01”, whereas data bit 1 is mapped onto “10”. Conse-
quently, the encoded sequence is always DC-balanced. Given a single-LED light source,
data rates in the kbps range are reported. In [Ngu16], a special data frame structure is pre-
sented. This scheme supports different frame rates, shutter speeds, sampling rates, and
A novel visible light communication method, which consists of a high-speed sampling
method called line-scan sampling (LSS) and modulation schemes designed for LSS, is
proposed in [Aoy15]. LSS utilizes the line scan characteristics of CMOS image sensors
and enables high-speed sampling that is a thousand times faster than image frame-based
sampling using conventional smartphones. The modulation schemes compensate for
shortcomings of LSS and enable visible light communication without perceptible flicker
using both current control and PWM control.
Several modulation schemes matched to the rolling-shutter problem, based on frequency
shift keying (FSK), have been published in [Lee15, Hon17].
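
The Manchester mapping mentioned above in connection with OOK (bit 0 onto “01”, bit 1 onto “10”) can be sketched in a few lines of Python. The function names are hypothetical, and the data sequence is an arbitrary example. The sketch only illustrates the mapping and its DC balance, not the complete scheme of [Dan12].

```python
# Manchester mapping as stated in the text: 0 -> "01", 1 -> "10".
MANCHESTER = {0: (0, 1), 1: (1, 0)}

def manchester_encode(bits):
    """Map each data bit onto two OOK chips."""
    chips = []
    for b in bits:
        chips.extend(MANCHESTER[b])
    return chips

def manchester_decode(chips):
    """Invert the mapping, assuming chip-pair alignment is known."""
    inverse = {pair: bit for bit, pair in MANCHESTER.items()}
    pairs = zip(chips[0::2], chips[1::2])
    return [inverse[p] for p in pairs]

data = [1, 0, 0, 1, 1, 1]
chips = manchester_encode(data)
duty = sum(chips) / len(chips)  # fraction of "on" chips
print(chips, duty)
```

Every chip pair contains exactly one “on” chip, so the fraction of “on” chips is one half regardless of the data, which is the DC balance referred to in the text.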

4.8.3 Region-of-Interest Signaling

For flicker avoidance, the “on/off” period of the transmitted signal must be shorter than the
maximum flickering time period, as discussed in Chapter 2. Applied to OCC, the frame rate
should exceed a critical value, typically about 100 fps. However, most consumer cameras
are low-frame-rate cameras. They are simply not fast enough to capture all data symbols
that are transmitted at frame rates beyond 100 fps.
In order to achieve flicker-free data transmission that can be captured by a low-frame-
rate camera, special modulation schemes have been proposed, namely undersampled
frequency-shift on-off keying (UFSOOK) [Rob13], undersampled phase-shift on-off key-
ing (UPSOOK) [Luo15], and spatial 2-PSK (S2-PSK) [Ngu17a], respectively. In all three
methods, LED light sources are assumed. At the receiver side, a single low-frame-rate
camera is sufficient. Furthermore, in all three techniques the achievable data rate is on
the order of 10 bps and hence quite low. The corresponding data stream is called low-rate
stream subsequently. UFSOOK and UPSOOK are temporal undersampling approaches that
demodulate a bit from two adjacent frames captured at different sampling times. S2-PSK
applies spatial undersampling that detects a bit entirely within a frame.
In summary, UFSOOK, UPSOOK and S2-PSK tackle the flicker problem, but they provide
only low-rate streams. The trick now is to perform region-of-interest (RoI) signaling. Con-
ceptually, the RoI signaling technique conducts a simultaneous transmission of two classes
of data streams: (i) a low-rate stream is used to detect the RoI, and (ii) a high-rate stream
is transferred via the selected RoI [Ngu17b]. This is performed by embedding the two data
streams in a clever way at the transmitter side, and by using a RoI camera that utilizes the
detected RoI to accelerate the frame rate and to demodulate the main data at high rate.

[Figure 4.41, three panels over time t: UFSOOK (top; bit 0 with 8 cycles, bit 1 with 7 cycles, bit boundaries at Tb and 2Tb), UPSOOK (middle; bit 0 and bit 1), S2-PSK (bottom; LED1 and LED2 for bit 0 and bit 1).]
Figure 4.41 Exemplary UFSOOK, UPSOOK, and S2-PSK waveforms.

Let us first concentrate on the low-rate stream, before moving on to the high-rate stream.
In Fig. 4.41, UFSOOK, UPSOOK, and S2-PSK waveforms are depicted in order to explain
their generation. In UFSOOK, data bit 0 is represented by the space frequency and data
bit 1 by the mark frequency. Let the frame rate (in Hz) be denoted as F_fps and let n be
a non-negative integer. According to [Rob13], the space frequency is defined as f_space =
n · F_fps, whereas the mark frequency is taken as f_mark = (n − 0.5) · F_fps. In the top part of
Fig. 4.41, F_fps = 30 Hz in conjunction with n = 4 is selected, leading to f_space = 120 Hz and
f_mark = 105 Hz. When considering 2n = 8 cycles in order to transmit a single data bit at
the space frequency and 2(n − 0.5) = 7 cycles at the mark frequency, respectively, the bit
duration is the same in both cases (T_bit = 66.67 ms in our example), independent of the data
sequence. Notice that although 1/T_bit = 15 Hz is half of the frame rate, flicker is avoided
since the lowest switching frequency is 105 Hz. But how can data be recovered? According to the
selected parameters, one bit duration is equidistantly sampled by two camera frames. In
the case that bit 0 has been transmitted, the magnitudes of the received samples of both
frames are either both high or both low. This is due to the fact that any cycle corresponds to a “10”
OOK pattern. However, in the case that bit 1 has been sent, one of the received samples is
at high level, the other one at low level. The core UFSOOK design is based on a single LED
at the transmitter side. For further generalizations of UFSOOK, including multiple LED
transmitters (i.e., MIMO aspects) and dimming, the interested reader may refer to [Rob13].
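
The UFSOOK numbers and the two-frame detection rule can be checked with a small Python sketch. It models the LED as an ideal square wave with a “10” pattern per cycle and the camera as an ideal instantaneous sampler. Both are simplifying assumptions, and the function names as well as the sampling offset are hypothetical. Exact fractions are used so that the result does not depend on floating-point rounding.

```python
from fractions import Fraction

def ook_level(freq_hz, t):
    """Ideal square wave: high in the first half of each cycle, low in the second."""
    phase = (freq_hz * t) % 1
    return 1 if phase < Fraction(1, 2) else 0

def ufsook_frequencies(n, frame_rate_hz):
    """Space and mark frequency: n * F_fps and (n - 0.5) * F_fps."""
    f_space = n * frame_rate_hz
    f_mark = (n - Fraction(1, 2)) * frame_rate_hz
    return f_space, f_mark

def detect_bit(freq_hz, frame_rate_hz, t0):
    """Sample two consecutive frames: equal levels -> bit 0, different levels -> bit 1."""
    a = ook_level(freq_hz, t0)
    b = ook_level(freq_hz, t0 + Fraction(1, frame_rate_hz))
    return 0 if a == b else 1

F_FPS, N = 30, 4
f_space, f_mark = ufsook_frequencies(N, F_FPS)
t_bit_space = Fraction(2 * N) / f_space   # 8 cycles at the space frequency
t_bit_mark = (2 * N - 1) / f_mark         # 7 cycles at the mark frequency
t0 = Fraction(1, 1000)                    # arbitrary sampling offset (assumption)

print(f_space, f_mark, float(t_bit_space), float(t_bit_mark))
print(detect_bit(f_space, F_FPS, t0), detect_bit(f_mark, F_FPS, t0))
```

Between two frames the space frequency advances by a whole number of cycles and the mark frequency by an odd number of half cycles, so the decision does not depend on the unknown sampling offset in this idealized model.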
In the middle part of Fig. 4.41, a typical waveform for UPSOOK is plotted. Conceptually, fre-
quency shift keying is replaced by phase shift keying. For ease of comparison, a frequency
of 120 Hz is assumed, corresponding to 8 cycles per bit duration. At a frame rate of 30 Hz,
again one bit is sampled by two adjacent frames: the RoI bit rate is limited to half of the
frame rate. In the absence of synchronization between transmitter and receiver, the sam-
pling phase is random. Depending on the sampling phase, the magnitudes of the received
samples of both frames are both either high or low. This holds true for both logical levels.
Whenever two adjacent data bits are different, however, the order of the levels is toggling.
For further generalizations of UPSOOK, including MIMO aspects and WDM, the interested
reader is referred to [Luo15]. Potential use cases of UFSOOK and UPSOOK are smart traffic
signs and traffic lights.
S2-PSK is tailored to car-to-X communication based on vehicles with two front-light LEDs
or two rear-light LEDs, respectively. In this type of application, flicker avoidance is a
mandatory prerequisite, as is a low-cost implementation. For example, bit 0 is transmitted
via waveforms having the same phase, whereas bit 1 is transmitted through two inverse-
phase waveforms, see the bottom part in Fig. 4.41. Because the data bit can be recovered in
a single frame, S2-PSK is less sensitive with respect to acceleration. Also, S2-PSK overcomes
the problem of time-varying frame rates and different types of shutters. However, there is
neither a diversity gain (since none of the LEDs is allowed to be blocked) nor a multiplex-
ing gain (because LED1 does not carry any information). In [Ngu17a], advanced receive
processing is proposed and analyzed, providing a robust but low-rate optical transmission
scheme in harsh environments.
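
The spatial detection principle of S2-PSK can be summarized in a minimal Python sketch. It assumes ideal binary LED states within one frame and ignores the advanced receive processing of [Ngu17a]. The function names are hypothetical.

```python
def s2psk_transmit(bit, reference_state):
    """Return the (LED1, LED2) on/off states at one sampling instant.

    LED1 acts as the reference. LED2 is in phase with LED1 for bit 0
    and in inverse phase for bit 1.
    """
    led1 = reference_state
    led2 = reference_state if bit == 0 else 1 - reference_state
    return led1, led2

def s2psk_detect(led1, led2):
    """Recover the bit from a single frame by comparing both LED states."""
    return led1 ^ led2

# The sampling phase is random, so the reference may be captured "on" or "off".
for bit in (0, 1):
    for reference_state in (0, 1):
        led1, led2 = s2psk_transmit(bit, reference_state)
        print(bit, reference_state, s2psk_detect(led1, led2))
```

Because only the relation between the two LEDs matters, the decision is made within one frame and does not depend on the state in which the reference LED happens to be captured.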
Recall that the low-rate stream carries the RoI information. The transmission of RoI is use-
ful in order to regularly notify the receiver about the location of the desired light source
in the captured image. Via the RoI signal the detector is able to discriminate the desired
light source from irrelevant light sources and other objects. The transmission of the known
signal is a type of light source identification [Ngu17b].
Now, we move on to the high-rate stream. The high-rate data stream is embedded into
the low-rate RoI stream. A modulation scheme called twinkle VPPM is matched to UFSOOK;
it is actually a combination of UFSOOK and VPPM [Ngu17b]. The high-rate stream
is VPPM modulated. In VPPM, the duty cycle (and therefore the brightness) can be con-
trolled. Controlling the duty cycle is performed in a low-frequency manner by UFSOOK,
i.e., by the low-rate RoI stream. Effectively, the intensity of the light wave is changed slowly
while VPPM data is transmitted at a high rate – this is the main data. The LED appears
to twinkle – this is used as a beacon to identify LEDs that are carrying high-rate data. A
combination of UPSOOK and VPPM is along the same lines. The counterpart matched to
S2-PSK is dubbed hybrid spatial phase-shift keying (HS-PSK), but the roots are the same.
As a result, data rates on the order of 10 kbps have been reported for the 10-100 m distance
range [Ngu17b]. A more recent development is offset variable pulse width modulation
(Offset-VPWM) [Ngu18], where PWM is used instead of PPM.
In the 2018 release of the IEEE 802.15.7 standard, three novel physical layer specifications
have been included [IEEE802]. All novel modes, called PHY IV, PHY V and PHY VI, are image
sensor communication modes [Ngu18]. In the PHY IV mode, UFSOOK, Twinkle VPPM,
S2-PSK, HS-PSK as well as Offset-VPWM are specified. The PHY V mode employs rolling
shutter frequency shift keying (RS-FSK), camera m-ary frequency-shift keying (CM-FSK),
camera on-off keying (C-OOK), and mirror pulse modulation (MPM). PHY VI mode is
based on asynchronous quick link (A-QL), variable transparent amplitude-shape-color
(VTASC), sequential scalable two-dimensional color (SS2DC), invisible data embedding
(IDE), and hidden asynchronous quick link (HA-QL) technologies.

4.8.4 Hybrid Camera-Based Photodetector-Based Systems

The low frame rate of common CMOS and CCD image sensors is the main shortcoming of
camera-based communications. In [Tak13], image sensors with integrated photodetector
cells have been proposed as a possible solution to overcome this disadvantage. This
technology enables hybrid camera-based photodetector-based systems. The output of the
image sensor is used to detect the light source, whereas high-speed data transmission is
handled by the fast photodetector cells. The hybrid technique is currently applied in the
automotive domain [Tak14, Got16]. Impressive data rates on the order of 50 Mbps in the
10-100 m range have been achieved in experiments [Got16].

4.9 Chapter Summary

The task of a digital modulator is to convert a bit stream into an analog waveform. Light
emitted by LEDs is noncoherent. Therefore, intensity modulation is the only choice.
Intensity-modulated waveforms are non-negative and real-valued. Further restrictions
include eye safety, peak power, flicker, dimming, and color quality constraints. Focus
has been on photodetector-based direct detection, but also camera-based detection con-
cepts have been treated. About eighty different intensity modulation schemes have been
addressed in this chapter at different levels of detail, including single-carrier and multi-
carrier modulation schemes, color-domain modulation techniques, and pixelated light
Many single-carrier intensity modulation schemes are linear modulation schemes (e.g.,
OOK, ASK, PAM), with a few exceptions (like PPM and PWM). These and other modula-
tion schemes are compared with respect to power and spectral efficiencies. Particularly
binary modulation schemes with square waveforms are hardware-friendly, because the
driver hardware is simple and power efficient, and a threshold-type of detector is sufficient.
However, color control usually is difficult with classical modulation schemes.
This drawback can be overcome by color-domain modulation, including CSK, DCSK, CIM,
MM, and GCM. In metameric modulation schemes, light intensity changes are not visible
to the human eye. A recent development is the use of deep learning, borrowed from AI, to
jointly optimize the modulator as well as the demodulator.
Multi-carrier modulation schemes are attractive, because additional degrees of freedom
can be exploited. In multipath environments and/or at high data rates, intersymbol in-
terference can be compensated by a cyclic prefix. Bandwidth efficiency can be boosted
by the water-filling principle. Starting off from OFDM, a real-valued waveform is commonly
obtained by using Hermitian symmetry. DMT, DCO-OFDM, PAM-DMT, ACO-OFDM,
and Flip-OFDM are well-established examples. Modern variations like SEE-OFDM,
hanced. Additionally, hybrid schemes exist. Alternatives to DMT and variations thereof are
multi-frequency carrierless amplitude and phase modulation (CAP) and carrierless OFDM
(cOFDM). Carrierless OFDM has not yet been published elsewhere.
Code-division multiplexing (CDM) allows for superimposing data sequences. Hence, dim-
ming is simple and multiple users can be supported (CDMA). Superposition modulation
(SM) is a generalization of CDM. SM is based on the discrete power level stepping concept,
where intensities are superimposed. A recent development, called constrained superposi-
tion intensity modulation (CSIM), takes rise and fall times of the light sources into account.
The target is to minimize the average number of switching operations per information bit in
order to reduce switching losses.
Last but not least, camera-based communication is studied. A point light source is replaced
by a pixelated light source, for instance a display. An ordinary smartphone may serve as
the detector. Because ambient light is easier to handle compared to photodetector-based
direct detection, camera-based communication is suitable for outdoor applications as well.
Speed is limited, however, since the achievable data rate is proportional to the frame rate.
For this reason, the rolling shutter effect should be exploited. Towards this goal, several
modulation schemes have been reported.

4-1 Let us focus on intensity modulation (IM) in conjunction with direct detection (DD).
(a) Which constraints need to be considered in optical wireless communications
(OWC) concerning the waveform design conducted in the modulator?
(b) Suppose the data symbols are bipolar, like in pulse amplitude modulation (PAM).
Which techniques can be applied to obtain intensities at the transmitter output?
(c) Is there a related problem if the pulse shaping causes positive and negative val-
ues, such as in the case of a raised-cosine pulse?
(d) Compare the complexities of direct detection and coherent detection.
4-2 Equation (4.1) defines the important class of linear modulation schemes.
(a) What is the impact of the symbol constellation on the bit error performance?
(b) What is the impact of the pulse shaping on the signal bandwidth?
(c) Design your own individual modulation scheme. Hint: You may derive the sym-
bol constellation from your initials. Think about individualizing the pulse shap-
ing as well.
4-3 On-off keying (OOK) is a widespread intensity modulation scheme.
(a) Let us assume a rectangular baseband pulse first. Compute and compare the bit
error rate (BER) for NRZ-OOK and for RZ-OOK.
(b) Now, the rectangular pulse is replaced by a Gaussian baseband pulse. The stan-
dard deviation in time domain is assumed to be one tenth the symbol duration
in order to neglect intersymbol-interference. Compute the BER and compare it
with OOK.
(c) Compute the time-bandwidth product ∆T · ∆B for all three baseband pulses.
Hint 1: Computations simplify if you assume non-causal symmetric baseband
pulses. Hint 2: $\Delta T = \frac{1}{g_{\mathrm{Tx}}(0)} \int_{-\infty}^{\infty} g_{\mathrm{Tx}}(t)\, dt$, $\Delta B = \frac{1}{G_{\mathrm{Tx}}(0)} \int_{-\infty}^{\infty} G_{\mathrm{Tx}}(f)\, df$, where
$G_{\mathrm{Tx}}(f)$ is the Fourier spectrum of the deterministic pulse $g_{\mathrm{Tx}}(t)$.
4-4 Now, we investigate the influence of the nonlinear I_F vs. V_F characteristic of an LED
on amplitude shift keying (ASK). Assume that the intensity is linear with respect to the
forward current of the LED. Furthermore, assume that the relation between forward
current, I_F, and forward voltage, V_F, is given by Shockley’s formula $I_F = I_0 \left( e^{V_F/V_0} - 1 \right)$,
where I_0 and V_0 are constants. Suppose in this exercise that the modulator outputs a
voltage rather than a current.
(a) Calculate the intensity levels for the case of unipolar 4-ASK. Which problem
arises with respect to the error performance?
(b) Optimize the unipolar 4-ASK constellation so that the intensity levels are uni-
formly spaced.
(c) Does the optimization affect the bandwidth efficiency?
4-5 Pulse position modulation (PPM) is a popular modulation scheme in OWC.
(a) Give reasons for that.
(b) In the section on variable pulse position modulation (VPPM), an example of 2-
VPPM is sketched for different dimming levels. Repeat this design rule for 4-
4-6 Carrierless amplitude and phase modulation (CAP) based multi-band transmission is
an alternative to orthogonal frequency-division multiplexing (OFDM).
(a) What are the differences and commonalities of CAP-based multi-band transmis-
sion and OFDM?
(b) What are the advantages compared to single-carrier modulation schemes?
4-7 Color-domain modulation schemes exploit the additional degree of freedom of color
(a) Let us consider 4-ary color shift keying (4-CSK) in conjunction with an isosce-
les color gamut. The four symbols of the 4-CSK constellation are assumed to be
equally likely. Design the primaries such that the centroid of the gamut corre-
sponds to the white point [0.33, 0.33]. Sketch all possible solutions.
(b) Now, we assume the 4-CSK bit labeling according to Fig. 4.11. What happens if
the source bits are not uniformly distributed? For example, assume the following
distribution: P ([00]) = P ([01]) = 0.1, P ([10]) = P ([11]) = 0.4.
(c) What is the difference between CSK and digital color shift keying (DCSK)?
(d) What is the advantage of metameric modulation (MM)? How does MM work?
4-8 The most prominent multi-carrier modulation scheme is OFDM. In OWC, however,
some modifications are needed.
(a) Compare DCO-OFDM, PAM-DMT, and ACO-OFDM.
(b) What is the relation between Hermitian symmetry and quadrature modulation?
4-9 Code-division multiplexing (CDM) is based on spreading sequences.
(a) Discuss commonalities and differences between CDM and PPM.
(b) Try to design orthogonal sequences of length K = 8. Choose a fixed weight W > 1
of your choice.
(c) CDMA supports several users. How can CDM be generalized to become CDMA?
4-10 Superposition modulation (SM) is a hardware-friendly modulation scheme.
(a) Why?
(b) In ordinary SM, baseband pulses of all layers are synchronized at the transmit-
ter side. In constrained superposition intensity modulation (CSIM), however,
sequences are asynchronous. What is the benefit of time shifts?
4-11 Camera-based communication is possible with smartphones.
(a) What are the pros and cons of pixelated communication?
(b) Explain the influence of a rolling shutter.
