
Digital television
From Wikipedia, the free encyclopedia
Digital television (DTV) is the transmission of audio and video by digitally processed and multiplexed signals, in contrast to the totally analog and channel-separated signals used by analog television. Digital TV can support more than one programme in the same channel bandwidth.[1] It is an innovative service that represents the first significant evolution in television technology since color television in the 1950s.[2] Many countries are replacing broadcast analog television with digital television and allowing other uses of the television radio spectrum. Several regions of the world are in different stages of adaptation and are implementing different broadcasting standards. There are four different widely used digital television terrestrial broadcasting (DTTB) standards:

Advanced Television System Committee (ATSC) uses eight-level vestigial sideband (8VSB) for terrestrial broadcasting. This standard has been adopted by six countries: United States, Canada, Mexico, South Korea, Dominican Republic and Honduras.
Digital Video Broadcasting-Terrestrial (DVB-T) uses coded orthogonal frequency-division multiplexing (OFDM) modulation and supports hierarchical transmission. This standard has been adopted in Europe, Australia and New Zealand.
Terrestrial Integrated Services Digital Broadcasting (ISDB-T) is a system designed to provide good reception to fixed receivers and also portable or mobile receivers. It utilizes OFDM and two-dimensional interleaving. It supports hierarchical transmission of up to three layers and uses MPEG-2 video and Advanced Audio Coding. This standard has been adopted in Japan and the Philippines. ISDB-T International is an adaptation of this standard using H.264/MPEG-4 AVC that has been adopted in most of South America and is also being embraced by Portuguese-speaking African countries.
Digital Terrestrial Multimedia Broadcasting (DTMB) adopts time-domain synchronous (TDS) OFDM technology with a pseudo-random signal frame to serve as the guard interval (GI) of the OFDM block and the training symbol. The DTMB standard has been adopted in the People's Republic of China, including Hong Kong and Macau.[3]
Contents
1 Technical information
1.1 Formats and bandwidth
1.2 Receiving digital signal
1.3 Protection parameters for terrestrial DTV broadcasting
1.4 Interaction
1.5 1-segment broadcasting
2 Comparison analog vs digital
2.1 Effect on existing analog technology
2.2 Disappearance of TV-audio receivers
2.3 Environmental issues
3 Technical limitations
3.1 Compression artifacts and allocated bandwidth
3.2 Effects of poor reception
4 New borders in the world
5 Timeline of transition
6 See also
7 Notes and references
8 Further reading
9 External links
Technical information
Formats and bandwidth
Digital television supports many different picture formats defined by the broadcast television systems, which are a combination of size and aspect ratio (width to height ratio).

With digital terrestrial television (DTT) broadcasting, the range of formats can be broadly divided into two categories: high definition television (HDTV) for the transmission of high-definition video and standard-definition television (SDTV). These terms by themselves are not very precise, and many subtle intermediate cases exist.

Television pictures have differing amounts of definition (rendering of fine detail) according to how many individual picture elements are provided to reconstruct the picture. This definition is expressed as the number of horizontal lines and picture elements (pixels) in each line that are used for different formats. Thus when we say a format is 640 × 480p, we mean there are 640 elements in each of 480 horizontal lines (scanned progressively) for a total of 307,200 pixels and an aspect ratio of 640:480 or 4:3 (4 units wide by 3 units high), or SDTV.
One of several different HDTV formats that can be transmitted over DTV is: 1280 × 720 pixels in progressive scan mode (abbreviated 720p) or 1920 × 1080 pixels in interlaced video mode (1080i). Each of these uses a 16:9 aspect ratio. (Some televisions are capable of receiving an HD resolution of 1920 × 1080 at a 60 Hz progressive scan frame rate, known as 1080p.) HDTV cannot be transmitted over current analog television channels because of channel capacity issues.

Standard-definition TV (SDTV), by comparison, may use one of several different formats taking the form of various aspect ratios depending on the technology used in the country of broadcast. For 4:3 aspect-ratio broadcasts, the 640 × 480 format is used in NTSC countries, while 720 × 576 is used in PAL countries. For 16:9 broadcasts, the 720 × 480 format is used in NTSC countries, while 720 × 576 is used in PAL countries. However, broadcasters may choose to reduce these resolutions to save bandwidth (e.g., many DVB-T channels in the United Kingdom use a horizontal resolution of 544 or 704 pixels per line).[4]
10 TV and allied non-linear systems

All TV standards use non-linear signals, pre-corrected for the non-linear transfer characteristic of the display CRT. It is here that the most confusion exists, and so this is a VERY important section to understand.

A typical CRT has a non-linear voltage-to-light transfer function with a power law usually denoted by gamma. The value of gamma is theoretically 2.5, but is specified as 2.2 in NTSC systems, 2.8 in PAL systems, and is actually nearer to 2.35 for real CRTs. Any signal destined for display on a CRT must be distorted by an inverse law. In practice, that is impossible because a pure power law has infinite slope (gain) at zero (black). TV systems limit the gain near black to a value between 4 and 5 by offsetting the power law. This has the side advantage of increasing saturation in a way that compensates for the display having a dark surround. For example, the ITU-BT.709 specification is:
$$\text{Volts} = \begin{cases} (1+a)\,\text{Light}^{\,law} - a & \text{if Light} > b \\ slope \cdot \text{Light} & \text{if Light} \le b \end{cases} \tag{70}$$
where a=0.099, law=0.45, b=0.018.
and the gain at zero is 4.5. This law is similar to the formula used for L* (see above).
So for accurate colour calculations, this law (or whichever law was actually applied) must
be undone to return to linear signals before doing conversions. The law should be reapplied to
the results to get the drive signals for the actual display.
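To make the round trip concrete, here is a minimal Python sketch of equation (70) and its inverse, with a = 0.099, law = 0.45, b = 0.018 and a slope of 4.5 near black. The constant and function names are illustrative, not from any standard library.

```python
import numpy as np

A, LAW, B, SLOPE = 0.099, 0.45, 0.018, 4.5

def encode_709(light):
    """Linear light (0..1) -> non-linear voltage, per equation (70)."""
    light = np.asarray(light, dtype=float)
    return np.where(light > B,
                    (1 + A) * light**LAW - A,   # offset power law
                    SLOPE * light)              # linear segment near black

def decode_709(volts):
    """Non-linear voltage -> linear light; the inverse of encode_709."""
    volts = np.asarray(volts, dtype=float)
    return np.where(volts > SLOPE * B,          # breakpoint maps to 4.5 * 0.018
                    ((volts + A) / (1 + A))**(1 / LAW),
                    volts / SLOPE)

light = np.linspace(0.0, 1.0, 5)
assert np.allclose(decode_709(encode_709(light)), light)
```

Undoing the law before linear-light calculations and reapplying it afterwards, as the text describes, is exactly a decode followed by an encode.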
A signal that has been gamma-corrected is shown primed (Y′, R′, G′, B′ etc.). In general, undoing the gamma law will return to linear signals, but that is not always true, especially with the Y′ signal, which is not directly related to the CIE Y value. It is a shame that the TV industry used Y for the luminance channel, because it created a great deal of confusion, most of which still exists. But careful reading of the following section shows the way to performing totally accurate colour calculations using any colour system.
10.1 European YUV (EBU)

European TV (PAL and SECAM coded) uses YUV components. Y is similar to perceived luminance; U and V carry the colour information and some luminance information and are bipolar (they go negative as well as positive). The symbols U and V here are not related to the U and V of CIE YUV (1960).

This coding is also used in some 525 line systems with PAL subcarriers, particularly in parts of the Americas. The specification here is that of the European Broadcasting Union (EBU). Y has a bandwidth of 5 MHz in Europe, 5.5 MHz in the UK. The U and V signals usually have up to 2.5 MHz bandwidth in a component studio system, but can be as little as 600 kHz or less in a VHS recorder. U and V always have the same bandwidth as each other. The CRT gamma law is assumed to be 2.8, but camera correction laws are the same as in all other systems (approximately 0.45). The system white point is D65; the chromaticity co-ordinates are:
R: xr=0.64 yr=0.33
G: xg=0.29 yg=0.60
B: xb=0.15 yb=0.06
White: xn=0.312713 yn=0.329016
The conversion equations for linear signals are:

$$\begin{aligned} X &= 0.431\,R + 0.342\,G + 0.178\,B \\ Y &= 0.222\,R + 0.707\,G + 0.071\,B \\ Z &= 0.020\,R + 0.130\,G + 0.939\,B \end{aligned} \tag{71}$$

$$\begin{aligned} R &= \phantom{-}3.063\,X - 1.393\,Y - 0.476\,Z \\ G &= -0.969\,X + 1.876\,Y + 0.042\,Z \\ B &= \phantom{-}0.068\,X - 0.229\,Y + 1.069\,Z \end{aligned} \tag{72}$$
The coding equations for non-linear signals are:

$$\begin{aligned} Y' &= 0.299\,R' + 0.587\,G' + 0.114\,B' \\ U' &= 0.493\,(B' - Y') = -0.147\,R' - 0.289\,G' + 0.436\,B' \\ V' &= 0.877\,(R' - Y') = \phantom{-}0.615\,R' - 0.515\,G' - 0.100\,B' \end{aligned} \tag{73}$$

$$\begin{aligned} R' &= Y' + 0.000\,U' + 1.140\,V' \\ G' &= Y' - 0.396\,U' - 0.581\,V' \\ B' &= Y' + 2.029\,U' + 0.000\,V' \end{aligned} \tag{74}$$
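As an illustration of applying these coding equations, here is a minimal numpy sketch of (73) and (74). The matrix names are illustrative, and the inverse only reproduces the input to the precision of the published rounded coefficients.

```python
import numpy as np

# Rows of (73): Y', U', V' as weighted sums of gamma-corrected R', G', B'.
RGB_TO_YUV = np.array([[ 0.299,  0.587,  0.114],
                       [-0.147, -0.289,  0.436],
                       [ 0.615, -0.515, -0.100]])

# Rows of (74): R', G', B' recovered from Y', U', V'.
YUV_TO_RGB = np.array([[1.000,  0.000,  1.140],
                       [1.000, -0.396, -0.581],
                       [1.000,  2.029,  0.000]])

rgb = np.array([0.2, 0.5, 0.8])       # some gamma-corrected R'G'B' triple
yuv = RGB_TO_YUV @ rgb
print(yuv, YUV_TO_RGB @ yuv)          # second vector ~ rgb, up to rounding
```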
The conversion equations between linear 709 RGB signals (see later) and EBU RGB signals are:

$$\begin{aligned} R_e &= 0.9578\,R_7 + 0.0422\,G_7 + 0.0000\,B_7 \\ G_e &= 0.0000\,R_7 + 1.0000\,G_7 + 0.0000\,B_7 \\ B_e &= 0.0000\,R_7 + 0.0118\,G_7 + 0.9882\,B_7 \end{aligned} \tag{75}$$

$$\begin{aligned} R_7 &= 1.0440\,R_e - 0.0440\,G_e + 0.0000\,B_e \\ G_7 &= 0.0000\,R_e + 1.0000\,G_e + 0.0000\,B_e \\ B_7 &= 0.0000\,R_e - 0.0119\,G_e + 1.0119\,B_e \end{aligned} \tag{76}$$
10.2 American YIQ

American TV (NTSC coded) uses YIQ components. Again Y is similar to perceived luminance; I and Q carry colour information and some luminance information and are derived by rotating the UV vector formed by colour coding as described in section 3.1 by 33 degrees. The Y signal usually has 4.2 MHz bandwidth in a 525 line system. Originally the I and Q signals were to have different bandwidths (0.5 and 1.5 MHz) but they now commonly have the same bandwidth (1 MHz). The coding is also used in some 625 line countries with NTSC subcarriers, again mostly in the Americas. The CRT gamma law is assumed to be 2.2. The system white point is Illuminant C; the chromaticity co-ordinates are:
R: xr=0.67 yr=0.33
G: xg=0.21 yg=0.71
B: xb=0.14 yb=0.08
White: xn=0.310063 yn=0.316158
The conversion equations for linear signals are:

$$\begin{aligned} X &= 0.607\,R + 0.174\,G + 0.200\,B \\ Y &= 0.299\,R + 0.587\,G + 0.114\,B \\ Z &= 0.000\,R + 0.066\,G + 1.116\,B \end{aligned} \tag{77}$$

$$\begin{aligned} R &= \phantom{-}1.910\,X - 0.532\,Y - 0.288\,Z \\ G &= -0.985\,X + 1.999\,Y - 0.028\,Z \\ B &= \phantom{-}0.058\,X - 0.118\,Y + 0.898\,Z \end{aligned} \tag{78}$$
The coding equations for non-linear signals are:

$$\begin{aligned} Y' &= 0.299\,R' + 0.587\,G' + 0.114\,B' \\ I' &= -0.27\,(B' - Y') + 0.74\,(R' - Y') = 0.596\,R' - 0.274\,G' - 0.322\,B' \\ Q' &= \phantom{-}0.41\,(B' - Y') + 0.48\,(R' - Y') = 0.212\,R' - 0.523\,G' + 0.311\,B' \end{aligned} \tag{79}$$

$$\begin{aligned} R' &= Y' + 0.956\,I' + 0.621\,Q' \\ G' &= Y' - 0.272\,I' - 0.647\,Q' \\ B' &= Y' - 1.105\,I' + 1.702\,Q' \end{aligned} \tag{80}$$
It is possible to define a transformation matrix between EBU YUV and NTSC YIQ. However, this only makes sense if the primaries are the same for the two systems, and clearly they are defined differently. However, over the years, the American NTSC system has changed its primaries several times until they are now very similar to those of the EBU systems. The non-linear connecting equations are:
$$I' = -\left(\tfrac{0.27}{0.493}\right)U' + \left(\tfrac{0.74}{0.877}\right)V' = -0.547667343\,U' + 0.843785633\,V' \tag{81}$$

$$Q' = \left(\tfrac{0.41}{0.493}\right)U' + \left(\tfrac{0.48}{0.877}\right)V' = \phantom{-}0.831643002\,U' + 0.547320410\,V' \tag{82}$$

and:

$$U' = -0.546512701\,I' + 0.842540416\,Q' \tag{83}$$

$$V' = \phantom{-}0.830415704\,I' + 0.546859122\,Q' \tag{84}$$
To all intents and purposes these equations are identical and so one practical set of equations
can be used in either direction:
$$\begin{aligned} I' &= -0.547\,U' + 0.843\,V' \\ Q' &= \phantom{-}0.831\,U' + 0.547\,V' \\ U' &= -0.547\,I' + 0.843\,Q' \\ V' &= \phantom{-}0.831\,I' + 0.547\,Q' \end{aligned} \tag{85}$$
The conversion equations relating NTSC RGB signals to EBU and 709 are:

$$\begin{aligned} R_{ntsc} &= \phantom{-}0.6984\,R_{ebu} + 0.2388\,G_{ebu} + 0.0319\,B_{ebu} \\ G_{ntsc} &= \phantom{-}0.0193\,R_{ebu} + 1.0727\,G_{ebu} - 0.0596\,B_{ebu} \\ B_{ntsc} &= \phantom{-}0.0169\,R_{ebu} + 0.0525\,G_{ebu} + 0.8450\,B_{ebu} \end{aligned} \tag{86}$$

$$\begin{aligned} R_{ebu} &= \phantom{-}1.4425\,R_{ntsc} - 0.3173\,G_{ntsc} - 0.0769\,B_{ntsc} \\ G_{ebu} &= -0.0275\,R_{ntsc} + 0.9350\,G_{ntsc} + 0.0670\,B_{ntsc} \\ B_{ebu} &= -0.0272\,R_{ntsc} - 0.0518\,G_{ntsc} + 1.1808\,B_{ntsc} \end{aligned} \tag{87}$$
and:
$$\begin{aligned} R_{ntsc} &= \phantom{-}0.6698\,R_{709} + 0.2678\,G_{709} + 0.0323\,B_{709} \\ G_{ntsc} &= \phantom{-}0.0185\,R_{709} + 1.0742\,G_{709} - 0.0603\,B_{709} \\ B_{ntsc} &= \phantom{-}0.0162\,R_{709} + 0.0432\,G_{709} + 0.8551\,B_{709} \end{aligned} \tag{88}$$

$$\begin{aligned} R_{709} &= \phantom{-}1.5073\,R_{ntsc} - 0.3725\,G_{ntsc} - 0.0832\,B_{ntsc} \\ G_{709} &= -0.0275\,R_{ntsc} + 0.9350\,G_{ntsc} + 0.0670\,B_{ntsc} \\ B_{709} &= -0.0272\,R_{ntsc} - 0.0401\,G_{ntsc} + 1.1677\,B_{ntsc} \end{aligned} \tag{89}$$
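Matrices such as (88) and (89) are compositions of the per-system matrices given elsewhere in this document. A minimal numpy sketch, composing the NTSC RGB-to-XYZ matrix of (77) with the XYZ-to-709-RGB matrix of (103), given later; the result reproduces (89) to within the rounding of the published values. Names are illustrative.

```python
import numpy as np

NTSC_RGB_TO_XYZ = np.array([[0.607, 0.174, 0.200],   # equation (77)
                            [0.299, 0.587, 0.114],
                            [0.000, 0.066, 1.116]])

XYZ_TO_709_RGB = np.array([[ 3.241, -1.537, -0.499],  # equation (103)
                           [-0.969,  1.876,  0.042],
                           [ 0.056, -0.204,  1.057]])

# RGB_ntsc -> XYZ -> RGB_709, i.e. the matrix of equation (89).
NTSC_TO_709 = XYZ_TO_709_RGB @ NTSC_RGB_TO_XYZ
print(np.round(NTSC_TO_709, 4))   # compare with the coefficients of (89)
```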
10.3 SMPTE-C RGB

SMPTE-C is the current colour standard for broadcasting in America; the old NTSC standard for primaries is no longer in wide use because the primaries of the system have gradually shifted towards those of the EBU (see section 6.2). In all other respects, SMPTE-C is the same as NTSC. The CRT gamma law is assumed to be 2.2. The white point is now D65, and the chromaticities are:
R: xr=0.630 yr=0.340
G: xg=0.310 yg=0.595
B: xb=0.155 yb=0.070
White: xn=0.312713 yn=0.329016
The conversion equations for linear signals are:

$$\begin{aligned} X &= 0.3935\,R + 0.3653\,G + 0.1916\,B \\ Y &= 0.2124\,R + 0.7011\,G + 0.0866\,B \\ Z &= 0.0187\,R + 0.1119\,G + 0.9582\,B \end{aligned} \tag{90}$$

$$\begin{aligned} R &= \phantom{-}3.5058\,X - 1.7397\,Y - 0.5440\,Z \\ G &= -1.0690\,X + 1.9778\,Y + 0.0352\,Z \\ B &= \phantom{-}0.0563\,X - 0.1970\,Y + 1.0501\,Z \end{aligned} \tag{91}$$
The coding equations for non-linear signals are the same as for NTSC:

$$\begin{aligned} Y' &= 0.299\,R' + 0.587\,G' + 0.114\,B' \\ I' &= -0.27\,(B' - Y') + 0.74\,(R' - Y') = 0.596\,R' - 0.274\,G' - 0.322\,B' \\ Q' &= \phantom{-}0.41\,(B' - Y') + 0.48\,(R' - Y') = 0.212\,R' - 0.523\,G' + 0.311\,B' \end{aligned} \tag{92}$$

$$\begin{aligned} R' &= Y' + 0.956\,I' + 0.621\,Q' \\ G' &= Y' - 0.272\,I' - 0.647\,Q' \\ B' &= Y' - 1.105\,I' + 1.702\,Q' \end{aligned} \tag{93}$$
and the same conversion equations work between EBU and SMPTE-C components:
$$\begin{aligned} I' &= -0.547\,U' + 0.843\,V' \\ Q' &= \phantom{-}0.831\,U' + 0.547\,V' \\ U' &= -0.547\,I' + 0.843\,Q' \\ V' &= \phantom{-}0.831\,I' + 0.547\,Q' \end{aligned} \tag{94}$$
The conversion equations relating SMPTE-C RGB signals to EBU and 709 signals are:
$$\begin{aligned} R_{smptec} &= \phantom{-}1.1123\,R_{ebu} - 0.1024\,G_{ebu} - 0.0099\,B_{ebu} \\ G_{smptec} &= -0.0205\,R_{ebu} + 1.0370\,G_{ebu} - 0.0165\,B_{ebu} \\ B_{smptec} &= \phantom{-}0.0017\,R_{ebu} + 0.0161\,G_{ebu} + 0.9822\,B_{ebu} \end{aligned} \tag{95, 96}$$

$$\begin{aligned} R_{ebu} &= \phantom{-}0.9007\,R_{smptec} + 0.0888\,G_{smptec} + 0.0105\,B_{smptec} \\ G_{ebu} &= \phantom{-}0.0178\,R_{smptec} + 0.9658\,G_{smptec} + 0.0164\,B_{smptec} \\ B_{ebu} &= -0.0019\,R_{smptec} - 0.0160\,G_{smptec} + 1.0178\,B_{smptec} \end{aligned} \tag{97}$$

and:

$$\begin{aligned} R_{smptec} &= \phantom{-}1.0654\,R_{709} - 0.0554\,G_{709} - 0.0010\,B_{709} \\ G_{smptec} &= -0.0196\,R_{709} + 1.0364\,G_{709} - 0.0167\,B_{709} \\ B_{smptec} &= \phantom{-}0.0016\,R_{709} + 0.0044\,G_{709} + 0.9940\,B_{709} \end{aligned} \tag{98}$$

$$\begin{aligned} R_{709} &= \phantom{-}0.9395\,R_{smptec} + 0.0502\,G_{smptec} + 0.0103\,B_{smptec} \\ G_{709} &= \phantom{-}0.0178\,R_{smptec} + 0.9658\,G_{smptec} + 0.0164\,B_{smptec} \\ B_{709} &= -0.0016\,R_{smptec} - 0.0044\,G_{smptec} + 1.0060\,B_{smptec} \end{aligned} \tag{99}$$
10.4 ITU.BT-601 YCbCr
This is the international standard for digital coding of TV pictures at 525 and 625 line rates.
It is independent of the scanning standard and the system primaries, therefore there are no
chromaticity co-ordinates, no CIE XYZ matrices, and no assumptions about white point or
CRT gamma. It deals only with the digital representation of RGB signals in YCbCr form.
The non-linear coding matrices are:

$$\begin{aligned} Y' &= \phantom{-}0.299\,R' + 0.587\,G' + 0.114\,B' \\ C_b &= -0.169\,R' - 0.331\,G' + 0.500\,B' \\ C_r &= \phantom{-}0.500\,R' - 0.419\,G' - 0.081\,B' \end{aligned} \tag{100}$$

$$\begin{aligned} R' &= Y' + 0.000\,C_b + 1.403\,C_r \\ G' &= Y' - 0.344\,C_b - 0.714\,C_r \\ B' &= Y' + 1.773\,C_b + 0.000\,C_r \end{aligned} \tag{101}$$
10.5 ITU.BT-709 HDTV studio production in YCbCr

This is a recent standard, defined only as an interim standard for HDTV studio production. It was defined by the CCIR (now the ITU) in 1988, but is not yet recommended for use in broadcasting. The primaries are the R and B from the EBU, and a G which is midway between SMPTE-C and EBU. The CRT gamma law is assumed to be 2.2. White is D65. The chromaticities are:
R: xr=0.64 yr=0.33
G: xg=0.30 yg=0.60
B: xb=0.15 yb=0.06
White: xn=0.312713 yn=0.329016
The conversion equations for linear signals are:

$$\begin{aligned} X &= 0.412\,R + 0.358\,G + 0.180\,B \\ Y &= 0.213\,R + 0.715\,G + 0.072\,B \\ Z &= 0.019\,R + 0.119\,G + 0.950\,B \end{aligned} \tag{102}$$

$$\begin{aligned} R &= \phantom{-}3.241\,X - 1.537\,Y - 0.499\,Z \\ G &= -0.969\,X + 1.876\,Y + 0.042\,Z \\ B &= \phantom{-}0.056\,X - 0.204\,Y + 1.057\,Z \end{aligned} \tag{103}$$
The coding equations for non-linear signals are:

$$\begin{aligned} Y' &= \phantom{-}0.2215\,R' + 0.7154\,G' + 0.0721\,B' \\ C_b &= -0.1145\,R' - 0.3855\,G' + 0.5000\,B' \\ C_r &= \phantom{-}0.5016\,R' - 0.4556\,G' - 0.0459\,B' \end{aligned} \tag{104}$$

$$\begin{aligned} R' &= Y' + 0.0000\,C_b + 1.5701\,C_r \\ G' &= Y' - 0.1870\,C_b - 0.4664\,C_r \\ B' &= Y' + 1.8556\,C_b + 0.0000\,C_r \end{aligned} \tag{105}$$
The conversion equations between linear 709 RGB signals and EBU RGB signals are:

$$\begin{aligned} R_e &= 0.9578\,R_7 + 0.0422\,G_7 + 0.0000\,B_7 \\ G_e &= 0.0000\,R_7 + 1.0000\,G_7 + 0.0000\,B_7 \\ B_e &= 0.0000\,R_7 + 0.0118\,G_7 + 0.9882\,B_7 \end{aligned} \tag{106}$$

$$\begin{aligned} R_7 &= 1.0440\,R_e - 0.0440\,G_e + 0.0000\,B_e \\ G_7 &= 0.0000\,R_e + 1.0000\,G_e + 0.0000\,B_e \\ B_7 &= 0.0000\,R_e - 0.0119\,G_e + 1.0119\,B_e \end{aligned} \tag{107}$$
EXAMPLE 8.3: Compression by quantization.

[FIGURE 8.4 (a) Original image. (b) Uniform quantization to 16 levels. (c) IGS quantization to 16 levels.]
means the mapping of a broad range of input values to a limited number of output values, as discussed in Section 2.4. As it is an irreversible operation (visual information is lost), quantization results in lossy data compression.

Consider the images in Fig. 8.4. Figure 8.4(a) shows a monochrome image with 256 possible gray levels. Figure 8.4(b) shows the same image after uniform quantization to four bits or 16 possible levels. The resulting compression ratio is 2:1. Note, as discussed in Section 2.4, that false contouring is present in the previously smooth regions of the original image. This is the natural visual effect of more coarsely representing the gray levels of the image.

Figure 8.4(c) illustrates the significant improvements possible with quantization that takes advantage of the peculiarities of the human visual system. Although the compression ratio resulting from this second quantization procedure also is 2:1, false contouring is greatly reduced at the expense of some additional, but less objectionable, graininess. The method used to produce this result is known as improved gray-scale (IGS) quantization. It recognizes the eye's inherent sensitivity to edges and breaks them up by adding to each pixel a pseudo-random number, which is generated from the low-order bits of neighboring pixels, before quantizing the result. Because the low-order bits are fairly random (see the bit planes in Section 3.2.4), this amounts to adding a level of randomness, which depends on the local characteristics of the image, to the artificial edges normally associated with false contouring.

Table 8.2 illustrates this method; a code sketch follows the table. A sum, initially set to zero, is first formed from the current 8-bit gray-level value and the four least significant bits of a previously generated sum. If the four most significant bits of the current value are 1111₂, however, 0000₂ is added instead. The four most significant bits of the resulting sum are used as the coded pixel value.
TABLE 8.2 IGS quantization procedure.

Pixel | Gray Level | Sum       | IGS Code
i-1   | N/A        | 0000 0000 | N/A
i     | 0110 1100  | 0110 1100 | 0110
i+1   | 1000 1011  | 1001 0111 | 1001
i+2   | 1000 0111  | 1000 1110 | 1000
i+3   | 1111 0100  | 1111 0100 | 1111
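A minimal Python sketch of the procedure just described, assuming 8-bit input pixels and 4-bit output codes as in the table; the names are illustrative.

```python
def igs_quantize(pixels, out_bits=4, in_bits=8):
    """Improved gray-scale (IGS) quantization of a 1-D pixel sequence."""
    shift = in_bits - out_bits            # 4, for 8-bit -> 4-bit codes
    low_mask = (1 << shift) - 1           # 0x0F: low-order bits of the sum
    top = ((1 << out_bits) - 1) << shift  # 0xF0: all-ones high-order bits
    codes, prev_sum = [], 0               # sum is initially zero
    for p in pixels:
        # Add the previous sum's low bits, unless the pixel's most
        # significant bits are all ones, in which case add zero.
        noise = 0 if (p & top) == top else (prev_sum & low_mask)
        prev_sum = p + noise
        codes.append(prev_sum >> shift)   # keep the most significant bits
    return codes

# The example of Table 8.2:
print(igs_quantize([0b01101100, 0b10001011, 0b10000111, 0b11110100]))
# -> [6, 9, 8, 15], i.e. 0110, 1001, 1000, 1111
```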
Improved gray-scale quantization is typical of a large group of quantization procedures that operate directly on the gray levels of the image to be compressed. They usually entail a decrease in the image's spatial and/or gray-scale resolution. The resulting false contouring or other related effects necessitates the use of heuristic techniques to compensate for the visual impact of quantization. The normal 2:1 line interlacing approach used in commercial broadcast television, for example, is a form of quantization in which interleaving portions of adjacent frames allows reduced video scanning rates with little decrease in perceived image quality.
8.1.4 Fidelity Criteria

As noted previously, removal of psychovisually redundant data results in a loss of real or quantitative visual information. Because information of interest may be lost, a repeatable or reproducible means of quantifying the nature and extent of information loss is highly desirable. Two general classes of criteria are used as the basis for such an assessment: (1) objective fidelity criteria and (2) subjective fidelity criteria.

When the level of information loss can be expressed as a function of the original or input image and the compressed and subsequently decompressed output image, it is said to be based on an objective fidelity criterion. A good example is the root-mean-square (rms) error between an input and output image. Let $f(x, y)$ represent an input image and let $\hat{f}(x, y)$ denote an estimate or approximation of $f(x, y)$ that results from compressing and subsequently decompressing the input. For any value of x and y, the error $e(x, y)$ between $f(x, y)$ and $\hat{f}(x, y)$ can be defined as
$$e(x, y) = \hat{f}(x, y) - f(x, y)$$

so that the total error between the two images is

$$\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[\hat{f}(x, y) - f(x, y)\right] \tag{8.1-7}$$

where the images are of size M × N. The root-mean-square error, $e_{\mathrm{rms}}$, between $f(x, y)$ and $\hat{f}(x, y)$ then is the square root of the squared error averaged over the M × N array, or

$$e_{\mathrm{rms}} = \left[\frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[\hat{f}(x, y) - f(x, y)\right]^2\right]^{1/2} \tag{8.1-8}$$
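A minimal numpy sketch of equation (8.1-8); the toy images are illustrative.

```python
import numpy as np

def rms_error(f, fhat):
    """Root-mean-square error between an image f and its approximation."""
    f = np.asarray(f, dtype=float)
    fhat = np.asarray(fhat, dtype=float)
    return np.sqrt(np.mean((fhat - f) ** 2))

f = np.array([[100, 102], [98, 101]])
fhat = np.array([[101, 100], [97, 103]])
print(rms_error(f, fhat))   # sqrt((1 + 4 + 1 + 4) / 4) ~ 1.58
```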
Vector quantization
From Wikipedia, the free encyclopedia
Vector quantization (VQ) is a classical quantization technique from signal processing which allows the
modeling of probability density functions by the distribution of prototype vectors. It was originally used
for data compression. It works by dividing a large set of points (vectors) into groups having
approximately the same number of points closest to them. Each group is represented by its centroid
point, as in k-means and some other clustering algorithms.
The density matching property of vector quantization is powerful, especially for identifying the density
of large and high-dimensioned data. Since data points are represented by the index of their closest
centroid, commonly occurring data have low error, and rare data high error. This is why VQ is suitable
for lossy data compression. It can also be used for lossy data correction and density estimation.
Vector quantization is based on the competitive learning paradigm, so it is closely related to the self-organizing map model.
Training
A simple training algorithm for vector quantization is (a code sketch follows the list):
1. Pick a sample point at random
2. Move the nearest quantization vector centroid towards this sample point, by a small fraction of the
distance
3. Repeat
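A minimal numpy sketch of that loop, under illustrative choices of data, codebook size and learning rate:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 2))          # training vectors
codebook = data[rng.choice(len(data), 8)]  # 8 initial quantization vectors
lr = 0.05                                  # "small fraction of the distance"

for _ in range(10_000):
    x = data[rng.integers(len(data))]                     # 1. random sample
    i = np.argmin(np.linalg.norm(codebook - x, axis=1))   # nearest centroid
    codebook[i] += lr * (x - codebook[i])                 # 2. move toward it
                                                          # 3. repeat
print(codebook)
```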
Contents
1 Training
2 Applications
2.1 Use in data compression
2.2 Video codecs based on vector quantization
2.3 Audio codecs based on vector quantization
2.4 Use in pattern recognition
2.5 Use as clustering algorithm
2.6 Use in data stream mining
3 See also
4 References
5 External links
A more sophisticated algorithm reduces the bias in the density matching estimation, and ensures that all
points are used, by including an extra sensitivity parameter:
1. Increase each centroid's sensitivity by a small amount
2. Pick a sample point at random
3. Find the quantization vector centroid with the smallest (distance - sensitivity)
1. Move the chosen centroid toward the sample point by a small fraction of the distance
2. Set the chosen centroid's sensitivity to zero
4. Repeat
It is desirable to use a cooling schedule to produce convergence: see Simulated annealing.
The algorithm can be iteratively updated with 'live' data, rather than by picking random points from a
data set, but this will introduce some bias if the data are temporally correlated over many samples. A
vector is represented either geometrically by an arrow whose length corresponds to its magnitude and
points in an appropriate direction, or by two or three numbers representing the magnitude of its
components.
Applications
Vector quantization is used for lossy data compression, lossy data correction, pattern recognition,
density estimation and clustering.
Lossy data correction, or prediction, is used to recover data missing from some dimensions. It is done by
finding the nearest group with the data dimensions available, then predicting the result based on the
values for the missing dimensions, assuming that they will have the same value as the group's centroid.
For density estimation, the area/volume that is closer to a particular centroid than to any other is
inversely proportional to the density (due to the density matching property of the algorithm).
Use in data compression
Vector quantization, also called "block quantization" or "pattern matching quantization", is often used in lossy data compression. It works by encoding values from a multidimensional vector space into a finite
set of values from a discrete subspace of lower dimension. A lower-space vector requires less storage
space, so the data is compressed. Due to the density matching property of vector quantization, the
compressed data has errors that are inversely proportional to density.
The transformation is usually done by projection or by using a codebook. In some cases, a codebook can
be also used to entropy code the discrete value in the same step, by generating a prefix coded variable-
length encoded value as its output.
The set of discrete amplitude levels is quantized jointly rather than each sample being quantized separately. Consider a k-dimensional vector $[x_1, x_2, \ldots, x_k]$ of amplitude levels. It is compressed by choosing the nearest matching vector from a set of n-dimensional vectors $[y_1, y_2, \ldots, y_n]$, with n < k.
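A minimal numpy sketch of codebook encoding and decoding; the codebook here is assumed to be already trained, and all values are illustrative.

```python
import numpy as np

def vq_encode(vectors, codebook):
    # Pairwise distances: each input vector against every codevector.
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(d, axis=1)            # one small integer per vector

def vq_decode(indices, codebook):
    return codebook[indices]               # nearest-codevector approximation

codebook = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])
blocks = np.array([[0.1, 0.2], [0.9, 0.8], [0.4, 0.9]])
idx = vq_encode(blocks, codebook)
print(idx, vq_decode(idx, codebook))
```

The compression comes from storing only the indices; the reconstruction error is exactly the distance from each vector to its chosen codevector.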
Geometric Scaling

Common Names: Scale, Zoom, Shrink, Pixel Replication, Pixel Interpolation, Subsampling

Brief Description

The scale operator performs a geometric transformation which can be used to shrink or zoom the size of an image (or part of an image). Image reduction, commonly known as subsampling, is performed by replacement (of a group of pixel values by one arbitrarily chosen pixel value from within this group) or by interpolating between pixel values in local neighborhoods. Image zooming is achieved by pixel replication or by interpolation. Scaling is used to change the visual appearance of an image, to alter the quantity of information stored in a scene representation, or as a low-level preprocessor in a multi-stage image processing chain which operates on features of a particular scale. Scaling is a special case of affine transformation.
How It Works

Scaling compresses or expands an image along the coordinate directions. As different techniques can be used to subsample and zoom, each is discussed in turn.

Figure 1 illustrates the two methods of sub-sampling. In the first, one pixel value within a local neighborhood is chosen (perhaps randomly) to be representative of its surroundings. (This method is computationally simple, but can lead to poor results if the sampling neighborhoods are too large.) The second method interpolates between pixel values within a neighborhood by taking a statistical sample (such as the mean) of the local intensity values.
[Figure 1: Methods of subsampling. a) Replacement with upper left pixel. b) Interpolation using the mean value.]
An image (or regions of an image) can be zoomed either through pixel replication or interpolation. Figure 2 shows how pixel replication simply replaces each original image pixel by a group of pixels with the same value (where the group size is determined by the scaling factor). Alternatively, interpolation of the values of neighboring pixels in the original image can be performed in order to replace each pixel with an expanded group of pixels. Most implementations offer the option of increasing the actual dimensions of the original image, or retaining them and simply zooming a portion of the image within the old image boundaries.
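A minimal numpy sketch of both directions, subsampling by block averaging (the mean-interpolation method of Figure 1b) and zooming by pixel replication; it assumes image dimensions divisible by the scale factor, and the names are illustrative.

```python
import numpy as np

def subsample_mean(img, f):
    """Reduce by factor f, replacing each f x f neighborhood by its mean."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def zoom_replicate(img, f):
    """Zoom by factor f, replacing each pixel by an f x f group."""
    return np.repeat(np.repeat(img, f, axis=0), f, axis=1)

img = np.arange(16.0).reshape(4, 4)
small = subsample_mean(img, 2)   # 2x2: mean of each 2x2 neighborhood
big = zoom_replicate(small, 2)   # back to 4x4 by replication
print(small, big, sep="\n")
```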
File:Spectral effects of decimation compared on 3 popular frequency scale conventions.pdf

From Wikipedia, the free encyclopedia
Summary

Description (English): Each of 3 pairs of graphs depicts the spectral distributions of an oversampled function and the same function sampled at 1/3 the original rate. The bandwidth, B, in this example is just small enough that the slower sampling does not cause overlap (aliasing). The top pair of graphs represent the discrete-time Fourier transform (DTFT) representation. The middle pair depict a normalized frequency scale, preferred by many filter design programs: the frequency, f, in Hz is divided by the sample-rate, and the periodicity and Nyquist frequency are then represented by the constants 1 and 1/2 respectively. The bottom pair depict a different normalized frequency scale, used by the Z-transform; the periodicity and Nyquist frequency are respectively represented by 2π and π.
Date 19 January 2014, 10:13:52
Source Own work
Author Bob K
Decimation (signal processing)

From Wikipedia, the free encyclopedia

In digital signal processing, decimation is the process of reducing the sampling rate of a signal.[1][2][3] Complementary to interpolation, which increases sampling rate, it is a specific case of sample rate conversion in a multi-rate digital signal processing system. Decimation utilises filtering to mitigate aliasing distortion, which can occur when simply downsampling a signal.[3] A system component that performs decimation is called a decimator.[2]
Contents

1 In general
2 By an integer factor
2.1 Anti-aliasing filter
3 By a rational factor
4 By an irrational factor
5 See also
6 Notes
7 Citations
8 References
In general

Decimation reduces the data rate or the size of the data. The decimation factor is usually an integer or a rational fraction greater than one. This factor multiplies the sampling time or, equivalently, divides the sampling rate. For example, if 16-bit compact disc audio (sampled at 44,100 Hz) is decimated to 22,050 Hz, the audio is said to be decimated by a factor of 2. The bit rate is also reduced in half, from 1,411,200 bit/s to 705,600 bit/s, assuming that each sample retains its bit depth of 16 bits.
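A minimal scipy sketch of this example; the tone and duration are illustrative. scipy.signal.decimate applies an anti-aliasing filter before discarding samples, as described in the next section.

```python
import numpy as np
from scipy import signal

fs = 44_100
t = np.arange(fs) / fs                    # one second of audio
x = np.sin(2 * np.pi * 440 * t)           # a 440 Hz tone
y = signal.decimate(x, 2)                 # anti-alias filter + downsample

print(len(x), len(y))                     # 44100 -> 22050 samples
```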
By an integer factor

Decimation by an integer factor, M, can be explained as a 2-step process, with an equivalent implementation that is more efficient:

1. Reduce high-frequency signal components with a digital lowpass filter.
2. Downsample the filtered signal by M; that is, keep only every Mth sample.

Downsampling alone causes high-frequency signal components to be misinterpreted by subsequent users of the data, which is a form of distortion called aliasing. The first step, if necessary, is to suppress aliasing to an acceptable level. In this application, the filter is called an anti-aliasing filter, and its design is discussed below. Also see undersampling for information about downsampling bandpass functions and signals.

[Fig. 1: Spectral effects of decimation compared on 3 popular frequency scale conventions]
When the anti-aliasing filter is an IIR design, it relies on feedback from output to input, prior to the downsampling step. With FIR filtering, it is an easy matter to compute only every Mth output. The calculation performed by a decimating FIR filter for the nth output sample is a dot product:

$$y[n] = \sum_{k=0}^{K-1} x[nM - k]\, h[k]$$

where the h[] sequence is the impulse response, and K is its length. x[] represents the input sequence being downsampled. In a general purpose processor, after computing y[n], the easiest way to compute y[n+1] is to advance the starting index in the x[] array by M, and recompute the dot product. In the case M = 2, h[] can be designed as a half-band filter, where almost half of the coefficients are zero and need not be included in the dot products.
Impulse response coefficients taken at intervals of M form a subsequence, and there are M such subsequences (phases) multiplexed together. The dot product is the sum of the dot products of each subsequence with the corresponding samples of the x[] sequence. Furthermore, because of downsampling by M, the stream of x[] samples involved in any one of the M dot products is never involved in the other dot products. Thus M low-order FIR filters are each filtering one of M multiplexed phases of the input stream, and the M outputs are being summed. This viewpoint offers a different implementation that might be advantageous in a multi-processor architecture. In other words, the input stream is demultiplexed and sent through a bank of M filters whose outputs are summed. When implemented that way, it is called a polyphase filter.
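A minimal numpy sketch comparing the direct dot product with the polyphase decomposition just described; the filter length, factor M and signals are illustrative.

```python
import numpy as np

M, K = 3, 12
rng = np.random.default_rng(1)
h = rng.normal(size=K)                    # impulse response, length K
x = rng.normal(size=120)                  # input sequence

# Direct form: y[n] = sum_k x[nM - k] h[k], starting where x is fully valid.
y_direct = np.array([np.dot(h, x[n * M - np.arange(K)])
                     for n in range(K // M, len(x) // M)])

# Polyphase form: subsequence p of h filters every M-th input sample,
# and the M partial results are summed.
y_poly = np.zeros_like(y_direct)
for p in range(M):
    hp = h[p::M]                          # phase-p coefficients
    for n in range(len(y_poly)):
        n0 = (n + K // M) * M - p         # newest input sample for phase p
        y_poly[n] += np.dot(hp, x[n0 - M * np.arange(len(hp))])

assert np.allclose(y_direct, y_poly)      # the two viewpoints agree
```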
For completeness, we now mention that a possible, but unlikely, implementation of each phase is to replace the coefficients of the other phases with zeros in a copy of the h[] array, process the original x[] sequence at the input rate, and decimate the output by a factor of M. The equivalence of this inefficient method and the implementation described above is known as the first Noble identity.[4]
Anti-aliasing filter

The requirements of the anti-aliasing filter can be deduced from any of the 3 pairs of graphs in Fig. 1. Note that all 3 pairs are identical, except for the units of the abscissa variables. The upper graph of each pair is an example of the periodic frequency distribution of a sampled function, x(t), with Fourier transform, X(f). The lower graph is the new distribution that results when x(t) is sampled 3 times slower, or (equivalently) when the original sample sequence is decimated by a factor of M = 3. In all 3 cases, the condition that ensures the copies of X(f) don't overlap each other is the same: $B < \tfrac{0.5}{MT}$, where T is the interval between samples, 1/T is the sample-rate, and 1/(2T) is the Nyquist frequency. The anti-aliasing filter that can ensure the condition is met has a cutoff frequency less than 1/M times the Nyquist frequency.[note 1]

The abscissa of the top pair of graphs represents the discrete-time Fourier transform (DTFT), which is a Fourier series representation of a periodic summation of X(f):

$$X_{1/T}(f) = \sum_{n=-\infty}^{\infty} x(nT)\, e^{-i 2\pi f n T} = \frac{1}{T} \sum_{k=-\infty}^{\infty} X\!\left(f - \frac{k}{T}\right) \tag{Eq.1}$$

When T has units of seconds, f has units of hertz. Replacing T with MT in the formulas above gives the DTFT of the decimated sequence, x[nM]:

$$X_{1/(MT)}(f) = \sum_{n=-\infty}^{\infty} x(nMT)\, e^{-i 2\pi f n M T} = \frac{1}{MT} \sum_{k=-\infty}^{\infty} X\!\left(f - \frac{k}{MT}\right)$$
The periodic summation has been reduced in amplitude and periodicity by a factor of M, as depicted in the second graph of Fig. 1. Aliasing occurs when adjacent copies of X(f) overlap. The purpose of the anti-aliasing filter is to ensure that the reduced periodicity does not create overlap.

In the middle pair of graphs, the frequency variable, f, has been replaced by normalized frequency, which creates a periodicity of 1 and a Nyquist frequency of 1/2. A common practice in filter design programs is to assume those values and request only the corresponding cutoff frequency in the same units. In other words, the cutoff frequency is normalized to $\tfrac{0.5}{M}$. The units of this quantity are (seconds/sample)·(cycles/second) = cycles/sample.
The bottom pair of graphs represent the Z-transforms of the original sequence and the decimated sequence, constrained to values of the complex variable, z, of the form $z = e^{i\omega}$. Then the transform of the x[n] sequence has the form of a Fourier series. By comparison with Eq.1, we deduce:

$$\sum_{n=-\infty}^{\infty} x(nT)\, e^{-i\omega n} = X_{1/T}\!\left(\frac{\omega}{2\pi T}\right)$$

which is depicted by the fifth graph in Fig. 1. Similarly, the sixth graph depicts:

$$\sum_{n=-\infty}^{\infty} x(nMT)\, e^{-i\omega n} = X_{1/(MT)}\!\left(\frac{\omega}{2\pi MT}\right)$$
By a rational factor

Let M/L denote the decimation factor, where M, L ∈ ℤ⁺ and M > L.

1. Interpolate by a factor of L
2. Decimate by a factor of M

Interpolation requires a lowpass filter after increasing the data rate, and decimation requires a lowpass filter before decimation. Therefore, both operations can be accomplished by a single filter with the lower of the two cutoff frequencies. For the M > L case, the anti-aliasing filter cutoff, $\tfrac{0.5}{M}$ cycles per intermediate sample, is the lower frequency.
By an irrational factor

Techniques for decimation (and sample-rate conversion in general) by a factor R, including irrational values of R, include polynomial interpolation and the Farrow structure.[5]
See also

Oversampling
Posterization

Notes

1. ^ Realizable low-pass filters have a "skirt", where the response diminishes from near one to near zero. So in practice the cutoff frequency is placed far enough below the theoretical cutoff that the filter's skirt is contained below the theoretical cutoff.
Citations

1. ^ Lyons, Richard (2001). Understanding Digital Signal Processing. Prentice Hall. p. 304. ISBN 0-201-63467-8. "Decreasing the sampling rate is known as decimation."
2. ^a b Antoniou, Andreas (2006). Digital Signal Processing. McGraw-Hill. p. 830. ISBN 0-07-145424-1. "Decimators can be used to reduce the sampling frequency, whereas interpolators can be used to increase it."
3. ^a b Milic, Ljiljana (2009). Multirate Filtering for Digital Signal Processing. New York: Hershey. p. 35. ISBN 978-1-60566-178-0. "Sampling rate conversion systems are used to change the sampling rate of a signal. The process of sampling rate decrease is called decimation, and the process of sampling rate increase is called interpolation."
4. ^ Strang, Gilbert; Nguyen, Truong (1996-10-01). Wavelets and Filter Banks (http://books.google.com/books?id=Z76NAb5pp8C&pg=PA101) (2 ed.). Wellesley, MA: Wellesley-Cambridge Press. pp. 100-101. ISBN 0961408871.
5. ^ Milic, Ljiljana (2009). Multirate Filtering for Digital Signal Processing. New York: Hershey. p. 192. ISBN 978-1-60566-178-0. "Generally, this approach is applicable when the ratio Fy/Fx is a rational, or an irrational number, and is suitable for the sampling rate increase and for the sampling rate decrease."
Chroma subsampling

From Wikipedia, the free encyclopedia

Chroma subsampling is the practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.[1] It is used in many video encoding schemes, both analog and digital, and also in JPEG encoding.
Contents

1 Rationale
2 How subsampling works
3 Sampling systems and ratios
4 Types of subsampling
4.1 4:4:4 Y'CbCr
4.2 4:4:4 R'G'B' (no subsampling)
4.3 4:2:2
4.4 4:2:1
4.5 4:1:1
4.6 4:2:0
4.7 4:1:0
4.8 3:1:1
5 Out-of-gamut colors
6 Terminology
7 History
8 Effectiveness
9 Compatibility issues
10 See also
11 References
Rationale

Because of storage and transmission limitations, there is frequently a desire to reduce (or compress) the signal. Since the human visual system is much more sensitive to variations in brightness than color, a video system can be optimized by devoting more bandwidth to the luma component (usually denoted Y') than to the color difference components Cb and Cr. In compressed images, for example, the 4:2:2 Y'CbCr scheme requires two-thirds the bandwidth of (4:4:4) R'G'B'. This reduction results in almost no visual difference as perceived by the viewer for photographs, although images produced digitally containing harsh lines and saturated colors will have significant artifacts.

[In full size, this image shows the difference between four subsampling schemes. Note how similar the color images appear. The lower row shows the resolution of the color information.]
How subsampling works

Because the human visual system is less sensitive to the position and motion of color than luminance,[2] bandwidth can be optimized by storing more luminance detail than color detail. At normal viewing distances, there is no perceptible loss incurred by sampling the color detail at a lower rate. In video systems, this is achieved through the use of color difference components. The signal is divided into a luma (Y') component and two color difference components (chroma).

In human vision there are two chromatic channels as well as a luminance channel, and in color science there are two chromatic dimensions as well as a luminance dimension. In neither the vision nor the science is there complete independence of the chromatic and the luminance. Luminance information can be gleaned from the chromatic information; e.g. the chromatic value implies a certain minimum for the luminance value. But there can be no question of color influencing luminance in the absence of a post-processing of the separate signals. In video, the luma and chroma components are formed as a weighted sum of gamma-corrected (tristimulus) R'G'B' components instead of linear (tristimulus) RGB components. As a result, luma must be distinguished from luminance. That there is some "bleeding" of luminance and color information between the luma and chroma components in video, the error being greatest for highly saturated colors and noticeable in between the magenta and green bars of a color bars test pattern (that has chroma subsampling applied), should not be attributed to this engineering approximation being used. Indeed similar bleeding can occur also with gamma = 1, whence the reversing of the order of operations between gamma correction and forming the weighted sum can make no difference. The chroma can influence the luma specifically at the pixels where the subsampling put no chroma. Interpolation may then put chroma values there which are incompatible with the luma value there, and further post-processing of that Y'CbCr into R'G'B' for that pixel is what ultimately produces false luminance upon display.
[Original without color subsampling. 200% zoom.]
[Image after color subsampling (compressed with Sony Vegas DV codec, box filtering applied).]
Sampling systems and ratios
The subsampling scheme is commonly expressed as a three part ratio J:a:b (e.g. 4:2:2), although sometimes expressed as four parts (e.g. 4:2:2:4), that describe the number of luminance and chrominance samples in a conceptual region that is J pixels wide, and 2 pixels high. The parts are (in their respective order):

J: horizontal sampling reference (width of the conceptual region). Usually, 4.
a: number of chrominance samples (Cr, Cb) in the first row of J pixels.
b: number of (additional) chrominance samples (Cr, Cb) in the second row of J pixels.
Alpha: horizontal factor (relative to first digit). May be omitted if alpha component is not present, and is equal to J when present.

An explanatory image of different chroma subsampling schemes can be seen at the following link: http://lea.hamradio.si/~s51kq/subsample.gif (source: "Basics of Video": http://lea.hamradio.si/~s51kq/V-BAS.HTM) or in details in Chrominance Subsampling in Digital Images, by Douglas Kerr (http://dougkerr.net/pumpkin/articles/Subsampling.pdf).
Scheme | a | b | Chroma resolution (relative to luma)
4:1:1  | 1 | 1 | 1/4 horizontal resolution, full vertical resolution
4:2:0  | 2 | 0 | 1/2 horizontal resolution, 1/2 vertical resolution
4:2:2  | 2 | 2 | 1/2 horizontal resolution, full vertical resolution
4:4:4  | 4 | 4 | full horizontal resolution, full vertical resolution
4:4:0  | 4 | 0 | full horizontal resolution, 1/2 vertical resolution

(In every scheme, J = 4 luma (Y') samples are taken in each of the two rows of the conceptual region.)
The mapping examples given are only theoretical and for illustration. Also note that the diagram does not indicate any chroma filtering, which should be applied to avoid aliasing.

To calculate the required bandwidth factor relative to 4:4:4 (or 4:4:4:4), one needs to sum all the factors and divide the result by 12 (or 16, if alpha is present).
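A minimal Python sketch of that calculation; the function name is illustrative.

```python
def bandwidth_factor(j, a, b, alpha=None):
    """Bandwidth of a J:a:b (or J:a:b:Alpha) scheme relative to 4:4:4."""
    parts = j + a + b
    if alpha is None:
        return parts / 12
    return (parts + alpha) / 16

print(bandwidth_factor(4, 2, 2))      # 4:2:2 -> 0.666..., two-thirds of 4:4:4
print(bandwidth_factor(4, 2, 0))      # 4:2:0 -> 0.5
print(bandwidth_factor(4, 2, 2, 4))   # 4:2:2:4 -> 0.75 of 4:4:4:4
```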
Types of subsampling

4:4:4 Y'CbCr

Each of the three Y'CbCr components has the same sample rate. This scheme is sometimes used in high-end film scanners and cinematic postproduction. Two SDI links (connections) are normally required to carry this bandwidth: Link A would carry a 4:2:2 signal, Link B a 0:2:2, which when combined make 4:4:4.
Common JPEG markers[16]

Short name | Bytes | Payload | Name | Comments
SOI | 0xFF, 0xD8 | none | Start Of Image |
SOF0 | 0xFF, 0xC0 | variable size | Start Of Frame (Baseline DCT) | Indicates that this is a baseline DCT-based JPEG, and specifies the width, height, number of components, and component subsampling (e.g., 4:2:0).
SOF2 | 0xFF, 0xC2 | variable size | Start Of Frame (Progressive DCT) | Indicates that this is a progressive DCT-based JPEG, and specifies the width, height, number of components, and component subsampling (e.g., 4:2:0).
DHT | 0xFF, 0xC4 | variable size | Define Huffman Table(s) | Specifies one or more Huffman tables.
DQT | 0xFF, 0xDB | variable size | Define Quantization Table(s) | Specifies one or more quantization tables.
DRI | 0xFF, 0xDD | 4 bytes | Define Restart Interval | Specifies the interval between RSTn markers, in macroblocks. This marker is followed by two bytes indicating the fixed size so it can be treated like any other variable size segment.
SOS | 0xFF, 0xDA | variable size | Start Of Scan | Begins a top-to-bottom scan of the image. In baseline DCT JPEG images, there is generally a single scan. Progressive DCT JPEG images usually contain multiple scans. This marker specifies which slice of data it will contain, and is immediately followed by entropy-coded data.
RSTn | 0xFF, 0xDn (n=0..7) | none | Restart | Inserted every r macroblocks, where r is the restart interval set by a DRI marker. Not used if there was no DRI marker. The low 3 bits of the marker code cycle in value from 0 to 7.
APPn | 0xFF, 0xEn | variable size | Application-specific | For example, an Exif JPEG file uses an APP1 marker to store metadata, laid out in a structure based closely on TIFF.
COM | 0xFF, 0xFE | variable size | Comment | Contains a text comment.
EOI | 0xFF, 0xD9 | none | End Of Image |
There are other Start Of Frame markers that introduce other kinds of JPEG encodings.

Since several vendors might use the same APPn marker type, application-specific markers often begin with a standard or vendor name (e.g., "Exif" or "Adobe") or some other identifying string.

At a restart marker, block-to-block predictor variables are reset, and the bitstream is synchronized to a byte boundary. Restart markers provide means for recovery after bitstream error, such as transmission over an unreliable network or file corruption. Since the runs of macroblocks between restart markers may be independently decoded, these runs may be decoded in parallel.
JPEG codec example

Although a JPEG file can be encoded in various ways, most commonly it is done with JFIF encoding. The encoding process consists of several steps:

1. The representation of the colors in the image is converted from RGB to Y'CbCr, consisting of one luma component (Y'), representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.
2. The resolution of the chroma data is reduced, usually by a factor of 2 or 3. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.
3. The image is split into blocks of 8×8 pixels, and for each block, each of the Y, Cb, and Cr data undergoes the Discrete Cosine Transform (DCT), which was developed in 1974 by N. Ahmed, T. Natarajan and K. R. Rao; see Citation 1 in Discrete cosine transform. A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
4. The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0-100 in the Independent JPEG Group's library[17]) affects to what extent the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.
5. The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.

The decoding process reverses these steps, except the quantization because it is irreversible. In the remainder of this section, the encoding and decoding processes are described in more detail.
Encoding
[The 8×8 sub-image shown in 8-bit grayscale]

Many of the options in the JPEG standard are not commonly used, and as mentioned above, most image software uses the simpler JFIF format when creating a JPEG file, which among other things specifies the encoding method. Here is a brief description of one of the more common methods of encoding when applied to an input that has 24 bits per pixel (eight each of red, green, and blue). This particular option is a lossy data compression method.
Color space transformation

First, the image should be converted from RGB into a different color space called Y'CbCr (or, informally, YCbCr). It has three components Y', Cb and Cr: the Y' component represents the brightness of a pixel, and the Cb and Cr components represent the chrominance (split into blue and red components). This is basically the same color space as used by digital color television as well as digital video including video DVDs, and is similar to the way color is represented in analog PAL video and MAC (but not by analog NTSC, which uses the YIQ color space). The Y'CbCr color space conversion allows greater compression without a significant effect on perceptual image quality (or greater perceptual image quality for the same compression). The compression is more efficient because the brightness information, which is more important to the eventual perceptual quality of the image, is confined to a single channel. This more closely corresponds to the perception of color in the human visual system. The color transformation also improves compression by statistical decorrelation.
A particular conversion to Y'CbCr is specified in the JFIF standard, and should be performed for the resulting JPEG file to have maximum compatibility. However, some JPEG implementations in "highest quality" mode do not apply this step and instead keep the color information in the RGB color model, where the image is stored in separate channels for red, green and blue brightness components. This results in less efficient compression, and would not likely be used when file size is especially important.
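A minimal numpy sketch of the conversion, using the constants of the common JFIF definition (BT.601 luma weights, with the chroma components offset by 128 so they fit an unsigned byte); the function name is illustrative.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """8-bit R'G'B' -> Y'CbCr with the JFIF-style constants."""
    r, g, b = [rgb[..., i].astype(float) for i in range(3)]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return np.stack([y, cb, cr], axis=-1)

pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)   # pure red
print(rgb_to_ycbcr(pixel))   # ~[76.2, 85.0, 255.5]; Cr clips to 255 in
                             # an 8-bit pipeline
```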
Downsampling

Due to the densities of color- and brightness-sensitive receptors in the human eye, humans can see considerably more fine detail in the brightness of an image (the Y' component) than in the hue and color saturation of an image (the Cb and Cr components). Using this knowledge, encoders can be designed to compress images more efficiently.

The transformation into the Y'CbCr color model enables the next usual step, which is to reduce the spatial resolution of the Cb and Cr components (called "downsampling" or "chroma subsampling"). The ratios at which the downsampling is ordinarily done for JPEG images are 4:4:4 (no downsampling), 4:2:2 (reduction by a factor of 2 in the horizontal direction), or (most commonly) 4:2:0 (reduction by a factor of 2 in both the horizontal and vertical directions). For the rest of the compression process, Y', Cb and Cr are processed separately and in a very similar manner.
Block splitting

After subsampling, each channel must be split into 8×8 blocks. Depending on chroma subsampling, this yields Minimum Coded Unit (MCU) blocks of size 8×8 (4:4:4, no subsampling), 16×8 (4:2:2), or most commonly 16×16 (4:2:0). In video compression MCUs are called macroblocks.

If the data for a channel does not represent an integer number of blocks then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data. Filling the edges with a fixed color (for example, black) can create ringing artifacts along the visible part of the border; repeating the edge pixels is a common technique that reduces (but does not necessarily completely eliminate) such artifacts, and more sophisticated border filling techniques can also be applied.
Discrete cosine transform

Next, each 8×8 block of each component (Y, Cb, Cr) is converted to a frequency-domain representation, using a normalized, two-dimensional type-II discrete cosine transform (DCT), which was introduced by N. Ahmed, T. Natarajan and K. R. Rao in 1974; see Citation 1 in Discrete cosine transform. The DCT is sometimes referred to as "type-II DCT" in the context of a family of transforms as in discrete cosine transform, and the corresponding inverse (IDCT) is denoted as "type-III DCT".

As an example, one such 8×8 8-bit subimage might be:

$$\begin{bmatrix} 52 & 55 & 61 & 66 & 70 & 61 & 64 & 73 \\ 63 & 59 & 55 & 90 & 109 & 85 & 69 & 72 \\ 62 & 59 & 68 & 113 & 144 & 104 & 66 & 73 \\ 63 & 58 & 71 & 122 & 154 & 106 & 70 & 69 \\ 67 & 61 & 68 & 104 & 126 & 88 & 68 & 70 \\ 79 & 65 & 60 & 70 & 77 & 68 & 58 & 75 \\ 85 & 71 & 64 & 59 & 55 & 61 & 65 & 83 \\ 87 & 79 & 69 & 68 & 65 & 76 & 78 & 94 \end{bmatrix}$$
BeIore computing the DCT oI the 88 block, its values are shiIted Irom a positive range to one centered around zero. For an 8-bit image, each entry in the original block Ialls
in the range . The midpoint oI the range (in this case, the value 128) is subtracted Irom each entry to produce a data range that is centered around zero, so that the
modiIied range is . This step reduces the dynamic range requirements in the DCT processing stage that Iollows. (Aside Irom the diIIerence in dynamic range
within the DCT stage, this step is mathematically equivalent to subtracting 1024 Irom the DC coeIIicient aIter perIorming the transIorm which may be a better way to
perIorm the operation on some architectures since it involves perIorming only one subtraction rather than 64 oI them.)
This step results in the following values:
The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns. The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients. The horizontal index is u and the vertical index is v.
The next step is to take the two-dimensional DCT, which is given by:

$$G_{u,v} = \frac{1}{4}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} g_{x,y} \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]$$

where

$u$ is the horizontal spatial frequency, for the integers $0 \le u < 8$.
$v$ is the vertical spatial frequency, for the integers $0 \le v < 8$.
$\alpha(u) = \begin{cases} \frac{1}{\sqrt{2}}, & \text{if } u = 0 \\ 1, & \text{otherwise} \end{cases}$ is a normalizing scale factor to make the transformation orthonormal.
$g_{x,y}$ is the pixel value at coordinates $(x, y)$.
$G_{u,v}$ is the DCT coefficient at coordinates $(u, v)$.
If we perform this transformation on our matrix above, we get the following (rounded to the nearest two digits beyond the decimal point):
Note the top-left corner entry with the rather large magnitude. This is the DC coefficient. The remaining 63 coefficients are called the AC coefficients. The advantage of the DCT is its tendency to aggregate most of the signal in one corner of the result, as may be seen above. The quantization step to follow accentuates this effect while simultaneously reducing the overall size of the DCT coefficients, resulting in a signal that is easy to compress efficiently in the entropy stage.
The DCT temporarily increases the bit-depth of the data, since the DCT coefficients of an 8-bit/component image take up to 11 or more bits (depending on fidelity of the DCT calculation) to store. This may force the codec to temporarily use 16-bit bins to hold these coefficients, doubling the size of the image representation at this point; they are typically reduced back to 8-bit values by the quantization step. The temporary increase in size at this stage is not a performance concern for most JPEG implementations, because typically only a very small part of the image is stored in full DCT form at any given time during the image encoding or decoding process.
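As an illustration, a deliberately direct (and slow) sketch of the level shift and the 2-D type-II DCT exactly as defined above; production codecs use fast, factored DCT algorithms instead:

```python
import numpy as np

def dct2d(block: np.ndarray) -> np.ndarray:
    """Level-shift an 8x8 block of 8-bit samples and apply the
    orthonormal 2-D type-II DCT, straight from the definition."""
    g = block.astype(np.float64) - 128.0        # shift [0,255] -> [-128,127]
    alpha = lambda u: 1.0 / np.sqrt(2.0) if u == 0 else 1.0
    G = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            s = sum(g[x, y]
                    * np.cos((2 * x + 1) * u * np.pi / 16)
                    * np.cos((2 * y + 1) * v * np.pi / 16)
                    for x in range(8) for y in range(8))
            G[u, v] = 0.25 * alpha(u) * alpha(v) * s
    return G
```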
Quantization
The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation. This allows one to greatly reduce the amount of information in the high-frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This rounding operation is the only lossy operation in the whole process (other than chroma subsampling) if the DCT computation is performed with sufficiently high precision. As a result, it is typically the case that many of the higher-frequency components are rounded to zero, and many of the rest become small positive or negative numbers, which take many fewer bits to represent.
The elements in the quantization matrix control the compression ratio, with larger values producing greater compression. A typical quantization matrix (for a quality of 50 as specified in the original JPEG Standard) is as follows:
Zigzag ordering of JPEG image components
The quantized DCT coefficients are computed with

$$B_{j,k} = \operatorname{round}\!\left(\frac{G_{j,k}}{Q_{j,k}}\right) \quad \text{for } j = 0, 1, \ldots, 7;\ k = 0, 1, \ldots, 7$$

where $G$ is the unquantized DCT coefficients; $Q$ is the quantization matrix above; and $B$ is the quantized DCT coefficients.
Using this quantization matrix with the DCT coefficient matrix from above results in:
For example, using −415 (the DC coefficient) and rounding to the nearest integer:

$$B_{0,0} = \operatorname{round}\!\left(\frac{-415}{16}\right) = \operatorname{round}(-25.94) = -26$$
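A short sketch of this step in code, using the standard quality-50 luminance quantization table from Annex K of the JPEG specification:

```python
import numpy as np

# Standard luminance quantization table (quality 50), JPEG Annex K.
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(G: np.ndarray, Q: np.ndarray = Q50) -> np.ndarray:
    """B[j,k] = round(G[j,k] / Q[j,k]); the strongly lossy step."""
    return np.round(G / Q).astype(np.int32)
```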
Entropy coding
Entropy coding is a special form of lossless data compression. It involves arranging the image components in a "zigzag" order employing a run-length encoding (RLE) algorithm that groups similar frequencies together, inserting length coding zeros, and then using Huffman coding on what is left.
The JPEG standard also allows, but does not require, decoders to support the use of arithmetic coding, which is mathematically superior to Huffman coding. However, this feature has rarely been used, as it was historically covered by patents requiring royalty-bearing licenses, and because it is slower to encode and decode compared to Huffman coding. Arithmetic coding typically makes files about 5–7% smaller.
The previous quantized DC coefficient is used to predict the current quantized DC coefficient. The difference between the two is encoded rather than the actual value. The encoding of the 63 quantized AC coefficients does not use such prediction differencing.
The zigzag sequence for the above quantized coefficients is shown below. (The format shown is just for ease of understanding/viewing.)

-26
-3 0
-3 -2 -6
2 -4 1 -3
1 1 5 1 2
-1 1 -1 2 0 0
0 0 0 -1 -1 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0
0 0 0 0
0 0 0
0 0
0
Baseline sequential JPEG encoding and decoding processes
If the $i$-th block is represented by $B_i$ and positions within each block are represented by $(p, q)$, where $p = 0, 1, \ldots, 7$ and $q = 0, 1, \ldots, 7$, then any coefficient in the DCT image can be represented as $B_i(p, q)$. Thus, in the above scheme, the order of encoding pixels (for the $i$-th block) is $B_i(0, 0)$, $B_i(0, 1)$, $B_i(1, 0)$, $B_i(2, 0)$, $B_i(1, 1)$, $B_i(0, 2)$, $B_i(0, 3)$, $B_i(1, 2)$ and so on.
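A short sketch that generates this zigzag order by sorting block indices along anti-diagonals, alternating direction on each diagonal:

```python
def zigzag_order(n: int = 8):
    """Return (p, q) index pairs of an n x n block in zigzag order:
    walk the anti-diagonals p + q = 0, 1, ..., alternating direction."""
    return sorted(((p, q) for p in range(n) for q in range(n)),
                  key=lambda pq: (pq[0] + pq[1],
                                  pq[0] if (pq[0] + pq[1]) % 2 else pq[1]))
```

The first few pairs it yields are (0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), matching the order listed above.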
This encoding mode is called baseline sequential encoding. Baseline JPEG also supports progressive encoding. While sequential encoding encodes the coefficients of a single block at a time (in a zigzag manner), progressive encoding encodes similar-positioned coefficients of all blocks in one go, followed by the next-positioned coefficients of all blocks, and so on. So, if the image is divided into N 8×8 blocks $B_0, B_1, B_2, \ldots, B_{N-1}$, then progressive encoding encodes $B_i(0, 0)$ for all blocks, i.e., for all $i = 0, 1, 2, \ldots, N-1$. This is followed by encoding the $B_i(0, 1)$ coefficient of all blocks, followed by the $B_i(1, 0)$ coefficient of all blocks, then the $B_i(2, 0)$ coefficient of all blocks, and so on.
Once all similar-positioned coefficients have been encoded, the next position to be encoded is the one occurring next in the zigzag traversal as indicated in the figure above. It has been found that baseline progressive JPEG encoding usually gives better compression than baseline sequential JPEG, due to the ability to use different Huffman tables (see below) tailored for different frequencies on each "scan" or "pass" (which includes similar-positioned coefficients), though the difference is not too large.
In the rest of the article, it is assumed that the coefficient pattern generated is due to sequential mode.
In order to encode the above generated coefficient pattern, JPEG uses Huffman encoding. The JPEG standard provides general-purpose Huffman tables; encoders may also choose to generate Huffman tables optimized for the actual frequency distributions in the images being encoded.
The process of encoding the zigzag quantized data begins with a run-length encoding explained below, where:

x is the non-zero, quantized AC coefficient.
RUNLENGTH is the number of zeroes that came before this non-zero AC coefficient.
SIZE is the number of bits required to represent x.
AMPLITUDE is the bit-representation of x.
The run-length encoding works by examining each non-zero AC coefficient x and determining how many zeroes came before it. With this information, two symbols are created:

Symbol 1: (RUNLENGTH, SIZE)
Symbol 2: (AMPLITUDE)

Both RUNLENGTH and SIZE rest on the same byte, meaning that each only contains 4 bits of information. The higher bits deal with the number of zeroes, while the lower bits denote the number of bits necessary to encode the value of x.
This has the immediate implication that Symbol 1 can only store information about the first 15 zeroes preceding the non-zero AC coefficient. However, JPEG defines two special Huffman code words. One is for ending the sequence prematurely when the remaining coefficients are zero (called "End-of-Block" or "EOB"), and the other is for when the run of zeroes goes beyond 15 before reaching a non-zero AC coefficient. In such a case where 16 zeroes are encountered before a given non-zero AC coefficient, Symbol 1 is encoded "specially" as: (15, 0)(0).
The overall process continues until "EOB", denoted by (0, 0), is reached.
With this in mind, the sequence from earlier becomes:

(0, 2)(-3); (1, 2)(-3); (0, 2)(-2); (0, 3)(-6); (0, 2)(2); (0, 3)(-4); (0, 1)(1); (0, 2)(-3); (0, 1)(1);
(0, 1)(1); (0, 3)(5); (0, 1)(1); (0, 2)(2); (0, 1)(-1); (0, 1)(1); (0, 1)(-1); (0, 2)(2); (5, 1)(-1); (0, 1)(-1); (0, 0).
(The first value in the matrix, -26, is the DC coefficient; it is not encoded the same way. See above.)
From here, frequency calculations are made based on occurrences of the coefficients. In our example block, most of the quantized coefficients are small numbers that are not preceded immediately by a zero coefficient. These more-frequent cases will be represented by shorter code words.
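A sketch of the symbol generation described above; each triple below pairs Symbol 1 (RUNLENGTH, SIZE) with Symbol 2 (AMPLITUDE):

```python
def rle_ac_symbols(ac):
    """Turn 63 zigzag-ordered, quantized AC coefficients into
    (RUNLENGTH, SIZE, AMPLITUDE) triples, with ZRL (15, 0) for runs
    of 16 zeroes and EOB (0, 0) once only zeroes remain."""
    symbols, run = [], 0
    for coeff in ac:
        if coeff == 0:
            run += 1
            continue
        while run > 15:                  # more than 15 zeroes: emit ZRL
            symbols.append((15, 0, 0))
            run -= 16
        size = abs(coeff).bit_length()   # bits needed for the amplitude
        symbols.append((run, size, coeff))
        run = 0
    symbols.append((0, 0, 0))            # EOB (simplified: real encoders
    return symbols                        # omit it if the last AC != 0)
```

Applied to the 63 AC coefficients of the example block, this reproduces the symbol sequence listed above, ending with the (0, 0) EOB.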
Compression ratio and artifacts
The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase. Ten-to-one compression usually results in an image that cannot be distinguished by eye from the original. One-hundred-to-one compression is usually possible, but will look distinctly artifacted compared to the original. The appropriate level of compression depends on the use to which the image will be put.
Those who use the World Wide Web may be familiar with the irregularities known as compression artifacts that appear in JPEG images, which may take the form of noise around contrasting edges (especially curves and corners), or "blocky" images. These are due to the quantization step of the JPEG algorithm. They are especially noticeable around sharp corners between contrasting colors (text is a good example, as it contains many such corners). The analogous artifacts in MPEG video are referred to as mosquito noise, as the resulting "edge busyness" and spurious dots, which change over time, resemble mosquitoes swarming around the object.[18][19]
These artifacts can be reduced by choosing a lower level of compression; they may be eliminated by saving an image using a lossless file format, though for photographic images this will usually result in a larger file size. Images created with ray-tracing programs have noticeable blocky shapes on the terrain. Certain low-intensity compression artifacts might be acceptable when simply viewing the images, but can be emphasized if the image is subsequently processed, usually resulting in unacceptable quality. Consider the example below, demonstrating the effect of lossy compression on an edge-detection processing step.

This image shows the pixels that differ between a non-compressed image and the same image JPEG-compressed with a quality setting of 50. Darker means a larger difference. Note especially the changes occurring near sharp edges and having a block-like shape.

The compressed 8×8 squares are visible in the scaled-up picture, together with other visual artifacts of the lossy compression.

External images: Illustration of edge busyness (http://i.cmpnet.com/videsignline/2006/02/algolith-fig2.jpg).[18]
Image | Lossless compression | Lossy compression
Original | [lossless image] | [lossy image]
Processed by Canny edge detector | [edges of lossless image] | [edges of lossy image]
Some programs allow the user to vary the amount by which individual blocks are compressed. Stronger compression is applied to areas of the image that show fewer artifacts. This way it is possible to manually reduce JPEG file size with less loss of quality.
JPEG artifacts, like pixelation, are occasionally intentionally exploited for artistic purposes, as in Jpegs, by German photographer Thomas Ruff.[20][21]
Since the quantization stage always results in a loss of information, the JPEG standard is always a lossy compression codec. (Information is lost both in quantizing and in rounding the floating-point numbers.) Even if the quantization matrix is a matrix of ones, information will still be lost in the rounding step.
Decoding
Decoding to display the image consists of doing all the above in reverse.
Taking the DCT coefficient matrix (after adding the difference of the DC coefficient back in) and taking the entry-for-entry product with the quantization matrix from above results in a matrix which closely resembles the original DCT coefficient matrix for the top-left portion.
The next step is to take the two-dimensional inverse DCT (a 2-D type-III DCT), which is given by:

$$f_{x,y} = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} \alpha(u)\,\alpha(v)\, F_{u,v} \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]$$

where

$x$ is the pixel row, for the integers $0 \le x < 8$.
$y$ is the pixel column, for the integers $0 \le y < 8$.
$\alpha(u)$ is defined as above, for the integers $0 \le u < 8$.
$F_{u,v}$ is the reconstructed approximate coefficient at coordinates $(u, v)$.
$f_{x,y}$ is the reconstructed pixel value at coordinates $(x, y)$.
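A sketch of the corresponding decoder steps (dequantization, the inverse DCT above, and undoing the level shift); it reuses the Q50 table and conventions from the earlier sketches and omits DC-difference decoding:

```python
import numpy as np

def idct2d(F: np.ndarray) -> np.ndarray:
    """2-D type-III DCT, the inverse of the forward transform above."""
    alpha = lambda u: 1.0 / np.sqrt(2.0) if u == 0 else 1.0
    f = np.zeros((8, 8))
    for x in range(8):
        for y in range(8):
            f[x, y] = 0.25 * sum(alpha(u) * alpha(v) * F[u, v]
                                 * np.cos((2 * x + 1) * u * np.pi / 16)
                                 * np.cos((2 * y + 1) * v * np.pi / 16)
                                 for u in range(8) for v in range(8))
    return f

def decode_block(B: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Dequantize, inverse-transform, undo the 128 level shift, clip."""
    pixels = np.round(idct2d(B * Q)) + 128
    return np.clip(pixels, 0, 255).astype(np.uint8)
```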
Notice the slight differences between the original (top) and decompressed image (bottom), most readily seen in the bottom-left corner.
Rounding the output to integer values (since the original had integer values) results in an image with values still shifted down by 128; adding 128 to each entry gives the decompressed subimage. In general, the decompression process may produce values outside of the original input range of [0, 255]. If this occurs, the decoder needs to clip the output values to keep them within that range, to prevent overflow when storing the decompressed image with the original bit depth.
The decompressed subimage can be compared to the original subimage (also see the images to the right) by taking the difference (original minus decompressed), which results in error values with an average absolute error of about 5 per pixel.
The error is most noticeable in the bottom-left corner, where the bottom-left pixel becomes darker than the pixel to its immediate right.
Required precision
The encoding description in the JPEG standard does not fix the precision needed for the output compressed image. However, the JPEG standard (and the similar MPEG standards) includes some precision requirements for the decoding, including all parts of the decoding process (variable length decoding, inverse DCT, dequantization, renormalization of outputs); the output from the reference algorithm must not exceed:

a maximum of 1 bit of difference for each pixel component
low mean square error over each 8×8-pixel block
very low mean error over each 8×8-pixel block
very low mean square error over the whole image
extremely low mean error over the whole image

These assertions are tested on a large set of randomized input images, to handle the worst cases. The former IEEE 1180–1990 standard contained some similar precision requirements. The precision requirement has consequences for the implementation of decoders, and it is critical because some encoding processes (notably used for encoding sequences of images like MPEG) need to be able to construct, on the encoder side, a reference decoded image. In order to support 8-bit precision per pixel component output, dequantization and inverse DCT transforms are typically implemented with at least 14-bit precision in optimized decoders.
Effects of JPEG compression
JPEG compression artifacts blend well into photographs with detailed non-uniform textures, allowing higher compression ratios. Notice how a higher compression ratio first affects the high-frequency textures in the upper-left corner of the image, and how the contrasting lines become more fuzzy. The very high compression ratio severely affects the quality of the image, although the overall colors and image form are still recognizable. However, the precision of colors suffers less (for a human eye) than the precision of contours (based on luminance). This is why images should first be transformed into a color model separating the luminance from the chromatic information, before subsampling the chromatic planes (which may also use lower-quality quantization) in order to preserve the precision of the luminance plane with more information bits.
Sample photographs
For information, the uncompressed 24-bit RGB bitmap image below (73,242 pixels) would require 219,726 bytes (excluding all other information headers). The file sizes indicated below include the internal JPEG information headers and some metadata. For highest-quality images (Q=100), about 8.25 bits per color pixel are required. On grayscale images, a minimum of 6.5 bits per pixel is enough (comparable Q=100 color information requires about 25% more encoded bits). The highest-quality image below (Q=100) is encoded at nine bits per color pixel; the medium-quality image (Q=25) uses one bit per color pixel. For most applications, the quality factor should not go below 0.75 bit per pixel (Q=12.5), as demonstrated by the low-quality image. The image at lowest quality uses only 0.13 bit per pixel and displays very poor color. This is useful when the image will be displayed in a significantly scaled-down size.
Image quality | Size (bytes) | Compression ratio | Comment
Highest quality (Q=100) | 83,261 | 2.6:1 | Extremely minor artifacts
High quality (Q=50) | 15,138 | 15:1 | Initial signs of subimage artifacts
Medium quality (Q=25) | 9,553 | 23:1 | Stronger artifacts; loss of high-frequency information
Low quality (Q=10) | 4,787 | 46:1 | Severe high-frequency loss; artifacts on subimage boundaries ("macroblocking") are obvious
The product of a Boolean function and a Walsh matrix is its Walsh spectrum:[1]
(1, 0, 1, 0, 0, 1, 1, 0) × H(8) = (4, 2, 0, -2, 0, 2, 0, 2)
The fast Walsh–Hadamard transform is a faster way to calculate the Walsh spectrum of (1, 0, 1, 0, 0, 1, 1, 0).
The original function can be expressed by means of its Walsh spectrum as an arithmetical polynomial.
Hadamard transform
From Wikipedia, the free encyclopedia
The Hadamard transform (also known as the Walsh–Hadamard transform, Hadamard–Rademacher–Walsh transform, Walsh transform, or Walsh–Fourier transform) is an example of a generalized class of Fourier transforms. It performs an orthogonal, symmetric, involutive, linear operation on $2^m$ real numbers (or complex numbers, although the Hadamard matrices themselves are purely real).
The Hadamard transform can be regarded as being built out of size-2 discrete Fourier transforms (DFTs), and is in fact equivalent to a multidimensional DFT of size $2 \times 2 \times \cdots \times 2 \times 2$.[2] It decomposes an arbitrary input vector into a superposition of Walsh functions.
The transform is named for the French mathematician Jacques Hadamard, the German-American mathematician Hans Rademacher, and the American mathematician Joseph L. Walsh.
Contents
1 Definition
2 Quantum computing applications
2.1 Hadamard gate operations
3 Computational complexity
4 Other applications
5 See also
6 External links
7 References
Definition
The Hadamard transform $H_m$ is a $2^m \times 2^m$ matrix, the Hadamard matrix (scaled by a normalization factor), that transforms $2^m$ real numbers $x_n$ into $2^m$ real numbers $X_k$. The Hadamard transform can be defined in two ways: recursively, or by using the binary (base-2) representation of the indices $n$ and $k$.
Recursively, we define the $1 \times 1$ Hadamard transform $H_0$ by the identity $H_0 = 1$, and then define $H_m$ for $m > 0$ by:

$$H_m = \frac{1}{\sqrt{2}} \begin{pmatrix} H_{m-1} & H_{m-1} \\ H_{m-1} & -H_{m-1} \end{pmatrix}$$

where the $1/\sqrt{2}$ is a normalization that is sometimes omitted. Thus, other than this normalization factor, the Hadamard matrices are made up entirely of 1 and −1.
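A direct sketch of this recursive construction (keeping the $1/\sqrt{2}$ normalization, so the resulting matrix is orthogonal):

```python
import numpy as np

def hadamard(m: int) -> np.ndarray:
    """Normalized 2^m x 2^m Hadamard matrix, built recursively."""
    if m == 0:
        return np.array([[1.0]])
    H = hadamard(m - 1)
    return np.block([[H, H], [H, -H]]) / np.sqrt(2.0)
```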
Equivalently, we can define the Hadamard matrix by its $(k, n)$-th entry by writing

$$k = k_{m-1} 2^{m-1} + k_{m-2} 2^{m-2} + \cdots + k_1 2 + k_0$$

and

$$n = n_{m-1} 2^{m-1} + n_{m-2} 2^{m-2} + \cdots + n_1 2 + n_0$$

where the $k_j$ and $n_j$ are the binary digits (0 or 1) of $k$ and $n$, respectively. Note that for the element in the top left corner, we define $k = n = 0$. In this case, we have:

$$(H_m)_{k,n} = \frac{1}{2^{m/2}} (-1)^{\sum_j k_j n_j}$$

This is exactly the multidimensional DFT, normalized to be unitary, if the inputs and outputs are regarded as multidimensional arrays indexed by the $n_j$ and $k_j$, respectively.
Some examples of the Hadamard matrices follow.
$$H_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

$$H_2 = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$$

(This $H_1$ is precisely the size-2 DFT. It can also be regarded as the Fourier transform on the two-element additive group of $\mathbb{Z}/(2)$.)
Equivalently,

$$(H_m)_{i,j} = \frac{1}{2^{m/2}} (-1)^{i \cdot j}$$

where $i \cdot j$ is the bitwise dot product of the binary representations of the numbers $i$ and $j$. For example, if $m \ge 2$, then $(H_m)_{3,2} = (-1)^{3 \cdot 2} = (-1)^{(1,1) \cdot (1,0)} = (-1)^{1+0} = -1$, agreeing with the above (ignoring the overall constant). Note that the first row, first column entry of the matrix is denoted by $(H_m)_{0,0}$.
The rows of the Hadamard matrices are the Walsh functions.
Quantum computing applications
In quantum information processing the Hadamard transformation, more often called the Hadamard gate in this context (cf. quantum gate), is a one-qubit rotation, mapping the qubit-basis states $|0\rangle$ and $|1\rangle$ to two superposition states with equal weight of the computational basis states $|0\rangle$ and $|1\rangle$. Usually the phases are chosen so that we have

$$|0\rangle \mapsto \frac{|0\rangle + |1\rangle}{\sqrt{2}}, \qquad |1\rangle \mapsto \frac{|0\rangle - |1\rangle}{\sqrt{2}}$$

in Dirac notation. This corresponds to the transformation matrix

$$H_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

in the $|0\rangle, |1\rangle$ basis.
Many quantum algorithms use the Hadamard transform as an initial step, since it maps $n$ qubits initialized with $|0\rangle$ to a superposition of all $2^n$ orthogonal states in the $|0\rangle, |1\rangle$ basis with equal weight.
Hadamard gate operations
$$H|0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)$$
$$H|1\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)$$
$$H\!\left(\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)\right) = |0\rangle$$
$$H\!\left(\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)\right) = |1\rangle$$

One application of the Hadamard gate to either a 0 or 1 qubit will produce a quantum state that, if observed, will be a 0 or 1 with equal probability (as seen in the first two operations). This is exactly like flipping a fair coin in the standard probabilistic model of computation. However, if the Hadamard gate is applied twice in succession (as is effectively being done in the last two operations), then the final state is always the same as the initial state. This would be like taking a fair coin that is showing heads, flipping it twice, and it always landing on heads after the second flip.
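A quick numerical check of the coin-flip analogy, using the 2×2 matrix above:

```python
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)  # Hadamard gate

ket0 = np.array([1.0, 0.0])   # |0>
plus = H @ ket0               # (|0> + |1>)/sqrt(2): the "fair coin" state
print(plus**2)                # measurement probabilities: [0.5 0.5]
print(H @ H @ ket0)           # applying H twice returns |0>: [1. 0.]
```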
Computational complexity
The Hadamard transform can be computed in $n \log n$ operations ($n = 2^m$), using the fast Hadamard transform algorithm.
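A sketch of the in-place fast Walsh–Hadamard transform; it performs the unnormalized butterflies, so divide by $2^{m/2}$ at the end for the normalized convention:

```python
import numpy as np

def fwht(a) -> np.ndarray:
    """Fast Walsh-Hadamard transform of a length-2**m sequence, in
    O(n log n) additions instead of the O(n**2) matrix product."""
    a = np.asarray(a, dtype=np.float64).copy()
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):        # size-2 butterfly
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a
```

For example, fwht([1, 0, 1, 0, 0, 1, 1, 0]) returns (4, 2, 0, -2, 0, 2, 0, 2), the Walsh spectrum shown in the caption near the top of this article.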
Other applications
The Hadamard transform is also used in data encryption, as well as many signal processing and data compression algorithms, such as JPEG XR and MPEG-4 AVC. In video compression applications, it is usually used in the form of the sum of absolute transformed differences. It is also a crucial part of Grover's algorithm and Shor's algorithm in quantum computing.
See also
Fast Walsh–Hadamard transform
Pseudo-Hadamard transform
Haar transform
Generalized Distributive Law
External links
Ritter, Terry (August 1996). "Walsh–Hadamard Transforms: A Literature Survey" (http://www.ciphersbyritter.com/RES/WALHAD.HTM).