MMSP 2011 6093806


Angular Intra Prediction in High Efficiency Video Coding (HEVC)
Jani Lainema, Kemal Ugur
Nokia Research Center
Visiokatu 1, Tampere Finland
jani.lainema@nokia.com
kemal.ugur@nokia.com

Abstract—New video coding solutions, such as the HEVC (High Efficiency Video Coding) standard being developed by JCT-VC (Joint Collaborative Team on Video Coding), are typically designed for high resolution video content. Increasing video resolution creates two basic requirements for practical video codecs: they need to provide compression efficiency superior to prior video coding solutions, and their computational requirements need to be aligned with the foreseeable hardware platforms. This paper proposes an intra prediction method which is designed to provide high compression efficiency and which can be implemented effectively in resource constrained environments, making it applicable to a wide range of use cases. When designing the method, special attention was given to the algorithmic definition of the prediction sample generation, in order to be able to utilize the same reconstruction process at different block sizes. The proposed method outperforms earlier variations of the same family of technologies significantly and consistently across different classes of video material, and has recently been adopted as the directional intra prediction method for the draft HEVC standard. Experimental results show that the proposed method outperforms the H.264/AVC intra prediction approach on average by 4.8 %. For sequences with dominant directional structures, the coding efficiency gains become more significant and exceed 10 %.

I. INTRODUCTION

The two prominent international video coding standardization organizations, namely ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), formed the Joint Collaborative Team on Video Coding (JCT-VC) in April 2010. Since then, JCT-VC has been working towards the definition of a next generation video coding standard called High Efficiency Video Coding (HEVC). The major goal of the HEVC standard is to achieve significant improvements in coding efficiency compared to H.264/AVC [1], especially when operating on high resolution video content. The complexity of the HEVC standard is also carefully considered in the development process in order to enable high resolution, high quality video applications in resource constrained devices, such as tablets and mobile phones.

This paper introduces a new intra prediction method developed for high efficiency image and video coding. The algorithm is based on well-known directional intra prediction methods widely utilized in existing video codecs, such as H.264/AVC. The enhancements provided by the proposed approach increase compression efficiency and visual quality of the reconstructed pictures, and make it possible to implement the scheme with very small code size, memory footprint and computational requirements. The increased coding efficiency is mainly attributed to the increased number of prediction directions, which allows reconstruction of different directional structures with high accuracy. The set of available prediction directions is selected so that the angle between the directions is roughly constant, providing consistent support for structures with different angularity. The set of available prediction directions also depends on the prediction block size, reflecting the different rate-distortion (RD) behaviour and characteristics of blocks of different sizes. The method introduced in this paper combines desired aspects proposed by Nokia et al. [2][3], Samsung et al. [4][5] and DOCOMO [6] to JCT-VC, and it has been adopted as the directional intra prediction method for the draft HEVC standard [7].

This paper is organized as follows: Section II presents the intra prediction in the H.264/AVC video coding standard. Section III introduces the details of the proposed method and discusses the improvements over previous solutions. Section IV presents detailed experimental results and Section V concludes the paper.

II. INTRA PREDICTION IN H.264/AVC

The H.264/AVC video coding standard utilizes directional intra prediction to exploit the spatial redundancy present in video and still pictures in order to improve the coding efficiency of the codec. This is done by extrapolating the predicted sample values from reconstructed samples directly above and to the left of the block to be processed. In order to be able to represent structures with various directional properties, H.264/AVC defines up to nine different prediction modes for a given block. The maximal set of modes includes eight directional predictions and a mode predicting the block with the average (DC) value of the reference pixels. Fig. 1 illustrates the intra prediction methods available in H.264/AVC for blocks of size 4x4 pixels. As seen in Fig. 1, different directionalities are supported so that video encoders can choose the mode that provides the best RD performance. For example, if the image block that is coded exhibits strong vertical structures, such as vertical stripes, prediction mode 0 (vertical mode) would most likely give better compression capability than the other modes.

It should be noted that the prediction directions shown in Fig. 1 are also available for 8x8 blocks, while blocks of size 16x16 pixels are predicted using only four available modes [1].
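As a rough illustration of the extrapolation idea, the following Python sketch implements two of the simplest 4x4 modes, vertical (mode 0) and DC, for the case where both the row above and the column to the left are available. The function names are hypothetical and chosen only for this example, and the availability handling and the remaining directional modes of H.264/AVC are left out.

```python
def predict_vertical_4x4(above):
    """Vertical mode: every row of the block repeats the four samples above it."""
    return [list(above[:4]) for _ in range(4)]


def predict_dc_4x4(above, left):
    """DC mode: the whole block is filled with the rounded mean of the
    four samples above and the four samples to the left (both assumed available)."""
    dc = (sum(above[:4]) + sum(left[:4]) + 4) >> 3  # +4 rounds to nearest before dividing by 8
    return [[dc] * 4 for _ in range(4)]


if __name__ == "__main__":
    above = [10, 20, 30, 40]   # reconstructed samples directly above the block
    left = [10, 10, 10, 10]    # reconstructed samples directly to the left
    for row in predict_vertical_4x4(above):
        print(row)
    print(predict_dc_4x4(above, left)[0])
```

For a block of vertical stripes the vertical mode reproduces the stripes exactly, while the DC mode flattens them to a single value, which is why an encoder would be expected to prefer mode 0 in that case.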

Fig. 1. Intra prediction modes in H.264/AVC

Fig. 2. Available prediction directions in the proposed method when the number of directions is set to 33. The projection displacement parameter d can have values from -32 to +32, corresponding to different angularities as shown in the picture.

III. ANGULAR INTRA PREDICTION IN HEVC

It was observed that the nine intra prediction modes supported in H.264/AVC are not flexible enough to represent complex structures or image segments with different directionalities. To mitigate this, the proposed method extends the set of directional prediction modes of H.264/AVC, providing increased flexibility and more accurate predictions for the sample values. The increased prediction accuracy provides significant reductions in the residual energy of the intra coded blocks and improvements in coding efficiency.

The proposed method provides an efficient way to support a large number of prediction directions with low computational requirements in a codec with variable block size. The basic idea is to define the prediction process by a displacement measure between one line of pixels and a reference line of pixels at a given granularity (either in horizontal or vertical direction). When deriving predicted values for each sample in the block, the selected displacement is used to calculate the projection of each sample onto the line of reference samples, and an interpolation operation is applied to calculate the final predicted sample values utilizing the distance of the projected pixel locations from the closest reference pixels. Fig. 2 illustrates the prediction directions available when utilizing 33 directions.

The sample prediction process is described below in detail. Each predicted sample $P_{x,y}$ is obtained by projecting its location onto the reference row of pixels along the selected prediction direction and interpolating a value for the sample at 1/32 pixel accuracy, utilizing linear interpolation between the two closest reference samples as shown in Equation 1 below.

$$P_{x,y} = \big( (32 - w_y) \cdot R_i + w_y \cdot R_{i+1} + 16 \big) \gg 5 \qquad (1)$$

where $R_i$ is the $i$th reference sample on the reference row, $R_{i+1}$ is the consecutive reference sample and $w_y$ is the weighting between the two reference samples corresponding to the projected sub-pixel location in between $R_i$ and $R_{i+1}$. Reference sample index $i$ and weighting parameter $w_y$ are calculated based on the projection displacement $d$ associated with the selected prediction direction (describing the tangent of the prediction direction in units of 1/32 sample and having a value from -32 to +32 as shown in Fig. 2) as follows:

$$\begin{aligned} c_y &= (y \cdot d) \gg 5 \\ w_y &= (y \cdot d) \mathbin{\&} 31 \\ i &= x + c_y \end{aligned} \qquad (2)$$

In the above equations, $\gg$ denotes a bit shift to the right and $\&$ denotes the bitwise AND operation. It should be noted that parameters $c_y$ and $w_y$ both depend only on the coordinate $y$ and the selected prediction displacement $d$. Thus, both parameters remain constant when calculating predictions for one line of samples within the prediction block, as illustrated in Figure 3. This gives the sample prediction process very low computational requirements, since only Equation 1 needs to be evaluated to derive the predicted value for a specific sample. When the projection points to integer samples (i.e., when $w_y$ equals zero), the process is even simpler and consists of only copying integer reference samples from the reference row.
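Equations 1 and 2 can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions rather than the reference implementation: it covers only vertical prediction with a non-negative displacement d (so every projection stays on the row above the block), the function name and the list-based reference row layout are hypothetical, and index 0 of the reference row is taken to be the corner sample so that block coordinates run from 1 to N.

```python
def predict_vertical_angular(ref_row, n, d):
    """Predict an n x n block from the reference row above it.

    ref_row[0] is the corner sample above-left of the block and
    ref_row[1..2n] are the samples above and above-right (assumed layout).
    d is the projection displacement in 1/32 sample units, limited here to 0..32.
    """
    if not 0 <= d <= 32:
        raise ValueError("this sketch only handles displacements 0..32")
    if len(ref_row) < 2 * n + 1:
        raise ValueError("reference row must hold at least 2n + 1 samples")

    block = []
    for y in range(1, n + 1):
        c_y = (y * d) >> 5   # integer part of the projection (Equation 2)
        w_y = (y * d) & 31   # fractional part in 1/32 units (Equation 2)
        row = []
        for x in range(1, n + 1):
            i = x + c_y
            if w_y == 0:
                # Projection hits an integer position: plain copy.
                row.append(ref_row[i])
            else:
                # Linear interpolation with rounding (Equation 1).
                row.append(((32 - w_y) * ref_row[i] + w_y * ref_row[i + 1] + 16) >> 5)
        block.append(row)
    return block


if __name__ == "__main__":
    ref = [0, 10, 20, 30, 40, 50, 60, 70, 80]  # corner sample plus 2n samples for n = 4
    for row in predict_vertical_angular(ref, 4, 16):
        print(row)
```

Note that c_y and w_y are computed once per row and reused for every x, which is the property that keeps the per-sample cost down to one evaluation of Equation 1. Negative displacements would index to the left of the corner sample and therefore need an extended reference row, which this sketch does not attempt.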
Figure 3. An example of angular prediction when operating on the sixth row of an 8x8 block with vertical prediction that utilizes a positive displacement value. Triangles indicate the reference pixels and circles indicate the projected fractional pixels at 1/32 pixel accuracy.

The origin of the coordinate system for the equations above is set to the top-left reference sample immediately above and to the left of the top-left corner of the block to be processed. Equations 1 and 2 define how the predicted sample values are obtained in the case of vertical prediction, when the reference row above the block is used to derive the prediction. Prediction from the left reference column is derived identically by swapping the x and y coordinates in Equations 1 and 2, and using $R_i$ as indexes to the left reference column instead of the top reference row.

In some cases the projected pixel locations would fall outside of the selected reference row (in the case of vertical prediction) or column (in the case of horizontal prediction). In these cases the reference row or column is extended by projecting the left reference column to extend the top reference row towards the left, or projecting the top reference row to extend the left reference column upwards, in the case of vertical and horizontal predictions, respectively. This approach was found to have a negligible effect on compression performance, while reducing complexity compared to the alternative approach of always projecting pixels to the closest reference row or column [12].

It should be noted that increasing the number of directions improves the prediction accuracy. However, an increased number of directions also means that indicating the intra prediction information requires a larger number of bits. In order to optimize the RD performance of the codec, extensive simulations have been performed to find the optimum number of prediction directions [8]. Table I shows the number of intra prediction directions that can be signalled for each prediction block size in the proposed method. The reason for the reduced number of directions for 4x4 blocks is that the improvement in prediction quality provided by the extra directions is not able to offset the associated bitrate overhead. For blocks of size 64x64 pixels, it was found that these are usually selected only in extremely smooth areas, and the availability of additional prediction directions is not justified in the RD sense.

In addition to the directional modes described above, the proposed method also utilizes DC prediction, which has been defined similarly to the DC prediction in H.264/AVC.

TABLE I
NUMBER OF INTRA PREDICTION DIRECTIONS IN PROPOSED METHOD

Prediction Block Size    Number of directions
4x4                      16
8x8                      33
16x16                    33
32x32                    33
64x64                    2

As the number of available intra prediction directions in the proposed method is relatively large, an encoder algorithm based on full RD optimization is not practical in most use cases. In our experiments we have used the HM 3.0 encoder algorithm [9], which computes the sum of absolute differences (SAD) for all the available prediction alternatives and selects only a subset of directions for full RD evaluation when determining the prediction mode for the block.

IV. EXPERIMENTAL RESULTS

In order to compare the coding performance of the proposed method to that of the H.264/AVC intra prediction, the eight H.264/AVC directional prediction modes were integrated into the HM 3.0 software for block sizes from 4x4 to 32x32. The same encoding algorithm, block sizes and codec configuration were used for both the proposed method and the H.264/AVC approach, the only difference being the definition and number of prediction directions in the case of the H.264/AVC method. The simulations were run following the JCT-VC common test conditions and software reference configurations [10], which are characterized as follows:

1. Two sets of simulations were performed, one for the high efficiency and one for the low complexity configuration of the codec. The main differences between the two are the entropy coders used, the presence of the adaptive loop filter (ALF) and the processing bit-depth. The high efficiency configuration in HM 3.0 uses context adaptive binary arithmetic coding (CABAC) and ALF, and processes video assuming 10 bit sample fidelity, whereas the low complexity configuration uses context adaptive variable length coding (CAVLC), utilizes no ALF and processes 8 bit video.
2. 20 test sequences were used for both configurations. The sequences have varying characteristics and their resolutions range from 416 x 240 to 2560 x 1600 pixels.
3. All the frames of the sequences were coded as intra pictures.
4. The coding efficiency difference between the two approaches was evaluated using the well-known Bjontegaard-Delta measure [11].

The experimental results for each sequence are shown in Table II. In addition, Fig. 4 illustrates the performance of the proposed method for one of the test sequences. By examining the experimental results, the following conclusions can be reached:
1. The proposed method improves coding efficiency on average by 4.8 % for a wide range of typical video content. The gains become larger for sequences that exhibit strong directional structures and reach 10.5 %, demonstrating the benefit of the additional flexibility provided by the proposed method.
2. The gains observed in the high efficiency tests are slightly smaller than those in the low complexity tests, which can be explained by two factors. The ALF in the high efficiency HM 3.0 configuration can remove some errors due to inaccuracies in prediction and compensate for some drawbacks of the lower accuracy intra prediction. Similarly, the CABAC entropy coding can code the remaining residual error more efficiently than CAVLC, offsetting some inefficiency.
3. The proposed method provides coding efficiency gains for all the sequences tested, and the gains also appear consistent across different video resolutions and content categories.

Fig. 4. Rate-distortion performance of the proposed method for the sequence Basketball Drill in low complexity configuration

In addition to the objective coding efficiency gains, the proposed method also provides significant improvements in subjective quality, especially for image segments with strong directional structures. This can be explained as follows. If the image segment exhibits a strong directional component and intra prediction is not accurate enough to compensate for it, the residual signal also exhibits a strong directional structure. When a residual signal with such characteristics is transform coded, the directional components in the residual signal cause annoying ringing artifacts. However, by improving the accuracy of the prediction, these directional structures in the residual signal are compensated more efficiently and the ringing effects are reduced significantly. The improvement in visual quality is demonstrated in Fig. 5 for the Basketball Drill sequence. As seen in Fig. 5, use of the H.264/AVC prediction directions results in ringing artifacts around the directional edges, whereas significantly less ringing is observed in the reconstructed image with the proposed method.

TABLE II
EXPERIMENTAL RESULTS OF THE PROPOSED ALGORITHM

Values are Bjontegaard-Delta differences in percent relative to the H.264/AVC approach; negative values are gains. HE: high efficiency configuration. LC: low complexity configuration.

Resolution    Sequence           HE       LC
2560x1600     Traffic            -4.1     -5.2
              PeopleOnStreet     -4.5     -5.5
              Nebuta             -1.6     -1.5
              SteamLocomotive    -1.5     -2.0
1080p         Kimono             -2.4     -2.7
              ParkScene          -1.0     -1.6
              Cactus             -5.9     -6.9
              BasketballDrive    -7.6     -9.0
              BQTerrace          -5.9     -7.3
832x480       BasketballDrill    -8.6     -9.6
              BQMall             -4.0     -5.6
              PartyScene         -1.4     -1.8
              RaceHorses         -3.3     -3.6
416x240       BasketballPass     -4.5     -5.5
              BQSquare           -3.1     -3.4
              BlowingBubbles     -2.3     -2.8
              RaceHorses         -3.6     -4.0
720p          Vidyo1             -8.7     -10.5
              Vidyo3             -5.6     -7.5
              Vidyo4             -5.8     -7.2

Min/Max       Smallest gain      -1.0     -1.5
              Largest gain       -8.7     -10.5

Averages      4Kx2K (2560x1600)  -2.9     -3.6
              1080p              -4.6     -5.5
              832x480            -4.3     -5.1
              416x240            -3.4     -3.9
              720p               -6.7     -8.4
              Average (all)      -4.3     -5.2
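The class and overall averages in Table II can be cross-checked from the per-sequence values with a short Python sketch. The dictionary below simply copies the (HE, LC) pairs from the table; the names are hypothetical and the script is only an arithmetic check, not part of the experimental setup.

```python
from statistics import mean

# (HE, LC) Bjontegaard-Delta values in percent, copied from Table II.
TABLE_II = {
    "4Kx2K": [(-4.1, -5.2), (-4.5, -5.5), (-1.6, -1.5), (-1.5, -2.0)],
    "1080p": [(-2.4, -2.7), (-1.0, -1.6), (-5.9, -6.9), (-7.6, -9.0), (-5.9, -7.3)],
    "832x480": [(-8.6, -9.6), (-4.0, -5.6), (-1.4, -1.8), (-3.3, -3.6)],
    "416x240": [(-4.5, -5.5), (-3.1, -3.4), (-2.3, -2.8), (-3.6, -4.0)],
    "720p": [(-8.7, -10.5), (-5.6, -7.5), (-5.8, -7.2)],
}


def class_averages(table):
    """Mean HE and LC value for each resolution class."""
    return {
        name: (mean([he for he, _ in rows]), mean([lc for _, lc in rows]))
        for name, rows in table.items()
    }


def overall_average(table):
    """Mean HE and LC value over all sequences, each sequence weighted equally."""
    rows = [pair for class_rows in table.values() for pair in class_rows]
    return mean([he for he, _ in rows]), mean([lc for _, lc in rows])


if __name__ == "__main__":
    for name, (he, lc) in class_averages(TABLE_II).items():
        print(f"{name:8s} HE {he:+.2f}  LC {lc:+.2f}")
    he_all, lc_all = overall_average(TABLE_II)
    print(f"all      HE {he_all:+.2f}  LC {lc_all:+.2f}")
```

The recomputed means agree with the reported averages to within rounding. Two of the class values (the LC averages for the 832x480 and 4Kx2K classes) fall exactly halfway between two one-decimal values, so their last digit depends on the rounding convention used.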

Fig. 5. Example of using angular prediction (right) to improve reconstruction of directional components in reconstructed video (reference on the left using the H.264/AVC prediction directions at 9.5 % higher bitrate)
V. CONCLUSION

The flexible angular intra prediction method introduced in this paper provides significant improvements in both objective and subjective quality of compressed video and still pictures. While demonstrating compression efficiency superior to previous solutions, computational requirements and related implementation aspects have been taken into account by designing the algorithm with the characteristics of various implementation environments and use cases in mind. The proposed method has passed stringent evaluation by the JCT-VC community and has been adopted into the draft HEVC standard. The JCT-VC community is in the process of evaluating further improvements to the algorithm, including the use of rectangular (non-square) prediction blocks and enhanced entropy coding possibilities for the prediction modes.

ACKNOWLEDGMENTS

The authors wish to acknowledge the experts in the JCT-VC community for a number of inspiring discussions and the fruitful collaborative spirit.

REFERENCES

[1] "Advanced video coding for generic audiovisual services," ITU-T Recommendation H.264, Mar. 2010.
[2] K. Ugur, et al., "High performance, low complexity video coding and the emerging HEVC standard," IEEE Transactions on CSVT, vol. 20, no. 12, pp. 1688-1697, Dec. 2010.
[3] K. Ugur, K. R. Andersson, A. Fuldseth, "Video coding technology proposal by Tandberg, Nokia, and Ericsson," JCTVC-A119, Dresden, Germany, 15-23 Apr. 2010.
[4] W.-J. Han, et al., "Improved video compression efficiency through flexible unit representation and corresponding extension of coding tools," IEEE Transactions on CSVT, vol. 20, no. 12, pp. 1709-1720, Dec. 2010.
[5] K. McCann, et al., "Video coding technology proposal by Samsung (and BBC)," JCTVC-A124, Dresden, Germany, 15-23 Apr. 2010.
[6] F. Bossen, T. K. Tan, J. Takiue, "Simplified angular intra prediction," JCTVC-B093, Geneva, Switzerland, Jul. 2010.
[7] Joint Collaborative Team on Video Coding, "WD3: Working Draft 3 of High-Efficiency Video Coding," JCTVC-E603, Geneva, Switzerland, Mar. 2011.
[8] K. Sugimoto, "CE10: Summary of CE10 on number of intra prediction directions," JCTVC-D100, Daegu, Korea, Jan. 2011.
[9] Joint Collaborative Team on Video Coding (Apr. 2011), HEVC reference software "HM 3.0" [online] https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-3.0
[10] F. Bossen, "Common test conditions and software reference configurations," JCTVC-E700, Geneva, Switzerland, Mar. 2011.
[11] G. Bjøntegaard, "Calculation of average PSNR differences between RD-Curves," ITU-T SG16 Q.6 Document, VCEG-M33, Austin, Apr. 2001.
[12] T. K. Tan, M. Budagavi, J. Lainema, "Summary Report for TE5 on Simplification of Unified Intra Prediction," JCTVC-C046, Guangzhou, China, Oct. 2010.
