A New Low-Complexity Integer Distortion Estimation Method For H.264/AVC Encoder
Abstract—In this paper, a new low-complexity distortion estimation method for the H.264 rate-distortion optimized mode decision is proposed. Coding processes, such as discrete cosine transform (DCT), quantization, inverse quantization, inverse DCT, and reconstruction, are needed to compute the distortion in an H.264 encoder. To reduce these computations, we estimate the distortion using coefficients calculated in the quantization process and eliminate the inverse quantization, inverse DCT, and reconstruction processes. In the proposed method, the distortion is computed by integer operations, which is more efficient for the hardware implementation. The simulation results show that the proposed method can reduce the encoding time by about 10% with negligible degradation in the coding performance.

Index Terms—Distortion, H.264/advanced video coding, inverse discrete cosine transform, inverse quantization, rate-distortion optimization mode decision.

Manuscript received August 3, 2008; revised December 5, 2008 and March 30, 2009. First version published September 1, 2009; current version published February 5, 2010. This work was supported by the 2nd-phase Brain Korea 21 Program, funded by the Ministry of Education, Korea. This paper was recommended by Associate Editor Z. He.
The authors are with the Department of Electronics and Electrical Engineering, Pusan National University, Busan 609-735, South Korea (e-mail: moonjmee@pusan.ac.kr; jhkim@pusan.ac.kr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2009.2031389
1051-8215/$26.00 © 2010 IEEE

I. Introduction

H.264/advanced video coding (AVC) is the latest video coding standard, reducing bit-rate costs by about 50% compared to the MPEG-4 simple profile [1], [2]. H.264/AVC employs the rate-distortion optimization (RDO) technique to obtain the best coding performance [3], [4]. However, the computational complexity of RDO reaches one-third of that of the total encoding process when the fast motion estimation (FME) is used [2], [5]. To reduce the computational complexity of the RDO mode decision, a number of methods have been proposed [6]–[13]. In [6]–[8], fast mode decision methods achieved by skipping unnecessary modes were developed. To reduce the complexity of calculating the rate-distortion cost, the bit-rate was estimated using quantized discrete cosine transform (DCT) coefficients in [9]–[11]. In general, the distortion estimation method is more complex than bit-rate estimation, so a number of distortion estimations in the transform domain were proposed [11]–[13]. In [12], the distortion is estimated by statistical modeling of the quantization error. In this algorithm, the quantization (Q), inverse quantization (IQ), inverse discrete cosine transform (IDCT), and reconstruction processes are not necessary. In [11] and [13], the distortion was estimated nonstatistically and the estimation accuracy of these methods was better than that of [12]. Tu [11] removed the IDCT and reconstruction process and Po [13] used an iterative table look-up quantization and inverse quantization during the distortion estimation. These distortion estimation methods [11]–[13] use noninteger operations, which are difficult to implement in hardware.

In this paper, we propose a new low-complexity distortion estimation method which uses only integer operations. This is one of the nonstatistical methods. A new quantization step-size is proposed to increase the estimation accuracy, and the distortion is estimated by integer operations with coefficients which are obtained in the Q process. Therefore, the IQ, IDCT, and reconstruction processes are not needed in the distortion computation. The proposed method can also be combined with bit-rate estimation algorithms [9]–[11] and mode skipping methods [6]–[8] to further reduce the encoding complexity.

II. Background

A. Rate-Distortion Optimized Mode Decision in H.264

In H.264, there are various modes whose block sizes are in the range from 16 × 16 to 4 × 4. In a low-complexity RDO (LC RDO), the mode which has the minimum residual between the predicted block and the block to be encoded is selected [3], [4], [14]. To improve the encoding performance, the bit-rate and distortion of the residual block can be considered in a high-complexity RDO (HC RDO) [3], [4], [14]. The number of candidate modes of HC RDO can be reduced by a fast high-complexity RDO (FHC RDO) [3], [14]. In HC RDO and FHC RDO, the bit-rate is generated after the entropy coding of the quantized transform coefficients and the header information, and the distortion is computed as the sum of the squared differences (SSD) of the original and reconstructed blocks. The reconstruction distortion $D_{Rec}$ for 4 × 4 blocks is defined as follows [3], [4]:

$$D_{Rec} = \sum_{x=0}^{3}\sum_{y=0}^{3}\left(S(x,y) - \hat{S}(x,y)\right)^{2}. \tag{1}$$

In this equation, $S(x,y)$ and $\hat{S}(x,y)$ denote the xth and yth pixels of the original and reconstructed blocks, respectively. The computation process of the reconstruction distortion is shown in Fig. 1, in which $S$, $P$, and $\hat{S}$ are the 4 × 4 original, predicted, and reconstructed blocks, respectively. As shown in Fig. 1, DCT, Q, IQ, IDCT, and the reconstruction of the residual block $U$ are needed to obtain $D_{Rec}$.
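As a point of reference for the reconstruction distortion of (1), the SSD over a 4 × 4 block can be sketched in a few lines of Python. This is only an illustration: the function name `reconstruction_distortion` and the pixel values are hypothetical and are not taken from the paper or from the JVT reference software.

```python
def reconstruction_distortion(original, reconstructed):
    """Sum of squared differences (SSD) over a 4x4 block, as in (1)."""
    return sum(
        (original[x][y] - reconstructed[x][y]) ** 2
        for x in range(4)
        for y in range(4)
    )


# Hypothetical 4x4 original block (illustrative values only).
S = [[52, 55, 61, 66],
     [70, 61, 64, 73],
     [63, 59, 55, 90],
     [67, 61, 68, 104]]

# Hypothetical reconstruction: column 0 is off by +1 and column 2 by -2.
S_hat = [[row[0] + 1, row[1], row[2] - 2, row[3]] for row in S]

# Each row contributes 1**2 + 2**2 = 5, so four rows give 20.
print(reconstruction_distortion(S, S_hat))
```

The cost of this sum itself is small; the expensive part in an encoder is producing the reconstructed block, which requires the full DCT, Q, IQ, IDCT, and reconstruction chain for every candidate mode.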
208 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 20, NO. 2, FEBRUARY 2010
MOON AND KIM: A NEW LOW-COMPLEXITY INTEGER DISTORTION ESTIMATION METHOD FOR H.264/AVC ENCODER 209
TABLE I
(a) MF(r, k, l) and (b) mf(r, k, l)

(a) MF(r, k, l)
(k, l)         r = 0      r = 1      r = 2      r = 3      r = 4      r = 5
(k, l) ∈ G0    13107      11916      10082      9362       8192       7282
(k, l) ∈ G1    5243       4660       4194       3647       3355       2893
(k, l) ∈ G2    8066       7490       6554       5825       5243       4559

(b) mf(r, k, l)
(k, l)         r = 0      r = 1      r = 2      r = 3      r = 4      r = 5
(k, l) ∈ G0    13107.2    11915.64   10082.46   9362.29    8192       7281.79
(k, l) ∈ G1    5242.88    4766.25    4032.99    3744.91    3276.8     2912.71
(k, l) ∈ G2    8289.72    7536.11    6376.71    5921.23    5181.08    4605.4
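The gap between the two halves of Table I can be checked numerically. The short Python sketch below copies the printed values and reports, for each group, the largest relative difference between MF(r, k, l) and mf(r, k, l). The helper name `max_relative_gap` is hypothetical and introduced only for this illustration.

```python
# Table I(a): integer multiplication factors MF(r, k, l) for r = 0..5.
MF = {
    "G0": [13107, 11916, 10082, 9362, 8192, 7282],
    "G1": [5243, 4660, 4194, 3647, 3355, 2893],
    "G2": [8066, 7490, 6554, 5825, 5243, 4559],
}

# Table I(b): mf(r, k, l) for r = 0..5.
mf = {
    "G0": [13107.2, 11915.64, 10082.46, 9362.29, 8192, 7281.79],
    "G1": [5242.88, 4766.25, 4032.99, 3744.91, 3276.8, 2912.71],
    "G2": [8289.72, 7536.11, 6376.71, 5921.23, 5181.08, 4605.4],
}


def max_relative_gap(group):
    """Largest |MF / mf - 1| over r = 0..5 for one coefficient group."""
    return max(abs(i / f - 1.0) for i, f in zip(MF[group], mf[group]))


for group in ("G0", "G1", "G2"):
    print(group, f"{100 * max_relative_gap(group):.3f}%")
```

For G0 the two tables agree to within ordinary rounding, while for G1 and G2 the gap reaches a few percent, which is too large to be a rounding effect.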
The inverse quantized DCT coefficient $\hat{V}(k,l)$ in this equation can be calculated by $V_q(k,l)\cdot\Delta(QP)$ without the IQ of H.264. Before discussing the distortion, we denote the right terms of (7) as $mf(r,k,l)$, and list the values of $MF(r,k,l)$ and $mf(r,k,l)$ in Table I. From this table, we observe that $MF(r,k,l)$ and $mf(r,k,l)$ differ for $(k,l)\in G1$ and $(k,l)\in G2$. This means that a modification of $\Delta(QP)$ is needed to compensate for these differences before obtaining $\hat{V}(k,l)$. Therefore, we rewrite (7) and (8) as

$$MF(r,k,l) = \frac{E_f(k,l)}{\Delta(QP)}\cdot 2^{q+15}\cdot\alpha(k,l) = \frac{E_f(k,l)}{\Delta(QP)/\alpha(k,l)}\cdot 2^{q+15} \tag{13}$$

$$RF(r,k,l) = \Delta(QP)\cdot\frac{1}{2^{q}}\cdot E_i(k,l)\cdot 2^{6}\cdot\beta(k,l) = \frac{\Delta(QP)}{\alpha(k,l)}\cdot\frac{1}{2^{q}}\cdot E_i(k,l)\cdot 2^{6} \tag{14}$$

where $\beta(k,l) = 1/\alpha(k,l)$. From (13) and (14), we define a new quantization step size $\Delta_{PM}(QP,k,l)$ as follows:

$$\Delta_{PM}(QP,k,l) = \frac{\Delta(QP)}{\alpha(k,l)} = \frac{\delta(r)}{\alpha(k,l)}\cdot 2^{q} = \delta_{PM}(r,k,l)\cdot 2^{q}. \tag{15}$$

Note that the step size $\Delta(QP)$, originally a function of QP only, is thereby redesigned as $\Delta_{PM}(QP,k,l)$, a function of QP, k, and l. The $\delta_{PM}(r,k,l)$ of (15) can be calculated using (13) as

$$\delta_{PM}(r,k,l) = \frac{E_f(k,l)\cdot 2^{15}}{MF(r,k,l)} = \begin{cases} \dfrac{a^{2}\cdot 2^{15}}{MF(r,k,l)}, & \text{for } (k,l)\in G0 \\[2mm] \dfrac{(b^{2}/4)\cdot 2^{15}}{MF(r,k,l)}, & \text{for } (k,l)\in G1 \\[2mm] \dfrac{(ab/2)\cdot 2^{15}}{MF(r,k,l)}, & \text{for } (k,l)\in G2. \end{cases} \tag{16}$$

The distortion of (12) can be represented using the proposed step-size $\Delta_{PM}(QP,k,l)$ as

$$D_{TD} = \sum_{k=0}^{3}\sum_{l=0}^{3}\left(Z(k,l)\cdot E_f(k,l) - V_q(k,l)\cdot\Delta_{PM}(QP,k,l)\right)^{2}. \tag{17}$$

In this equation, each coefficient $Z(k,l)$ must be multiplied by the factor $E_f(k,l)$ and each $V_q(k,l)$ by $\Delta_{PM}(QP,k,l)$. To obtain a simpler equation, we expand (17) using (13) and (15) as follows:

$$D_{TD} = \sum_{k=0}^{3}\sum_{l=0}^{3}\left[\left(Z(k,l)\cdot MF(r,k,l)\cdot\frac{1}{2^{q+15}} - V_q(k,l)\right)\cdot\Delta_{PM}(QP,k,l)\right]^{2} \tag{18}$$

and rewrite this using (10) and (15) as

$$D_{TD} = \sum_{k=0}^{3}\sum_{l=0}^{3}\left[\left(Z_q(k,l)\cdot\frac{1}{2^{q+15}} - V_q(k,l)\right)\cdot\delta_{PM}(r,k,l)\cdot 2^{q}\right]^{2} = \sum_{k=0}^{3}\sum_{l=0}^{3} W(k,l)^{2}\cdot\delta_{PM}(r,k,l)^{2}, \tag{19}$$

where

$$W(k,l) = Z_q(k,l)\cdot\frac{1}{2^{15}} - V_q(k,l)\cdot 2^{q}.$$

In this equation, $W(k,l)$ represents the distortion caused by the q-bit right shift in the quantization of (10). Scaling it by $\delta_{PM}(r,k,l)$, the step-size for $r = 0,\ldots,5$, then makes $D_{TD}$ represent the quantization distortion for all QPs. This equation has two advantages. First, the factor $\delta_{PM}(r,k,l)^{2}$ can be multiplied once for each of the groups G0, G1, and G2 after the sum of squared $W(k,l)$ is calculated, so the number of multiplications can be reduced. Second, $W(k,l)$ can be implemented using binary shift operations with integer coefficients calculated by quantization. The proposed integer distortion is implemented by integer approximations of (19) as follows:

$$\begin{aligned} D_{PM} &= \sum_{k=0}^{3}\sum_{l=0}^{3} |W_I(k,l)|^{2}\cdot Q_s(r,k,l) \gg 8, \\ |W_I(k,l)| &= \{(|Z_q(k,l)| + 16384) \gg 15\} - (|V_q(k,l)| \ll q), \\ Q_s(r,k,l) &= \operatorname{round}\left(\delta_{PM}(r,k,l)\cdot 16\right)^{2} \end{aligned} \tag{20}$$

where round(·) indicates the rounding operation. $Q_s(r,k,l)$, which has a maximum of 9 bits, is listed in Table II. The maximum number of bits for q is 8, so $|W_I(k,l)|$ has a maximum of 8 bits. Therefore, the proposed $D_{PM}$ can be computed by 16-bit integer operations. Fig. 3 shows the
TABLE II
Squared Integer Basic Quantization Step-Size Qs(r, k, l)

(k, l)         r = 0    r = 1    r = 2    r = 3    r = 4    r = 5
(k, l) ∈ G0    100      121      169      196      256      324
(k, l) ∈ G1    100      127      156      207      244      329
(k, l) ∈ G2    106      122      160      203      250      331
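The following Python sketch shows one way to read (16) and (20) together with Tables I and II. It is an illustration under several assumptions, not the authors' implementation:

- The constants a = 1/2 and b = sqrt(2/5) and the grouping (G0: both indices even, G1: both odd, G2: mixed) are the usual H.264 4 × 4 transform conventions; they are not defined in this part of the text.
- The rounding in Qs is applied after squaring, since Table II contains values such as 127 that are not perfect squares.
- The shift by 8 is applied to the whole sum.
- All function and variable names are hypothetical.

```python
import math

# Table I(a): MF(r, k, l) for r = 0..5.
MF = {
    "G0": [13107, 11916, 10082, 9362, 8192, 7282],
    "G1": [5243, 4660, 4194, 3647, 3355, 2893],
    "G2": [8066, 7490, 6554, 5825, 5243, 4559],
}

# Table II: Qs(r, k, l) as printed.
QS = {
    "G0": [100, 121, 169, 196, 256, 324],
    "G1": [100, 127, 156, 207, 244, 329],
    "G2": [106, 122, 160, 203, 250, 331],
}

# Assumption: standard H.264 scaling constants a = 1/2, b = sqrt(2/5).
A, B = 0.5, math.sqrt(2.0 / 5.0)
EF = {"G0": A * A, "G1": B * B / 4.0, "G2": A * B / 2.0}


def qs_from_mf(group, r):
    """Qs from (16) and (20), rounding after squaring (assumed reading)."""
    delta_pm = EF[group] * 2 ** 15 / MF[group][r]
    return round((delta_pm * 16) ** 2)


def group_of(k, l):
    """Assumed grouping: G0 both indices even, G1 both odd, G2 mixed."""
    if k % 2 == 0 and l % 2 == 0:
        return "G0"
    if k % 2 == 1 and l % 2 == 1:
        return "G1"
    return "G2"


def integer_distortion(zq, vq, q, r):
    """D_PM of (20) for one 4x4 block, using integer operations only."""
    total = 0
    for k in range(4):
        for l in range(4):
            w = ((abs(zq[k][l]) + 16384) >> 15) - (abs(vq[k][l]) << q)
            total += w * w * QS[group_of(k, l)][r]
    return total >> 8  # assumed: the shift applies to the whole sum


# Entries where the formula and the printed Table II disagree.
mismatches = [(g, r) for g in MF for r in range(6) if qs_from_mf(g, r) != QS[g][r]]
print(mismatches)

# Hypothetical block with a single DC coefficient: Zq = 100 * 13107, Vq = 2.
zq = [[0] * 4 for _ in range(4)]
vq = [[0] * 4 for _ in range(4)]
zq[0][0], vq[0][0] = 100 * 13107, 2
print(integer_distortion(zq, vq, q=4, r=0))  # W = 40 - 32 = 8, so (64 * 100) >> 8 = 25
```

Under these assumptions the formula reproduces 17 of the 18 entries of Table II; for G1 with r = 5 it gives 328 where the table prints 329, so the authors may have used a slightly different rounding or precision for that entry. In the single-coefficient example, the integer result of 25 equals the value of (17) evaluated directly: Z · Ef = 100 · 0.25 = 25 and Vq · Δ_PM = 2 · 0.625 · 2^4 = 20, giving a squared difference of 25.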
TABLE III
Comparison of the Rate-Distortion Performance Between the H.264 Encoder and the Proposed Method
TABLE IV
Comparison of Tu’s, Po’s, and the Proposed Method
with 16-bit integer operations using coefficients obtained by quantization, so the IQ of H.264 is not used.

V. Conclusion

In this paper, we have proposed a low-complexity distortion estimation method which relies on integer operations. The reconstruction distortion in H.264 is calculated with the original and reconstructed blocks. To reduce the number of computations, we estimated the distortion in the DCT domain with 16-bit integer operations, and removed the need for the inverse quantization, inverse DCT, and reconstruction processes. The simulation results show that the differences between the reconstruction distortion and the proposed distortion are small enough to be disregarded for all QPs. For quantization parameters 24–36, the proposed method can reduce the encoding time by about 10% with negligible degradation in the coding performance. To further reduce the computational complexity, the proposed distortion estimation method can be combined with other fast mode decision methods.

References

[1] Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 ISO/IEC 14496-10 AVC),
document JVT-G050.doc, Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Mar. 2003.
[2] J. Ostermann, J. Bormans, P. List, D. Marpe, M. Narroschke, F. Pereira, T. Stockhammer, and T. Wiegand, “Video coding with H.264/AVC: Tools, performance, and complexity,” IEEE Circuits Syst. Mag., vol. 4, no. 1, pp. 7–28, Jan.–Mar. 2004.
[3] Text Description of Joint Model Reference Encoding Methods and Decoding Concealment Methods, document JVT-K049.doc, Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Mar. 2004.
[4] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, “Rate-constrained coder control and comparison of video coding standards,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 688–703, Jul. 2003.
[5] I. E. G. Richardson, “Transform and quantisation” and “Computational performance,” in H.264 and MPEG-4 Video Compression. New York: Wiley, 2003, ch. 6.4.8, pp. 187–198, ch. 7.4.4, pp. 254–255.
[6] F. Pan, X. Lin, S. Rahardja, K. P. Lim, Z. G. Li, D. Wu, and S. Wu, “Fast mode decision algorithm for intraprediction in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 813–822, Jul. 2005.
[7] L. Yang, K. Yu, J. Li, and S. Li, “An effective variable block-size early termination algorithm for H.264 video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 6, pp. 784–788, Jun. 2005.
[8] H. Wang, S. Kwong, and C. W. Kok, “An efficient mode decision algorithm for H.264/AVC encoding optimization,” IEEE Trans. Multimedia, vol. 9, no. 4, pp. 882–888, Jun. 2007.
[9] Q. Chen and Y. He, “A fast bits estimation method for rate-distortion optimization in H.264/AVC,” in Proc. Picture Coding Symp., paper no. 35. San Francisco, CA, Dec. 2004, pp. 133–134.
[10] M. G. Sarwer and L. M. Po, “Fast bit rate estimation for mode decision of H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 10, pp. 1402–1407, Oct. 2007.
[11] Y. K. Tu, J. F. Yang, and M. T. Sun, “Efficient rate-distortion estimation for H.264/AVC coders,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 5, pp. 600–611, May 2006.
[12] J. M. Moon, Y. H. Moon, and J. H. Kim, “A computation reduction method for RDO mode decision based on an approximation of the distortion,” in Proc. IEEE Int. Conf. Image Process., Atlanta, GA, Oct. 2006, pp. 2481–2484.
[13] L. M. Po and K. Guo, “Transform-domain fast sum of the squared difference computation for H.264/AVC rate-distortion optimization,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 6, pp. 765–773, Jun. 2007.
[14] Joint Video Team (JVT) Reference Software [Online]. Available: http://bs.hhi.de/~suehring/tml/download/
[15] H. S. Malvar, A. Hallapuro, M. Karczewicz, and L. Kerofsky, “Low-complexity transform and quantization in H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 598–603, Jul. 2003.
[16] G. Bjontegaard, “Calculation of average PSNR differences between RD-curves,” in Proc. 13th Video Coding Experts Group Meeting, document VCEG-M033. Austin, TX, Apr. 2001.

Jeong Mee Moon (M’09) received the M.S. and Ph.D. degrees in electronics engineering from the Pusan National University, Busan, South Korea, in 2003 and 2008, respectively.
From 2008 to 2009, she was a Researcher at the Research Institute of Computer Information and Communication, Pusan National University. She is currently a Senior Engineer with the R&D Center, Chips&Media, Seoul, South Korea. Her research interests include image processing, video codec, and related SOC design.

Jae Ho Kim received the M.S. and Ph.D. degrees in electronics engineering from the Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 1982 and 1990, respectively.
From 1988 to 1992, he was with the Laboratory of Visual Communication, Samsung Electronics, Suwon, South Korea. He is currently a Professor in the Department of Electronics and Electrical Engineering, Pusan National University, Busan, South Korea. His research interests include graphics, image processing, video coding and related VLSI design, and image communication.