Two-Dimensional Orthogonal DCT Expansion in Triangular and Trapezoid Regions
ABSTRACT
It is known that the 2-D DCT basis is complete and
orthogonal in a rectangular region. In this paper, we
introduce a way to generate a complete and
orthogonal 2-D DCT basis in a trapezoid region or a
triangular region without using the complicated
Gram-Schmidt method. Moreover, since a polygon can be
decomposed into several triangular regions, the proposed
method is also suitable for polygonal regions. Our
algorithm substantially generalizes the JPEG algorithm:
instead of dividing an image into 8 by 8 blocks, we can
divide an image into trapezoid or triangular regions
and then transform and code each of them. In addition
to the DCT basis, our method can also be used for
generating the 2-D complete and orthogonal DFT
basis, KLT basis, Legendre basis, Hadamard (Walsh)
basis, and polynomial basis in trapezoid and
triangular regions.
[Figure residue: a trapezoid region with rows labeled 0th row, 1st row (m = 0, 1, 2; n = 0). Panel (b): Region A and Region B; rotating Region B by 180 degrees and joining it to Region A gives a rectangular region.]
[Equations (17)-(19) are garbled in the source; the recoverable fragments involve M, N, and the value MN/2.]
Fig. 4(a), we can first shear it into the form in Fig. 4(b), then use
the method in Section 2 to find the complete orthogonal
DCT bases, and then shear the bases back.
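The shear step can be sketched as follows. This is only an illustration under assumptions not stated in the text: the region is stored as a boolean mask, the shear is a discrete horizontal shift of row m by round(slope * m) pixels, and the names `shear_rows` and `unshear_rows` are hypothetical. The transform itself (Section 2) would be applied between the two calls.

```python
import numpy as np

def shear_rows(values, mask, slope):
    """Discrete horizontal shear: move row m left by round(slope * m) pixels."""
    shifts = np.rint(slope * np.arange(mask.shape[0])).astype(int)
    out_values = np.zeros_like(values)
    out_mask = np.zeros_like(mask)
    for m, s in enumerate(shifts):
        cols = np.flatnonzero(mask[m])
        if cols.size == 0:
            continue
        # Assumption: the shifted row still fits inside the array.
        assert cols.min() - s >= 0 and cols.max() - s < mask.shape[1]
        out_values[m, cols - s] = values[m, cols]
        out_mask[m, cols - s] = True
    return out_values, out_mask, shifts

def unshear_rows(values, mask, shifts):
    """Inverse of shear_rows: move row m right by shifts[m] pixels."""
    out_values = np.zeros_like(values)
    out_mask = np.zeros_like(mask)
    for m, s in enumerate(shifts):
        cols = np.flatnonzero(mask[m])
        out_values[m, cols + s] = values[m, cols]
        out_mask[m, cols + s] = True
    return out_values, out_mask

# Toy slanted region: row m occupies columns m .. 2m + 3 (left edge has slope 1).
mask = np.zeros((3, 10), dtype=bool)
for m in range(3):
    mask[m, m:2 * m + 4] = True
values = np.arange(30.0).reshape(3, 10) * mask

sheared, sheared_mask, shifts = shear_rows(values, mask, slope=1.0)
restored, restored_mask = unshear_rows(sheared, sheared_mask, shifts)
```

After the shear, every row of the toy region starts at column 0, so the region has a vertical left edge; shearing back restores the original pixel positions exactly because each row is shifted by an integer.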
Furthermore, our method can also be applied to
trapezoid regions that are rotated forms of Fig. 1 or
Fig. 4(a).
Moreover, the triangular region can be viewed
as a special case of the trapezoid region in which the number of
pixels in the first (or the last) row is 1 (i.e., in (5), K(0) = 1
or K(M - 1) = 1), as in Fig. 5. Thus, the method in Theorem
2 can also be used for the triangular region.
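A small sketch of this special case, assuming that K(m) denotes the number of pixels in row m and that it varies linearly between the first and last rows (the exact form of (5) is not reproduced here). The helper names `row_counts` and `region_mask` are hypothetical.

```python
import numpy as np

def row_counts(k_first, k_last, M):
    """Pixels per row, varying linearly from k_first (row 0) to k_last (row M-1)."""
    return np.rint(np.linspace(k_first, k_last, M)).astype(int)

def region_mask(K):
    """Left-aligned boolean mask whose row m contains K[m] pixels."""
    mask = np.zeros((len(K), int(max(K))), dtype=bool)
    for m, k in enumerate(K):
        mask[m, :k] = True
    return mask

K_trapezoid = row_counts(3, 9, 4)   # rows of 3, 5, 7, 9 pixels
K_triangle = row_counts(1, 7, 4)    # same construction with K(0) = 1

# A complete basis on a region needs exactly as many functions as pixels.
n_basis_triangle = int(region_mask(K_triangle).sum())
```

The triangle uses the same code path as the trapezoid; only the first row count changes, which mirrors the argument that no separate theory is needed for triangular regions.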
Furthermore, since an n-sided polygonal region can be
viewed as a combination of n - 2 triangular regions, we can
also use our method to perform the DCT expansion for a
polygonal region.
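The count of n - 2 triangles can be illustrated with a fan triangulation. This is a minimal sketch that assumes a convex polygon with vertices listed in order; the text does not say which decomposition is used, and `fan_triangulate` and `polygon_area` are hypothetical names.

```python
def fan_triangulate(vertices):
    """Split a convex polygon (ordered vertices) into n - 2 triangles sharing vertex 0."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    v0 = vertices[0]
    return [(v0, vertices[i], vertices[i + 1]) for i in range(1, len(vertices) - 1)]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
triangles = fan_triangulate(pentagon)
```

For the pentagon this gives three triangles whose areas add up to the area of the polygon, so the triangles tile it without overlap. Non-convex polygons would need a different triangulation (for example ear clipping), but the triangle count stays n - 2.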
[Figure residue: panels (a) and (b) related by shearing; row labels 0th row, 1st row, ..., (M - 1)th row.]
[Figure residue: plots of the basis functions C0,0, C2,0, C1,1, C3,1, C0,2, C2,2, C1,3, C3,3, C0,4, C2,4, C1,5, C3,5, C0,6, C2,6, C1,7, and C3,7, each drawn on a horizontal axis from 2 to 10.]
[Fig. 9 residue: panels (a) and (b); labels: door region, approximate trapezoid; axes from 50 to 200.]
[Fig. 10 plot residue: vertical axis P[j] from 0.96 to 0.99, horizontal axis from 10 to 25; curves labeled proposed, Gram-Schmidt, and MPEG-4.]
Fig. 10: Normalized partial sums P(j) (see (24)), which
measure the performance of energy concentration,
using (a) the proposed method, (b) the DCT obtained
by the Gram-Schmidt method, and (c) the
two-directional 1-D DCT in MPEG-4.
Although a door is rectangular, in a 2-D
image it usually appears as a trapezoid, as in Fig.
9(b). We use three methods to transform and code
the door region in Fig. 9(b): (a) the proposed method, (b)
the DCT basis orthogonalized by the Gram-Schmidt
method, and (c) the 1-D DCT applied along the x-axis and
the y-axis, as in MPEG-4 [4]. Their running
times are:
(a) proposed: 0.0364 sec
(b) Gram-Schmidt: 1032.87 sec
(c) the 1-D DCT method in MPEG-4: 0.0701 sec. (23)
Then, in Fig. 10, we show the normalized partial sums
of the energies of the largest DCT coefficients of the three
methods, as defined in (24).
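A normalized partial sum of this kind can be computed as in the sketch below. Equation (24) is not reproduced here, so the definition used is an assumption: P[j] is taken to be the energy of the j largest-magnitude coefficients divided by the total energy. The function name `normalized_partial_sums` is hypothetical.

```python
import numpy as np

def normalized_partial_sums(coeffs):
    """P[j-1] = (energy of the j largest coefficients) / (total energy).

    Assumed definition; the paper's own equation (24) may differ in detail.
    """
    energy = np.sort(np.abs(np.ravel(coeffs)) ** 2)[::-1]  # largest first
    return np.cumsum(energy) / energy.sum()

# Toy coefficient set: energies 16, 9, 0, 0 out of a total of 25.
P = normalized_partial_sums([3.0, -4.0, 0.0, 0.0])
```

A curve that rises to 1 quickly means that few coefficients carry most of the energy, which is the sense in which Fig. 10 compares energy concentration.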
From (23), the proposed method is much faster than
the Gram-Schmidt method (by a factor of roughly 28,000 in
this test), and its energy concentration is
as good as that of the Gram-Schmidt method (see
Fig. 10). Moreover, compared with the shape-adaptive
DCT method in MPEG-4, our method performs the
DCT with a fixed number of points for each row and
column, so it has both less computation time
(0.0364 sec versus 0.0701 sec) and
better energy concentration than the 1-D DCT method in
MPEG-4.
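For reference, the Gram-Schmidt baseline in (b) can be sketched as below. This is an illustrative reconstruction, not the code timed in (23): it assumes the rectangular 2-D DCT-II basis of the bounding box is restricted to the region and orthogonalized in its natural order, with a second projection pass added for numerical safety. The names `dct_basis_2d` and `gram_schmidt_on_region` are hypothetical.

```python
import numpy as np

def dct_basis_2d(M, N):
    """Rows are the M*N orthonormal 2-D DCT-II basis images, flattened row-major."""
    def dct_1d(L):
        k = np.arange(L)[:, None]
        n = np.arange(L)[None, :]
        C = np.sqrt(2.0 / L) * np.cos(np.pi * (2 * n + 1) * k / (2 * L))
        C[0] /= np.sqrt(2.0)
        return C
    return np.kron(dct_1d(M), dct_1d(N))

def gram_schmidt_on_region(mask, tol=1e-8):
    """Orthonormal basis on the masked pixels, built from restricted DCT images."""
    M, N = mask.shape
    B = dct_basis_2d(M, N)[:, mask.ravel()]  # keep only pixels inside the region
    n_pixels = B.shape[1]
    basis = []
    for v in B:
        w = v.copy()
        for _ in range(2):                   # second pass guards against round-off
            for q in basis:
                w -= (q @ w) * q
        norm = np.linalg.norm(w)
        if norm > tol:                       # skip vectors that became dependent
            basis.append(w / norm)
        if len(basis) == n_pixels:
            break
    return np.array(basis)

# Toy trapezoid: rows of 3, 4, 5, 6 pixels inside a 4 x 6 bounding box.
mask = np.zeros((4, 6), dtype=bool)
for m, k in enumerate([3, 4, 5, 6]):
    mask[m, :k] = True
Q = gram_schmidt_on_region(mask)
```

Each new vector is projected against all earlier ones, so the cost grows quadratically with the number of pixels in the region times the vector length, which is consistent with the long running time reported for the Gram-Schmidt method on a large region.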
Fig. 11: (a) The existing JPEG cuts an image into several
8×8 rectangular blocks. (b) With the proposed
method, we can divide an image into rectangular,
trapezoid, or triangular blocks.
Compared with the original JPEG algorithm, the
method in Fig. 11(b) is more flexible. Since the
boundaries between two blocks need not be
parallel to the x- and y-axes, we can make them match the
edges of the objects. Then the YCbCr values in a block
will be more uniform, which is good for compression.
To make a block exactly match the shape of an
object, as is done in MPEG-4, we need extra data
to record the edges of the object, which is not good for
compression.
Using the method in Fig. 11(b) avoids this problem.
Since the boundary consists of straight lines, to record the
shape of a block we only have to record its corners.
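As a sketch of why corners suffice, the block below rebuilds the pixel set of a trapezoid from its corners alone. It assumes a trapezoid whose parallel sides are horizontal and whose slanted edges are rasterized by rounding a linear interpolation; the function name `trapezoid_mask` and its parameters are hypothetical.

```python
import numpy as np

def trapezoid_mask(shape, y_top, y_bot, x_top, x_bot):
    """Boolean mask of a trapezoid given its four corners.

    x_top = (left, right) column on row y_top, x_bot = (left, right) on row y_bot.
    """
    mask = np.zeros(shape, dtype=bool)
    rows = np.arange(y_top, y_bot + 1)
    t = (rows - y_top) / max(y_bot - y_top, 1)   # 0 at the top row, 1 at the bottom
    left = np.rint(x_top[0] + t * (x_bot[0] - x_top[0])).astype(int)
    right = np.rint(x_top[1] + t * (x_bot[1] - x_top[1])).astype(int)
    for r, a, b in zip(rows, left, right):
        mask[r, a:b + 1] = True
    return mask

# Six integers describe the whole block: two rows and four column positions.
mask = trapezoid_mask((4, 8), y_top=0, y_bot=3, x_top=(3, 4), x_bot=(0, 7))
row_lengths = mask.sum(axis=1)
```

The decoder only needs the same rounding rule as the encoder to recover the identical pixel set, so the side information per block is a handful of coordinates rather than a full contour.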
Moreover, as shown in Section 4, Theorem 2 can also
be used for deriving the 2-D complete and orthogonal
DFT, NTT, and Hadamard bases in a trapezoid or
triangular region. Therefore, the proposed method is also
useful for signal analysis, filter design, CDMA, and other
signal processing applications.
5.3. Image Compression with the Proposed Method
Section 5.1 shows the compression in a specific trapezoid
region. However, for general images we can hardly find a
trapezoid that exactly matches the shape of the object.
Therefore, finding appropriate trapezoids is very
important in our proposed method.
Images are divided into four kinds of regions: lower frequency
regions, higher frequency regions, border regions, and
the corner and boundary parts. The lower frequency
regions are trapezoids; they are depicted in Fig. 12(b). We
divide this image into eight low frequency parts. The lines
in Fig. 12(b) denote the boundaries of the trapezoid
regions. The trapezoid DCT is used in the lower frequency
regions, and the corner and boundary parts are coded by
geometric coding techniques. The arbitrary-shape DCT using the
Gram-Schmidt method is used in the higher frequency
regions and the border regions because their sizes are small.
Fig. 12: (a) A fruit image. (b) The lower frequency regions
found in the fruit image.
We try to find the largest trapezoid that is contained inside
each lower frequency region, so that a higher
compression ratio can be obtained in the compression
process. When an object is divided into many trapezoid regions,
however, the optimal solution is difficult to find.
There are two problems in the dividing procedure:
overlapped trapezoids and missing points. Missing points
are gaps between the trapezoids we found;
they can be handled by pixel interpolation. Overlapped
trapezoids arise when we divide the image into larger
trapezoids; the overlap can easily be removed by choosing
the average value or by dropping one of the points. Missing
points may cause a larger error, so we prefer to process
more data (overlapped trapezoids) rather than
have missing points between the regions.
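The two repairs can be sketched as follows. The averaging of overlapped pixels follows the text; the interpolation rule for missing points is not specified, so the mean of the covered 4-neighbours is used here as an assumption. The function name `merge_regions` is hypothetical.

```python
import numpy as np

def merge_regions(shape, pieces):
    """Combine decoded regions given as (mask, values) pairs on a common grid."""
    total = np.zeros(shape)
    count = np.zeros(shape, dtype=int)
    for mask, values in pieces:
        total[mask] += values[mask]
        count[mask] += 1
    covered = count > 0
    out = np.zeros(shape)
    out[covered] = total[covered] / count[covered]   # overlaps: average value
    # Missing points: mean of the covered 4-neighbours (assumed interpolation rule).
    for r, c in zip(*np.nonzero(~covered)):
        neighbours = [out[rr, cc]
                      for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                      if 0 <= rr < shape[0] and 0 <= cc < shape[1] and covered[rr, cc]]
        if neighbours:
            out[r, c] = np.mean(neighbours)
    return out, covered

def piece(cols, value, shape=(1, 6)):
    """Helper for the toy example: a region covering the given columns."""
    mask = np.zeros(shape, dtype=bool)
    mask[0, cols] = True
    return mask, np.full(shape, float(value))

# Columns 0-2 and 2-3 overlap at column 2; column 4 is a missing point.
merged, covered = merge_regions((1, 6), [piece([0, 1, 2], 10),
                                         piece([2, 3], 20),
                                         piece([5], 40)])
```

In this toy row the overlapped pixel becomes the average of its two decoded values, and the gap is filled from its two covered neighbours. Overlap costs extra coded data but no guessing, whereas a gap must be estimated, which matches the stated preference for overlap over missing points.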
Fig. 13 is the flowchart of our proposed compression
method. An image is divided into four kinds of regions, as
mentioned before. The trapezoid DCT is applied to
the low frequency regions; in other words, the low
frequency regions must be divided into trapezoids. The
arbitrary-shape DCT using Gram-Schmidt orthogonalization is
applied to the rest of the regions.
[Fig. 13 residue: flowchart boxes: Input image; Lower frequency region; Finding inscribed trapezoid; DCT in trapezoid regions; Coding; Other region; ASDCT using GSO process; Coding.]