
Modulated Lapped Transform based image compression

Course: Data Compression
Students: Paula Popa, Cosmin Caba

2011

Table of Contents

1. Introduction
2. Transform block
3. Quantization block
4. Entropy Encoding Block
   4.1 Entropy Encoding Part 1: Huffman Encoding
   4.2 Entropy Encoding Part 2
5. Measurements
   5.1 Code length measurements and entropy estimation
   5.2 Objective and Subjective assessment
6. Conclusion
7. Bibliography
8. Appendix

1. Introduction
The general purpose of the project is to implement a coder and a decoder and to test the implemented modules on some test data. The implementation will follow the block diagram presented in figure 1:

Figure 1: Block diagram of the encoder (input image → Transform → Quantization (Q) → Entropy encoding → binary bit stream)

The first module is the Transform block; its purpose is to decorrelate the image. The second block includes both the quantization of the transformed matrix and the zigzag traversal of the quantized matrix. The result obtained after the zigzag scan is coded as a run length vector. The last block is the entropy encoding module, whose aim is to code the run length vector losslessly into the final bit stream. More details about how these techniques have been implemented are given in the next chapters. After the encoding and decoding processes have been performed, we evaluate the quality of the reconstructed image in comparison with the original one. The report also comprises several measurements related to the code length and entropy estimation.

We have chosen to work with the Modulated Lapped Transform (MLT) in order to decorrelate the initial image. This decision is based on the fact that, as has been shown in many articles, lapped transforms perform very well in reducing the blocking effect in the reconstructed image. For quantization we have chosen standard matrix quantization, similar to the JPEG standard. For the entropy encoding the Huffman algorithm is used.

We have to mention that the goal of this project is to assess the coding in terms of compression ratio, coding rate and how close the reconstructed image is to the original one (objective and subjective assessment). We do not intend to measure the performance of the implemented algorithms in terms of speed, complexity or memory; therefore the implementation is not focused on optimizing the algorithms, only on providing the right results.

The structure of the report is as follows: first we describe the coding principle according to the diagram in figure 1, then the implementation and the tests that were carried out are presented, afterwards the results are discussed, and finally the conclusions are drawn.

2. Transform block
As previously mentioned, the transform used in this block is the Modulated Lapped Transform. The MLT is basically a block transform and operates on the image similarly to the DCT. A lapped transform is considered an extension of the regular block transform: the basis functions are longer than for the DCT, leading to overlapping between neighboring blocks. Also, the basis functions of the MLT decay to 0 very smoothly, thus reducing the block artifacts. We have decided to use an 8x16 transform matrix, which is a compromise between quality and complexity.

Next we show how the transform matrix has been built. Let N be the number of basis functions in the matrix and L = 2N the length of one function; for our case N = 8 and L = 16. We denote the matrix P and compute its elements as follows:

\[ p_{k,n} = h(n)\,\sqrt{\frac{2}{N}}\,\cos\!\left(\frac{(2n-1+N)(2k-1)\,\pi}{4N}\right), \qquad k = 1,\dots,8, \; n = 1,\dots,16. \]

h(n) is seen as a window that shapes (modulates) the basis functions in such a way that both ends have a continuous transition to 0. It is defined as:

\[ h(n) = \sin\!\left(\frac{(2n-1)\,\pi}{4N}\right), \qquad n = 1,\dots,16. \]
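As an illustration, the following MATLAB sketch builds the 8x16 matrix from the formulas above and checks that its rows are orthonormal (the variable names are ours; the project's own script is MLTmatrix.m):

```matlab
% Build the N x L = 8 x 16 MLT analysis matrix P from the formulas above.
N = 8; L = 2*N;
[nn, kk] = meshgrid(1:L, 1:N);                 % nn, kk are N x L index grids
h = sin((2*nn - 1) * pi / (4*N));              % modulating window
P = sqrt(2/N) * h .* cos((2*nn - 1 + N) .* (2*kk - 1) * pi / (4*N));

% The rows of P should be orthonormal: P*P' = I (8x8)
fprintf('||P*P'' - I|| = %g\n', norm(P*P' - eye(N)));
```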

Next picture depicts the basis functions computed with the previous formulas for the case of N=8 and L=16:

Figure 2: MLT basis functions for N=8, L=16

The obtained functions (figure 2) are similar to those provided in references [1] and [2].

In order to be able to use the MLT in image compression (and not only there), we have to prove that perfect reconstruction can be achieved. The purpose of the block transform is to decorrelate the image without any loss of information; basically, the transform has to be orthogonal. In [2] it has been proven that with these basis functions perfect reconstruction can be achieved. We will not reproduce the mathematical demonstration, but only give a few graphs to illustrate how we have achieved perfect reconstruction in our implementation.
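In place of the proof, the conditions from [2] can also be checked numerically. A minimal sketch (our naming; P and N are taken from the code above, and P is split into its left and right 8x8 halves):

```matlab
% Perfect reconstruction for a lapped transform P = [A B] additionally
% requires the overlap conditions A*B' = 0 and A'*A + B'*B = I.
A = P(:, 1:N);  B = P(:, N+1:2*N);
fprintf('||A*B''||            = %g\n', norm(A*B'));
fprintf('||A''*A + B''*B - I|| = %g\n', norm(A'*A + B'*B - eye(N)));
```

Both norms come out at machine precision for the matrix built above.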

Figure 3: Original Lenna (left) and reconstructed image (right)

Although to the human eye these two images seem identical, there are some imperfections: at the borders we did not get perfect reconstruction. In the next figure we have plotted the difference between the original and the reconstructed image:

Figure 4: Difference between Lenna and the reconstruction

The blue color in the middle of the graph shows that the difference is 0, while the colored lines on the border illustrate the level of the difference between the pixel values in the two images. The problem with our implementation is that at the borders perfect reconstruction can't be achieved, because of the distortion that the transform introduces in the image. Only 4 pixels at each of the four edges are affected by this distortion. We have tried to eliminate, or at least reduce as much as possible, the effect of the distortion, but this is the best we got. The technique used is to mirror 4 of the pixels around each edge. This problem appears only for lapped transforms and not for block transforms like the DCT. Although this may not affect human perception, and thus the subjective assessment of the reconstructed image, it still affects the objective assessment. When the distortion is evaluated we will take this imperfection at the borders into consideration as well. The next figure illustrates the reasons behind this behavior.

Figure 5: Mapping of pixels using the MLT

Figure 5 depicts the case of the one-dimensional transform. Each block of 8 pixels from the original data is transformed into an 8-pixel block, but the input of the transform is 16 pixels: we take 4 extra pixels from the left side and 4 from the right side. For the first and last blocks of 8 pixels there is a need to add 4 extra pixels on each side (the two red blocks); in our implementation these 4 pixels are mirrored from the first 4 and the last 4 pixels. The blue blocks are the overlapping ones; they partially overlap with the neighboring 8-pixel blocks. The problem is that the added pixels (red blocks) introduce some additional information into the transformed image which cannot be extracted back when doing the inverse transformation.

As for the inverse transformation, we compute the inverse transform matrix as the transpose of the forward transform matrix (taking advantage of the fact that we deal with an orthonormal transform). This time the input of the transform module is a block of 8 coefficients and the output is 16 pixels. Since we still need to map the input to an output of 8 pixels, at the output we overlap and sum the overlapped values in order to obtain the output sequence. Another technique we have used to minimize the distortion at the borders is to duplicate the first and last 8-pixel blocks before doing the inverse transform, and to take the influence of these replicated blocks into account when computing the reconstructed data stream.

The property of separability has been used to implement the two-dimensional transform (2-D transform): we first apply the one-dimensional transform on each row of the input image and then again on every column. The functions used for the forward transform are MLTenc1D (one-dimensional transform) and MLT_enc (2-D transform), and for the inverse transform MLTdec1D and MLT_dec. In the transformed matrix obtained for the input image Lenna, almost all the information is concentrated in the DC coefficient (the first value in an 8x8 block); the rest of the coefficients decay to 0. The next figure shows the first transformed block of the image Lenna:
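To make the overlap-add scheme concrete, here is a minimal 1-D sketch of the same idea (our code, not the project's MLTenc1D/MLTdec1D; P is the 8x16 matrix from above and the signal length is assumed to be a multiple of 8):

```matlab
x  = randn(1, 64);                     % test signal
y  = mlt_enc_1d(x, P);
xr = mlt_dec_1d(y, P);
err = x - xr;                          % nonzero only in the first/last 4 samples
fprintf('max interior error: %g\n', max(abs(err(5:end-4))));

function y = mlt_enc_1d(x, P)
    N  = size(P, 1);
    xe = [x(N/2:-1:1), x, x(end:-1:end-N/2+1)];   % mirror N/2 samples per border
    y  = zeros(1, numel(x));
    for b = 1:numel(x)/N
        seg = xe((b-1)*N + (1:2*N));              % 16-sample window, 8-sample hop
        y((b-1)*N + (1:N)) = (P * seg.').';
    end
end

function xr = mlt_dec_1d(y, P)
    N  = size(P, 1);
    xe = zeros(1, numel(y) + N);
    for b = 1:numel(y)/N
        idx = (b-1)*N + (1:2*N);
        xe(idx) = xe(idx) + (P.' * y((b-1)*N + (1:N)).').';  % overlap-add
    end
    xr = xe(N/2 + (1:numel(y)));                  % drop the mirrored borders
end
```

Except for the 4 samples at each end, the reconstruction error is at machine precision, matching the behavior shown in figure 4.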

Figure 6: 8x8 transformed block

The purpose of the transformation has been achieved: the pixels are not correlated as much as they are in the original image. There is still some correlation, which we are going to eliminate in the next block of the encoder, namely the quantization block.

3. Quantization block
Compression starts with quantization. Each 8x8 block of the picture, after the MLT has been applied, is quantized by dividing each element of the 8x8 matrix by the corresponding element of the quantization matrix and rounding to the nearest integer. The image quality is controlled by selecting specific quantization tables. The quality levels range between 1 and 100, where 1 gives the poorest image quality and the highest compression, and 100 the best quality and the lowest compression. In the project we use the quality level 50 matrix because it is a trade-off between image quality and compression, providing a very good image quality together with high compression. The quantization matrix used is the standard JPEG luminance table:

    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99

To obtain other quality levels the matrix Q50 is rescaled, following the usual JPEG convention: for a quality level q above 50 the matrix is multiplied by (100 - q)/50, and for a quality level below 50 it is multiplied by 50/q. Thus Q90 is Q50 multiplied by 10/50 (finer quantization steps, better quality), while Q10 is Q50 multiplied by 50/10 (coarser steps, stronger compression). The elements of the new matrix are then rounded and restricted to the interval between 1 and 255. For the measurements in the project we used the Q10, Q30, Q70 and Q90 quantization matrices in order to compare the differences in image quality against the image compressed with the standard Q50 matrix.
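A small MATLAB sketch of this scaling and of the block quantization step (the helper name scale_qtable is ours, not the project's; Q50 is the table above):

```matlab
% Scale the reference table Q50 to an arbitrary quality level in [1, 100].
function Qs = scale_qtable(Q50, quality)
    if quality < 50
        s = 50 / quality;          % coarser steps below quality 50
    else
        s = (100 - quality) / 50;  % finer steps above quality 50
    end
    Qs = min(max(round(s * Q50), 1), 255);   % round and clip to [1, 255]
end

% Quantization of one 8x8 coefficient block Y, and the decoder's inverse:
%   Yq   = round(Y ./ Qs);
%   Yhat = Yq .* Qs;
```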

After quantization, the coefficients situated in the upper-left corner of the image block have larger values and correspond to the lower frequencies, to which the human eye is more sensitive. The rest of the coefficients will be zero and will not be used for the reconstruction of the image (this is the lossy part of the compression). Measurements have been done for all of the above quantization matrices; the pictures in chapter 5 show the difference in image quality.

The quantized coefficients are now prepared for coding. The coefficients are scanned in zigzag order, to compress the large number of zeros much better. The DC coefficients are differentially coded and the AC coefficients are run-length coded. This treatment of the DC coefficients is worthwhile because they contain a large fraction of the image energy. In the differential coding, the DC coefficient of the previous block is subtracted from the DC coefficient of the current block, and the difference is encoded; this exploits the spatial correlation between the DC coefficients of neighboring blocks. Because the differences between DC coefficients are small, fewer bits are needed for encoding them. After the run length coding we have pairs of the form (symbol, run of symbol), where the run is the number of consecutive occurrences of a value. The encoded values are stored in a single vector which is then Huffman encoded. This chain of zigzag ordering, differential and run-length coding makes the entropy coding more efficient and easier; a code sketch follows after figure 7.

Figure 7: Left: zigzag ordering for an 8x8 block; Right: differential coding of the DC coefficients [1]
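A compact sketch of the zigzag scan and the run-length pairs (our code; the project bundles these steps inside quant_enc.m):

```matlab
% Zigzag-scan a quantized 8x8 block and run-length code the result as
% interleaved (value, run) pairs. DC differential coding across blocks
% would then be dc_diff = [dc(1), diff(dc)].
function rl = zigzag_rle(Yq)
    n = size(Yq, 1);
    [c, r] = meshgrid(1:n, 1:n);                    % column / row indices
    s = r + c;                                      % anti-diagonal number
    key = r; key(mod(s,2) == 0) = c(mod(s,2) == 0); % scan direction alternates
    [~, idx] = sortrows([s(:), key(:)]);            % zigzag permutation
    v = Yq(idx).';                                  % zigzag-ordered coefficients
    d = [true, diff(v) ~= 0];                       % run starts
    rl = reshape([v(d); diff([find(d), numel(v)+1])], 1, []);
end
```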

4. Entropy Encoding Block


After the quantization (the lossy compression), this block aims to compress the data in a lossless manner. The biggest part of this block is the Huffman encoding. We have chosen Huffman instead of arithmetic coding because we found it easier to implement; arithmetic coding can generally compress at least as well, but it is more complex. A disadvantage of Huffman coding is that the codebook may become too large, but this is not our case. The Huffman encoder operates on the run length vector obtained in the quantization phase.

4.1 Entropy Encoding Part 1: Huffman Encoding


For the entropy coding we use the Huffman algorithm, which encodes the most probable (most frequent) symbols with fewer bits and the less probable symbols with more bits, using variable length codes. The reason for using the Huffman algorithm is that it is much easier to implement and not as complex as the arithmetic algorithm: arithmetic coding is slower and needs more time for decoding, while Huffman decoding requires only looking up the corresponding codeword for each symbol in a table. The Huffman algorithm generates a compact, optimal code which is stored in a codebook. The encoded data and the codebook are needed at the decoder to recover the initial symbols. The symbols are represented with a prefix-free code: the bit sequence that represents one symbol is never a prefix of the bit sequence that represents another symbol.

In the project we use the static Huffman algorithm and we generate a binary tree by repeatedly taking the two least probable symbols and merging them into a new symbol whose probability is the sum of the two. The process continues until one symbol remains (a single node, the root node). As part of the Huffman algorithm we use a function that generates the binary tree and the code words used to encode the symbols: the dictionary is created by prepending zeros and ones to the code words of the two least probable nodes and repeating this until no nodes remain (see the sketch below). The code words are stored in a vector having the same size as the number of symbols, sorted according to the corresponding symbol frequencies, with the lowest weights on the first positions; in this way it is not necessary to keep searching the symbols starting from the first positions.

At the receiver side the decoding process takes place. The bit stream is decoded using the codebook to recover the initial symbols: from the received bit stream we compare a number of bits equal to the length of a codeword from the dictionary, and if the values are the same we take the corresponding symbol and store it in a vector. To be more efficient and save time, the code words are evaluated in decreasing order of probability; once we find a symbol, we start evaluating again from that end of the vector, because it is more probable to find the most frequent symbols there. After we obtain the Huffman decoded values, the symbols are differentially and run-length decoded and then rearranged from the zigzag order.

After encoding the run length vector we must provide the receiver side with some extra information, so that it can reconstruct a Huffman codebook identical to the one used at the encoding side. The algorithm and techniques used for this are described in the next subchapter.
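The codebook construction can be sketched as follows (our minimal version, not the project's huffman_dict.m; the symbol grouping uses cell arrays):

```matlab
% Static Huffman: repeatedly merge the two least probable nodes, prepending
% a 0 to the codes of one group and a 1 to the codes of the other.
function codes = huffman_codes(p)
    n = numel(p);
    codes  = repmat({''}, n, 1);     % code string per original symbol
    groups = num2cell(1:n);          % symbols contained in each active node
    p = p(:).';
    while numel(p) > 1
        [~, ord] = sort(p);          % indices of the two least probable nodes
        i = ord(1); j = ord(2);
        for s = groups{i}, codes{s} = ['0', codes{s}]; end
        for s = groups{j}, codes{s} = ['1', codes{s}]; end
        groups{i} = [groups{i}, groups{j}];   % merge node j into node i
        p(i) = p(i) + p(j);
        groups(j) = []; p(j) = [];
    end
end
```

For the run length vector rl the symbol statistics would be obtained with [syms, ~, ic] = unique(rl); counts = accumarray(ic, 1); and the codes with huffman_codes(counts).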

4.2 Entropy Encoding Part 2


At this point we have our picture encoded using the Huffman algorithm. In order to decode this data at the receiver, we need to have the same codebook there, and there are several possibilities for this. One is to send the codebook itself from encoder to decoder, in such a way that the decoder knows how to read it from the file. Another option is to send some data that helps the decoder reconstruct the same codebook that was used to encode the run length vector; a format for the data and a protocol have to be established in this case as well. The solution we have chosen is to send the symbols and the frequencies associated with them to the decoder, so that the codebook can be rebuilt there. Even if this solution might be slower than sending the codebook directly, we are not concerned with the speed of the algorithms.

First we describe how the file looks when all the necessary information is written into it. We may think of it as a packet which has a header and a payload: the header contains the information necessary to rebuild the Huffman codebook, and the payload contains the data that is to be decoded using this codebook.

Figure 8: File structure (Header | Payload)

The header contains the symbols and frequencies taken from the run length vector. The next step is to encode this information in such a way that the decoder knows how to interpret it; for this we have defined a protocol. Because each symbol might need a different number of bits to be represented, a fixed symbol length would not be appropriate. We therefore represent each symbol with a variable number of bits, and for the delimitation between consecutive values we use a fixed sequence of bits: 0 1 1 1 1 0. Before being encoded with this protocol the information is structured as in the next figure:

Figure 9: Header structure before encoding

The header comprises two rows: one with the symbols and the other with the frequencies. The columns are sorted so that the symbols appear in increasing order, with the value 0 somewhere in the middle. We start by taking the first symbol and converting it to its binary representation; the result is written into a vector and the delimiter 0 1 1 1 1 0 is appended. Next, the frequency associated with this symbol is converted to binary and appended, again followed by the delimiter. We then move to the next column and keep doing this until the end of the header.

The protocol guarantees that the delimiter cannot appear in the middle of the data: during the binary conversion a 0 is stuffed after every sequence of three consecutive 1s, so runs of more than three 1s (and hence the delimiter, which contains four) never occur inside the data. At the receiver side, whenever the decoder finds three consecutive 1s it extracts the next bit, which must be the 0 we inserted previously; whenever it finds the delimiter, it takes the bit sequence read so far and converts it from binary back to decimal, reconstructing the header as illustrated in figure 9. We do not encode the sign of the symbols; instead we take advantage of the header being sorted. The decoder adds a negative sign in front of the symbols, starting from the first one, until it finds 0 or discovers that the order of the (unsigned) symbols switches from decreasing to increasing.

At this point we have two binary vectors: the first one is encoded using the Huffman algorithm and represents the picture itself, and the second one represents the information needed to reconstruct the Huffman codebook, encoded using the protocol described above. The next logical step is to write these two vectors into a file.

When writing to a file in Matlab we have to specify the data type we are writing (int8, int16, int32, etc.), and each value is then represented in the file using the precision of that data type. The smallest precision is uint8, for which each value is represented using 8 bits. The problem is that the values in our two vectors are bits, so there would be a huge amount of wasted capacity: 1 bit would be stored using 8 bits. To address this issue we take the bits in groups of 8 and transform each group into the corresponding integer value; the resulting integer is then written into the file as a uint8, so 8 data bits end up occupying exactly 8 file bits. Therefore, before writing the binary header into the file we transform it into a vector of integers, and the same goes for the binary vector produced by the Huffman algorithm; the two resulting integer vectors are finally written into the file.

The decoder takes the vectors from the file and expands the integer values back to binary, obtaining the binary header and the Huffman binary vector. Then, from the binary header, using the protocol presented above, it recovers the header with the symbols and their associated frequencies. Using this header information the codebook is easily reconstructed and the Huffman algorithm can be applied in order to obtain the run length vector.
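Both halves of this mechanism are small; here is a sketch with our own helper names (the project's versions are linecode_enc.m and putinteger2.m):

```matlab
% Bit stuffing: insert a 0 after every run of three 1s, so the pattern
% 0 1 1 1 1 0 can only ever appear as a field delimiter.
function out = stuff_bits(bits)
    out = []; run1 = 0;
    for b = bits
        out(end+1) = b;                       %#ok<AGROW>
        run1 = (b == 1) * (run1 + 1);         % length of current run of 1s
        if run1 == 3
            out(end+1) = 0; run1 = 0;         % stuffed bit, removed by decoder
        end
    end
end

% Pack a 0/1 vector into uint8 values (8 data bits per file byte), as done
% before writing with filewrite.m; the tail is padded with zeros.
function bytes = pack_bits(bits)
    bits  = [bits, zeros(1, mod(-numel(bits), 8))];  % pad to a multiple of 8
    B     = reshape(bits, 8, []).';                  % one row per byte, MSB first
    bytes = uint8(B * (2.^(7:-1:0)).');
end
```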


5. Measurements
The first part of the measurements chapter comprises the results about the code length and the estimates of the entropy at different points in the encoding process. The measurements are performed on the Lenna image.

5.1. Code length measurements and entropy estimation


In the quantization stage we have used 5 different quantization tables, so we expect the encoder/decoder modules to perform differently in each case. The most common quantization table, used also in the JPEG standard, is Q50. First we perform some measurements on the test data for this case, and then we compare with the other cases (Q10, Q30, Q70 and Q90).

The code length is computed as the final file size expressed in bits divided by the total number of pixels in the compressed image:

code length = 215872 bits / 262144 pixels = 0.8235 bits/pixel

Using the code length we may compute the compression ratio as well:

compression ratio = 8 / 0.8235 = 9.7146

where we assumed that the initial image needs 8 bits per pixel. In order to evaluate how good our compression algorithm is, we have to relate the code length to an entropy estimate. There are many ways of estimating the entropy; here it is estimated from the empirical probability model of the source by applying the following formula:

\[ H = -\sum_{i} p_i \log_2 p_i \]

First we estimate the entropy of the original image without taking any correlation between pixels into consideration, i.e., under the assumption of i.i.d. pixels. The entropy in this case is H = 7.2185. It is not relevant to compare this entropy to the final code length, but if we assume that initially one pixel is represented using 8 bits (no correlation, each pixel coded individually), so that the rate is 8 bits/pixel, then the entropy estimate is very close to this value.

The purpose of the transform block is to decorrelate the initial image, thus improving the coding. If we estimate the entropy of the matrix containing the transform coefficients, the result is H = 4.3889. The entropy has decreased significantly because the pixels are not as correlated with each other as in the original image. Still, the difference between the code length and this entropy estimate is very large. The explanation is the quantization block: a part of the information contained in the image is discarded there. Quantization is not about improving the coding gain, but only about discarding information which is not relevant for a good reconstruction of the original image.

The entropy estimate of the quantized coefficients is H = 0.8591, this time very close to the code length. The entropy should be a lower limit, so the code length should be close to but higher than the entropy; here the code length (0.8235 bits/pixel) is in fact slightly below the estimate. The reason is that there is still some correlation that the i.i.d. estimate has not captured, namely the correlation between consecutive quantized coefficients: the coding algorithm exploits it through the run length coding, while our entropy estimate does not capture the effect of run length encoding.

The code length computed above includes the run length coding and the extra information added to the file to help reconstruct the codebook at the receiver, so it is not a precise measure if we intend to evaluate the performance of the Huffman algorithm alone. In order to focus only on the Huffman coding part, we may consider the run length vector as the input and calculate the code length relative to the symbols in the run length vector. The next figure depicts the idea:
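All the H values in this chapter are empirical estimates of this form; a minimal sketch (the helper name entropy_est is ours):

```matlab
% Empirical entropy estimate H = -sum(p .* log2(p)) in bits/symbol,
% with p taken from the histogram of the values in x.
function H = entropy_est(x)
    [~, ~, ic] = unique(x(:));
    p = accumarray(ic, 1) / numel(x);
    H = -sum(p .* log2(p));
end
```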

Figure 10: Part of the encoder for the Q50 quantization case (the entropy estimate of the run length vector, H2 = 3.2596 bits/symbol, is written in the figure)

The code length may be computed in two ways: the first takes into account the header information, i.e., the whole file size, while the second counts only the run length vector transformed into binary by the Huffman algorithm. Using the notation of Table 1 below, for the first case the code length is

C2 = 215872 bits / 64338 symbols = 3.3553 bits/symbol

and for the second case

C3 = 3.3097 bits/symbol.

This time the measurement and the entropy estimate comply with the general rule for Huffman encoding,

\[ H \le \bar{L} < H + 1, \]

where \bar{L} is the average length of the code. The code length C2 is bigger because of the header information written into the file. C2 is more relevant as a comparison term than C3 if we want to evaluate the encoder as a whole; if we believe there are better ways of compressing the header, and we only want to evaluate how the Huffman algorithm performs without taking into account how well we have compressed the extra information needed for decoding, then it makes more sense to compare the entropy with C3. The next table presents the measurements for all the quantization tables used:

                      Q10      Q30      Q50      Q70      Q90
File size (bits)      77384    140904   215872   387992   568632
H1 (bits/symbol)      0.3486   0.5849   0.8591   1.4653   2.0290
H2 (bits/symbol)      2.9145   3.1902   3.2596   3.2641   3.2935
C1 (bits/pixel)       0.2952   0.5375   0.8235   1.4801   2.1691
C2 (bits/symbol)      3.0078   3.2994   3.3553   3.3334   3.3644
C3 (bits/symbol)      2.9671   3.2556   3.3097   3.2894   3.3126
Compression ratio     27.1     14.9     9.7      5.4      3.6

Table 1

H1: entropy estimate of the quantized coefficients;
H2: entropy estimate of the run length vector;
C1: code length relative to the image (file size divided by 512x512 pixels);
C2: code length relative to the run length vector (file size divided by the number of symbols);
C3: code length relative to the run length vector, but without the header information (binary output of the Huffman encoder divided by the number of symbols in the run length vector).

In order to evaluate the influence of the header information, we plotted the difference between the rows C2 and C3. The x axis corresponds to the different quality levels, ordered as in Table 1 from Q10 to Q90.

Figure 11: C2 - C3 for each quantization table used

Notice that the plot has an increasing trajectory, with an exception for Q70 which will be motivated later. This result says that the header information becomes more and more significant once we increase the quality of the reconstructed image by lowering the quantization. We get the best quality for the Q90 table, because this quantization does not discard too much information from the image. When we use the Q10 (or Q30) quantization table, the result has many values equal to 0, so the run length vector has few symbols. If we use Q90 instead, we do not get so many zero values; the gain of the run length encoding is greatly reduced, and the run length vector contains many different values and has a large size. Because the gain in compression of the run length algorithm decreases when we quantize less (a higher rank of the quantization table, i.e. Q90), the run length vector grows, and with it the header; in this way the overhead is larger if we want to achieve better quality. The strange behavior for the Q70 table is because Q70 has been built by us, while Q10, Q50 and Q90 were taken from the standard; it may be that Q70 is not calibrated as it should be.


5.2. Objective and Subjective assessment


This subchapter deals with measuring the distortion between the original and the reconstructed image, as a way of evaluating how closely the reconstructed image resembles the initial picture. Another way is to perform a subjective assessment by comparing the two images. Subjective assessment usually gives better results, in the sense that it reflects more closely the quality, and especially the difference in quality, between the two images. Because such subjective quality assessments are difficult to make, objective measurements are often sufficient. Here we have performed an objective assessment for the several compression cases we have, but also a subjective one. To be relevant and precise, a subjective assessment should involve a large number of persons, which is not the case for this project.

In order to objectively assess the difference between a reconstructed image and the original Lenna image, we chose to compute the PSNR. There are other methods that could be used, but they are outside the scope of this report and project. The PSNR reflects the difference in quality between two pictures: a larger PSNR value suggests that the reconstruction is very close to the original image. The formula used to compute the PSNR is:

\[ \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \]

where MSE is the mean square error between individual pixels in the two M x N images:

\[ \mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i,j) - \hat{I}(i,j) \right)^2 \]
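In MATLAB this is a two-liner (a sketch; I and Ir are assumed to be the original and reconstructed images as double arrays with values in 0..255):

```matlab
mse     = mean((I(:) - Ir(:)).^2);        % mean square error over all pixels
psnr_db = 10 * log10(255^2 / mse);        % peak signal-to-noise ratio in dB
```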

A good way of observing the tradeoff between the distortion and the rate of the code is to create a rate-distortion graph. The next figure (figure 12) illustrates the rate-distortion curve over the five quantization tables used inside the compression algorithm (Q10..Q90). The graph shows that at low rates, thus poor quality, the PSNR is smaller than at higher rates, which is what we expected. An important aspect is the logarithmic shape of the curve: the PSNR increases fast in the low rate (low quality) zone of the curve, until it reaches a saturation region where, even if the rate (and thus the quality) is increased, the PSNR grows at a slower pace. This somehow resembles human quality perception; we shall discuss this after the subjective assessment. The following PSNR values were calculated for the different quantization matrices for the image Lenna:

Quality level        10      30      50      70      90
PSNR [dB]            31.48   34.89   36.88   39.10   40.81
rate [bits/pixel]    0.259   0.537   0.822   1.48    2.169

Table 2


Figure 12: Rate-distortion graph

As for the subjective quality assessment, the next five figures depict the reconstructed images at different quality levels:

Figure 13: Lenna reconstruction: left: Q10, right: Q30


Figure 14: Lenna reconstruction: left: Q50, right: Q70

Figure 15: Left: Lenna reconstruction with Q90; Right: original Lenna

Looking at the reconstructed images we notice the same pattern of low quality when we use a lower rank quantization table (Q10 or Q30). The quality of the decoded picture is reflected by the rate of the code: a low rate leads to poor quality. It can be seen that the reconstruction with Q90 is almost perfect, while the compression is lower and the file size is bigger. When we use the Q10 and Q30 quantization matrices, the quality is poor, but the compression is higher and the file size is smaller. The higher quality level matrices give better quality images since quantization with these matrices does not produce so many zero values, which do not participate in the reconstruction of the image.

If we were to trace a curve from the subjective assessment, it would have a logarithmic shape, because the perceived quality (in our perception, though it may not be the case for other people) increases faster when the rate is low. This means that the human visual system is more sensitive to quality differences when the quality is poor; if the quality is good (i.e., the Q70 or Q90 reconstructions) we cannot distinguish or notice the difference in quality. As a conclusion for this subchapter, we would say that our PSNR measurement resembles, at least in trend, the subjective assessment. In order to obtain a precise graph for the subjective assessment we would have to develop a grading system and run many tests to grade the images; in any case, the resulting curve would probably look like the PSNR graph.

There is one issue we have not addressed so far, regarding our transform block. We must not forget that this block, in spite of the fact that it should offer perfect reconstruction, introduces some distortion at the borders of the reconstructed image. When we computed the PSNR above we did not take this effect into account. If we compute a second PSNR that includes the border imperfections (PSNR2) and plot its difference with respect to the previous PSNR (PSNR1) against the rate, we get the following graph:

Figure 16: Difference between the PSNRs versus rate

While for low quality images the difference is small, it becomes more relevant for high quality (when we quantize the transform coefficients using smaller values). Thus the distortion induced by the transform block becomes relevant when moving towards a higher quality of the reconstructed image. This is valid only for the objective assessment: in the subjective assessment, the four pixels affected by distortion at each border of the image are not noticeable to the eye.


6. Conclusion
For the decorrelation of the image, the Modulated Lapped Transform was used in this project. This technique is better than the DCT because it eliminates the blocking effect. This is achieved by using basis functions that decay smoothly to 0 and are longer than in the DCT case, yielding an overlap of the samples between neighboring blocks. In our case the basis functions have a length of 16 samples, and thus the overlap between neighboring blocks is 4 samples on each side. We showed that even if at the borders there are some imperfections in the reconstructed image (because we add some samples), it is almost identical to the original one; only 4 pixels at each edge are affected by the distortion. By applying the MLT we decorrelate the pixels.

For the quantization we used the standard quality level 50, for which we obtained good results; measurements were also done for the 10, 30, 70 and 90 quality levels. For the higher quality levels, 70 and 90, we obtained a good quality image and a small compression, thus a bigger file size. That is because quantization at the higher quality levels does not discard as much of the information relevant to the reconstruction of the image as the lower levels do.

For the entropy encoding we used the Huffman algorithm because it is simpler and faster than arithmetic coding. After the MLT transform, quantization, run length coding and Huffman coding, the data was put in a file and sent to the decoder. In the header of the file we kept the information necessary for the reconstruction of the codebook, in order to be able to decode the symbols. The average code length obtained with the Huffman algorithm was comparable with the entropy. Entropy estimates were computed for the original image, after the MLT transform, after quantization, and for the run length vector; the code length was measured relative to the image, relative to the run length vector, and relative to the run length vector without the header. We noticed that after each step the entropy estimate decreased and the Huffman code length came closer to the entropy. That is because after each step the pixels are more decorrelated, leading to lower entropy and better compression.

The second part of the measurements comprises the rate-distortion graphs, produced for all the quality levels used. The code rate is higher for the higher quality levels, and the PSNR grows as the number of bits/pixel increases. The PSNR was measured in both situations: by omitting the imperfections at the borders (the first and last four pixels) and by taking the imperfections into account. When we plotted the difference between the two PSNRs (the error) against the rate, it could be seen that this distortion grows faster at the higher rates, which means that the border distortion is more significant for the higher quality levels than for the small ones, although it remains unnoticeable in the subjective assessment.

The compression algorithm used is similar to JPEG, but the MLT is used instead of the DCT to eliminate the blocking effects. The results obtained are good and the quality of the reconstructed image is close to the original picture. As for future work, improvements have to be made in order to achieve perfect reconstruction at the borders of the image. It would also be very interesting to compare the MLT technique against the DCT, to observe how well the overlapping reduces the blocking effect.


7. Bibliography

1) Til Aach: Fourier, Block and Lapped Transforms, Institute for Signal Processing, University of Lübeck.
2) Henrique S. Malvar: Extended Lapped Transforms: Properties, Applications, and Fast Algorithms, IEEE Transactions on Signal Processing, vol. 40, no. 11, 1992.
3) Khalid Sayood: Introduction to Data Compression, Third Edition, 2005.
4) Gregory K. Wallace (Digital Equipment Corporation, Maynard, Massachusetts): The JPEG Still Picture Compression Standard, IEEE Transactions on Consumer Electronics, 1991.
5) Tinku Acharya, Ajoy K. Ray: Image Processing: Principles and Applications, 2005.
6) Cornelius T. Leondes: Database and Data Communication Network Systems, Volume 1, Academic Press, 2002.


8. Appendix
The source code is provided in electronic format. Here we give a short overview of each function used to encode and decode the image. The implementation is structured in two main parts: the encoding part and the decoding part. There is a program that computes the transform matrix and plots the basis functions: MLTmatrix.m. We have defined the five quantization tables under the names Q10, Q30, Q50, Q70, Q90; these matrices must be loaded into the workspace before running the encoder or the decoder.

8.1 Encoding part

The main script in the encoding part is encoding1.m. In this script we call the individual functions related to the different blocks of the encoder:

MLT_enc.m: inputs (<initial image>, <transform matrix>); output: coefficients matrix;
quant_enc.m: inputs (<coefficients matrix>, <quantization table>); output: run length vector of the quantized coefficients matrix;
buildmat.m: input (<run length vector>); output: header information (symbols and frequencies from the run length vector);
huffman_dict.m: input (<run length vector>); outputs: codes and the symbols associated to each code;
runl2bin.m: inputs (<run length vector>, <codes>, <symbols>); output: binary representation of the run length vector coded with the Huffman algorithm;
linecode_enc.m: input (<header info>); output: binary coded header information using the protocol described in section 4.2;
putinteger2.m: input (<binary vector>); output: uint8 vector;
filewrite.m: inputs (<header info>, <binary run length vector>, <name of the file>).

8.2 Decoding part

The main script in this part is decoding1.m, which calls the following functions:

fileread.m: inputs (<file name>, <length of the header>); output: header and run length vector in uint8 data format;
putbits.m: input (<uint8 data format vector>); output: binary representation of the input vector;
linecode_dec.m: input (<binary header>); output: header information (symbols and frequencies);
huffman_dict_rebuild.m: input (<header>); output: codes and symbols for the Huffman algorithm;
bin2runl.m: inputs (<binary run length vector>, <codes>, <symbols>); output: run length vector;
quant_dec.m: inputs (<run length vector>, <quantization table>); output: transform coefficients;
MLT_dec.m: input (<transform coefficients>); output: reconstructed image.
