
Compression

Need for data compression


Time: Calculate Time taken to transmit a file
File Size: Calculate storage (memory) for files
Methods of compressing data
• lossless & lossy
Extensions:
• File types
• Run Length Encoding

1
Need for Compression
Speed – transmission
Size – memory storage

2
Importance of File Compression when
transmitted on the internet
• Speed is important
• Big files will take
longer than small files
• Compress files to
make them smaller
• Small files take less
time to transmit over
a network

3
Calculate time required to
transmit a file
• bits per second – bps (NOT bytes)
• Example: file 525 KB (a novel) at 2 Mbps
– (525 X 1000 bytes) X 8 bits / 2 million
– 4200000 / 2000000 sec
– = 2.1 seconds
• See also Section 4 Networks 4b Data
transmission
• Try this link: (access denied at school!)
• DownLoad Calculator
4
Calculate time required to
transmit a file - BINARY
• bits per second – bps (NOT bytes)
• Example: file 525 KiB (kibibytes) at 2 Mbps
– (525 X 1024 bytes) X 8 bits / 2 million
– 4300800/ 2000000 sec
– = 2.15 seconds
• See also Section 4 Networks 4b Data
transmission
• Try this link: (access denied at school!)
• DownLoad Calculator
5
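The two transmission-time examples above can be checked with a few lines of Python. This is only a sketch: the function name `transmit_seconds` is made up for illustration, and it assumes the link runs at exactly the stated speed with no overheads.

```python
def transmit_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Time to send a file: convert bytes to bits, then divide by the link speed."""
    return size_bytes * 8 / bits_per_second


SPEED_BPS = 2_000_000  # 2 Mbps = 2 million bits per second

decimal_time = transmit_seconds(525 * 1000, SPEED_BPS)  # 525 KB (kilobytes)
binary_time = transmit_seconds(525 * 1024, SPEED_BPS)   # 525 KiB (kibibytes)

print(decimal_time)           # 2.1
print(round(binary_time, 2))  # 2.15
```

The only difference between the two results is whether "kilo" means 1000 or 1024 bytes.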
Importance of File Compression and
file size and memory
• Amount of memory
can be important
• Small files take up
less space
• Compact disk (CD)
– 10 songs CD audio
– 120 songs MP3

6
Calculate storage requirements for files
decimal units (kB, MB) and binary units (KiB, MiB)
Text
• 1 character = 1 byte
– (Num of chars) = (Num of bytes)
• "Love minus zero no limit."
– 25 bytes (spaces and "." count!)
• Characters in a book?
– 350 words / page
– 6 chars / word
– 350 X 6 chars = 2100 / page
– 250 pages in a book
– 2100 X 250 = 525000 chars
– 525000 bytes
– 525 kilobytes ≈ 0.5 MB
– binary: 525000 / 1024 = 512.7 KiB
– binary: 512.7 / 1024 = 0.5 MiB
Images
• Resolution (how many pixels)
• Colour depth (how many bits per pixel)
– Total bits = Resolution X Depth
– Bytes = Total bits / 8
• Example:
– Size (284 X 177) pixels
– Colour depth 8 bits (256 colours)
– Bits = 284 X 177 X 8
– = 402 144 bits
– Bytes = 50268
– approximately 50 kilobytes
– binary: exactly 50268 / 1024 = 49.1 KiB
7
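The storage sums on this slide can be reproduced in Python. This is a rough sketch: `text_bytes` and `image_bytes` are illustrative names, and it assumes 1 byte per character (plain ASCII text) and an uncompressed image with no file header.

```python
def text_bytes(text: str) -> int:
    """Storage for plain text, assuming 1 character = 1 byte."""
    return len(text)


def image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Storage for raw pixel data: total bits, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


phrase_size = text_bytes("Love minus zero no limit.")
book_size = 350 * 6 * 250          # words/page X chars/word X pages
picture_size = image_bytes(284, 177, 8)

print(phrase_size)                     # 25
print(book_size)                       # 525000
print(round(book_size / 1024, 1))      # 512.7  (KiB)
print(picture_size)                    # 50268
print(round(picture_size / 1024, 1))   # 49.1   (KiB)
```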
Compression Algorithms

8
Lose data or don’t lose data

LOSSY & LOSSLESS

9
Compress

• Lossless
• Decompressed data identical to original
• No data has been lost
• Lossy
• Decompressed NOT same as original
• Some data has been lost
10
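The difference between the two can be seen in a small Python sketch. The lossless half uses Python's standard `zlib` module. The lossy half is a deliberately crude illustration (throwing away the low 4 bits of each sample), not how JPEG or MP3 actually work, and the sample values are made up.

```python
import zlib

# Lossless: decompressed data is identical to the original
original = b"ask not what your country can do for you " * 20
packed = zlib.compress(original)
restored = zlib.decompress(packed)
lossless_ok = restored == original      # True: no data has been lost
print(len(original), len(packed), lossless_ok)

# Lossy (crude illustration): keep only the top 4 bits of each 8-bit sample
samples = [200, 201, 203, 90, 91, 17]   # made-up brightness values
stored = [s >> 4 for s in samples]      # 4 bits each instead of 8
approx = [q << 4 for q in stored]       # best we can do when "decompressing"
print(approx)                           # [192, 192, 192, 80, 80, 16]
```

The lossy version needs half the bits, but the original values 200, 201 and 203 can never be recovered from 192.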
2 Ways to Compress Data
Lossless & Lossy

11
Lossless Data Compression
Data compression algorithms ..

• Original data can be perfectly reconstructed from the compressed data
• Kind of files:
– Text Files
– Program code
• Examples:
– ZIP file format

12
Lossless Compression
Lookup table

• Original:
– “ask not what your country can do for you”
• Lookup table (index -> word):
1 ask
2 not
3 what
4 your
5 country
6 can
7 do
8 for
9 you
• Compressed as:
– 123456789
• Uncompressed:
– “ask not what your country can do for you”
13
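A lookup table like this one can be built and used in a few lines of Python. This is a minimal sketch, assuming words are separated by single spaces; the function names `build_table`, `encode` and `decode` are invented for the example.

```python
def build_table(words):
    """Give each new word the next free index, starting at 1."""
    table = {}
    for word in words:
        if word not in table:
            table[word] = len(table) + 1
    return table


def encode(text):
    """Return the lookup table and the list of indexes that replace the words."""
    words = text.split()
    table = build_table(words)
    return table, [table[word] for word in words]


def decode(table, codes):
    """Rebuild the original text by looking each index up in the table."""
    index_to_word = {index: word for word, index in table.items()}
    return " ".join(index_to_word[code] for code in codes)


text = "ask not what your country can do for you"
table, codes = encode(text)
print(codes)                  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(decode(table, codes))   # ask not what your country can do for you

# The saving comes when words repeat: the table stays at 9 entries
longer = text + " ask what you can do for your country"
table2, codes2 = encode(longer)
print(codes2)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 3, 9, 6, 7, 8, 4, 5]
```

Because `decode` gives back exactly the text that went in, this is lossless.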
Lossless Compression
Pattern Substitution – repeating words

14
Lossy Compression
Data lost – can never go back

15
Lossy Compression
Data encoding where some detail is removed

• The only real method for photographs
– pixel sequences are unpredictable
– fewer pixels, fewer colours
• JPEG
• GIF (detail is lost when colours are reduced to its 256-colour palette)
• Remove data that a
human won’t notice
• Original cannot be
restored

16
Lossy Compression
Data encoding where some detail is removed

• Also used for audio & video files
– MP3
– MPEG
• Remove data that a
human won’t notice
• Original cannot be
restored

17
Lossy example

18
EXTENSION – Specific File
Types
• A very interesting topic
• But no need to know specific file types
such as JPEG or MP3
• NO LONGER REQUIRED for Edexcel
GCSE Computer Science

19
JPEG – Joint Photographic Experts Group
• Most common format for
storing & transmitting
photographic images on
the world wide web
– .jpg
• lossy
– colour depth 24 bits
– Red (8), Green (8), Blue
(8)
– 16.7 million colours

20
GIF – Graphics Interchange Format
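• Good for large areas of one colour
• Good for small web logos & simple animations
– .gif
• lossless compression (LZW), but limited to a palette of 256 colours, so photographs lose colour detail
– colour depth 8 bits
– 256 colours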
• being replaced by PNG
– Portable Network Graphics
– .png
– colour depth 24 bits
– no animations
21
MP3
Audio part of MPEG-1
• Audio files
– .mp3
• Lossy
• Download music from
internet

22
MPEG
Moving Picture Experts Group
• Video files
• MPEG-1
– low resolution, small
memory
– .mpg
• MPEG-2
– high resolution, large
memory
– .mpg or .m2v (.mp2 is an MPEG audio extension)
• lossy

23
PDF Portable Document Format
• Represents documents
independently of
– application software
– hardware
– operating system
– .pdf
• a PDF file describes all
the details of a file
needed to display it
– text, fonts & graphics
• Lossless for text
• Lossy for images

24
Bitmap – raw image data
• Device independent
image files
• Photo editing
• large file sizes
– .bmp
• lossless
– colour depth 24 bits
– Red (8), Green (8), Blue
(8)
– 16.7 million colours
– others exist too:
• 1 bit up to 32 bits

25
EXTENSION – Run Length
Encoding
• A very interesting topic
• Fun to code for …
• NO LONGER REQUIRED for Edexcel
GCSE Computer Science

26
Lossless Compression
Run Length Encoding RLE
RUNs of Text BBBBBBBBBBBBB

There are lots of "RLE Calculators" on the internet which can be fun
Math Celebrity RLE
27
"RUNs" of Numbers

28
"RUNs" of Pixels

29
Run Length Encoding - Algorithm

The idea is to replace lengths of repeated letters with a number and the character
repeated.
Just work along the string (Iterate with a loop) ...
AAABBBBCADDEE
IF the next letter is the same THEN add one to a counter ... and carry on to the next
letter.
ELSE the next letter is different to the current letter so add (concatenate) the current
letter and the current count to an output string, put the counter back to 1 and carry on to
the next letter.

AAABBBBCADDEE
A = A so count = 1+1
A = A so count = 2+1
A <> B so write down "A3" set count = 1
B = B so count = 1+1 .........

Final Result:
A3B4C1A1D2E2
This could be read as: “There are 3 As, then 4 Bs, then 1 C, then 1 A ....”
30
Run Length Encoding - Algorithm

FUNCTION RunLenEncode (string inText)
BEGIN FUNCTION
    SET outText TO "" # empty string ready to hold the answer
    SET textLength TO LENGTH (inText) # integer length of string
    SET letCount TO 1 # initialise letter counter to 1

    # start with first letter of the string and loop along letter by letter
    FOR i = 0 TO (textLength - 1) DO # note string index, like lists & arrays, starts at 0
        # Check for the end of the string FIRST so that inText[ i + 1 ] is never
        # read "off the end" (assumes OR stops as soon as its first test is true)
        # then compare this letter with next letter on right (note <> is not equal)
        IF ( i + 1 = textLength ) OR ( inText[ i ] <> inText[ i + 1 ] ) THEN
            # there is no next letter, or the next letter is different
            # so join letter & count (string concatenation) to outText
            SET outText TO outText + inText[ i ] + String(letCount)
            SET letCount TO 1 # reset letter counter to 1
        ELSE
            SET letCount TO letCount + 1 # same letter so increment counter
        END IF
    NEXT i

    RETURN outText # "pass out" the encoded text and leave the function
END FUNCTION
31
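The same algorithm might look like this in Python. It is a sketch rather than a definitive version: the names `run_length_encode` and `run_length_decode` are invented here, and the decoder (not part of the slides) assumes the original text contains no digits, so that "B13" can only mean thirteen Bs.

```python
import re


def run_length_encode(text: str) -> str:
    """Replace each run of a repeated character with the character and its count."""
    out = []
    count = 1
    for i in range(len(text)):
        # End of string, or the next character is different: close the run
        if i + 1 == len(text) or text[i] != text[i + 1]:
            out.append(text[i] + str(count))
            count = 1
        else:
            count += 1  # same character again, so the run gets longer
    return "".join(out)


def run_length_decode(encoded: str) -> str:
    """Expand character+count pairs back into runs (assumes no digits in the text)."""
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(char * int(number) for char, number in pairs)


encoded = run_length_encode("AAABBBBCADDEE")
print(encoded)                                # A3B4C1A1D2E2
print(run_length_decode(encoded))             # AAABBBBCADDEE
print(run_length_encode("BBBBBBBBBBBBB"))     # B13
```

Decoding gives back exactly the original string, which is why RLE counts as lossless. Notice that "CA" became "C1A1": text without long runs can come out longer than it went in.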
