Compression File TypesV7 2022
1
Need for Compression
Speed – transmission
Size – memory storage
2
Importance of File Compression when transmitted on the internet
• Speed is important
• Big files take longer to transmit than small files
• Compress files to make them smaller
• Small files take less time to transmit over a network
3
Calculate time required to
transmit a file
• Transmission speed is measured in bits per second – bps (NOT bytes per second)
• Example: file 525 KB (a novel) at 2 Mbps
– Time = (525 X 1000 bytes) X 8 bits / 2 000 000 bits per second
– = 4200000 / 2000000 sec
– = 2.1 seconds
• See also Section 4 Networks 4b Data
transmission
• Try this link: (access denied at school!)
• DownLoad Calculator
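The calculation above can be sketched in Python. This is only an illustration: the function name `transmission_time_seconds` and its parameter names are assumptions, not part of the slides.

```python
def transmission_time_seconds(size_bytes, bits_per_second):
    """Time to send a file: convert bytes to bits, then divide by the link speed."""
    size_bits = size_bytes * 8          # network speeds are quoted in bits, not bytes
    return size_bits / bits_per_second


novel_bytes = 525 * 1000                # 525 KB, using 1 KB = 1000 bytes
link_bps = 2_000_000                    # 2 Mbps = 2 million bits per second

print(transmission_time_seconds(novel_bytes, link_bps))  # 2.1
```

Multiplying by 8 before dividing is the step most often forgotten, because file sizes are given in bytes but speeds in bits.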
4
Calculate time required to
transmit a file - BINARY
• Transmission speed is measured in bits per second – bps (NOT bytes per second)
• Example: file 525 KiB (kibibytes) at 2 Mbps
– Time = (525 X 1024 bytes) X 8 bits / 2 000 000 bits per second
– = 4300800 / 2000000 sec
– ≈ 2.15 seconds
• See also Section 4 Networks 4b Data
transmission
• Try this link: (access denied at school!)
• DownLoad Calculator
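To compare decimal and binary units side by side, the same sum can be written with a small table of unit sizes. This is a sketch only: the names `BYTES_PER_UNIT` and `transfer_seconds` are made up for illustration, and the link speed is assumed to use decimal megabits (1 Mbps = 1 000 000 bits per second), as on the slide.

```python
# Bytes in one unit: decimal units use powers of 1000, binary units powers of 1024
BYTES_PER_UNIT = {
    "KB": 1000,
    "KiB": 1024,
    "MB": 1000 ** 2,
    "MiB": 1024 ** 2,
}


def transfer_seconds(size, unit, megabits_per_second):
    """Seconds needed to send `size` units of data over a link of the given speed."""
    size_bits = size * BYTES_PER_UNIT[unit] * 8
    return size_bits / (megabits_per_second * 1_000_000)


print(transfer_seconds(525, "KB", 2))   # 2.1
print(transfer_seconds(525, "KiB", 2))  # 2.1504
```

The binary answer is slightly larger because a kibibyte holds 24 more bytes than a kilobyte.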
5
Importance of File Compression and
file size and memory
• Amount of memory
can be important
• Small files take up
less space
• One compact disc (CD) can hold
– 10 songs as CD audio
– 120 songs as MP3
6
Calculate storage requirements for files
(binary units: KiB and MiB)
Text
• 1 character = 1 byte
– (Num of chars) = (Num of bytes)
• "Love minus zero no limit."
– 25 bytes (spaces and "." count!)
• Characters in a book?
– 350 words / page
– 6 chars / word
– 350 X 6 chars = 2100 / page
– 250 pages in a book
– 2100 X 250 = 525000 chars
– 525000 bytes
– 525 kilobytes ≈ 0.5 MB
– 525000 / 1024 = 512.7 KiB
– 512.7 / 1024 = 0.5 MiB
Images
• Resolution (how many pixels)
• Colour depth (how many bits per pixel)
– Total bits = Resolution X Depth
– Bytes = Total bits / 8
• Example:
– Size (284 X 177) pixels
– Colour depth 8 bits (256 colours)
– Bits = 284 X 177 X 8
– = 402 144 bits
– Bytes = 50268
– approximately 50 kilobytes
– exactly 50268 / 1024 = 49.1 KiB
7
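Both storage calculations can be checked with a short Python sketch. The function names `text_size_bytes` and `image_size_bytes` are assumptions made for illustration, and the text calculation assumes one byte per character, as the slide does.

```python
def text_size_bytes(num_chars):
    """Text storage, assuming 1 character = 1 byte."""
    return num_chars


def image_size_bytes(width, height, depth_bits):
    """Image storage: total bits = resolution x colour depth, then divide by 8."""
    total_bits = width * height * depth_bits
    return total_bits / 8


# Text: every character counts, including spaces and the full stop
print(text_size_bytes(len("Love minus zero no limit.")))  # 25

# Book: 350 words per page, 6 characters per word, 250 pages
book_bytes = text_size_bytes(350 * 6 * 250)
print(book_bytes)                      # 525000
print(round(book_bytes / 1024, 1))     # 512.7 (KiB)

# Image: 284 x 177 pixels at 8 bits per pixel
image_bytes = image_size_bytes(284, 177, 8)
print(image_bytes)                     # 50268.0
print(round(image_bytes / 1024, 1))    # 49.1 (KiB)
```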
Compression Algorithms
8
Lose data or don’t lose data
9
Compress
• Lossless
– Decompressed data identical to original
– No data has been lost
• Lossy
– Decompressed data NOT the same as original
– Some data has been lost
10
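The difference can be demonstrated with a few lines of Python. This sketch is an illustration rather than anything from the slides: it uses Python's standard `zlib` module for the lossless case, and for the lossy case it simply throws away the lowest 4 bits of some made-up 8-bit sample values.

```python
import zlib

# Lossless: the decompressed bytes are identical to the original
original = b"AAAAAAAAAABBBBBBBBBBCCCCCCCCCC" * 10
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed))   # the compressed size is much smaller
print(restored == original)             # True

# Lossy: keep only the top 4 bits of each 8-bit sample (values are made up)
samples = [200, 201, 203, 198]
stored = [s >> 4 for s in samples]      # 4 bits per sample instead of 8
rebuilt = [s << 4 for s in stored]      # best reconstruction available
print(rebuilt)                          # [192, 192, 192, 192]
print(rebuilt == samples)               # False
```

The lossy version halves the storage per sample, but the small differences between the samples cannot be recovered.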
2 Ways to Compress Data
Lossless & Lossy
11
Lossless Data Compression
Data compression algorithms ..
12
Lossless Compression
Lookup table
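The slide's own lookup-table example is not reproduced in the text, so the following Python sketch is an assumed illustration of the general idea: each distinct word is stored once in a table, and the text is replaced by a list of short codes. The function names and the sample sentence are hypothetical.

```python
def build_lookup_table(words):
    """Give each distinct word the next unused code, in order of first appearance."""
    table = {}
    for word in words:
        if word not in table:
            table[word] = len(table)
    return table


def encode(text):
    """Replace every word by its code; return the table and the list of codes."""
    words = text.split(" ")
    table = build_lookup_table(words)
    return table, [table[word] for word in words]


def decode(table, codes):
    """Rebuild the original text by looking each code up in the table."""
    word_for_code = {code: word for word, code in table.items()}
    return " ".join(word_for_code[code] for code in codes)


sentence = "the cat sat on the mat and the cat ran"
table, codes = encode(sentence)
print(codes)                              # [0, 1, 2, 3, 0, 4, 5, 0, 1, 6]
print(decode(table, codes) == sentence)   # True, so nothing was lost
```

Repeated words such as "the" and "cat" are stored once in the table, which is where the saving comes from.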
14
Lossy Compression
Data lost – can never go back
15
Lossy Compression
Data encoding where some detail is removed
16
Lossy example
18
EXTENSION – Specific File
Types
• A very interesting topic
• But no need to know specific file types
such as JPEG or MP3
• NO LONGER REQUIRED for Edexcel
GCSE Computer Science
19
JPEG – Joint Photographic Experts Group
• Most common format for
storing & transmitting
photographic images on
the world wide web
– .jpg
• lossy
– colour depth 24 bits
– Red (8), Green (8), Blue
(8)
– 16.7 million colours
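The link between colour depth and the number of colours can be checked in Python: each extra bit doubles the number of values a pixel can take. The function name `colours_for_depth` is an assumption made for this sketch.

```python
def colours_for_depth(bits_per_pixel):
    """Number of different colours that can be stored with the given colour depth."""
    return 2 ** bits_per_pixel


print(colours_for_depth(8))    # 256
print(colours_for_depth(24))   # 16777216, roughly 16.7 million
```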
20
GIF – Graphics Interchange Format
• Good for large areas of one colour
• Small web logos & simple animations
– .gif
• lossless compression (LZW); colour detail is lost only when an image is first reduced to 256 colours
– colour depth 8 bits
– 256 colours
• being replaced by PNG
– Portable Network Graphics
– .png
– colour depth 24 bits
– no animations
21
MP3
MPEG-1 Audio Layer III (the audio part of MPEG-1)
• Audio files
– .mp3
• Lossy
• Download music from
internet
22
MPEG
Moving Picture Experts Group
• Video files
• MPEG-1
– low resolution, small
memory
– .mpg
• MPEG-2
– high resolution, large
memory
– .mp2
• lossy
23
PDF Portable Document Format
• Represents documents
independently of
– application software
– hardware
– operating system
– .pdf
• a PDF file describes all
the details of a file
needed to display it
– text, fonts & graphics
• Lossless for text
• Lossy for images
24
Bitmap – raw image data
• Device independent
image files
• Photo editing
• large file sizes
– .bmp
• lossless
– colour depth 24 bits
– Red (8), Green (8), Blue
(8)
– 16.7 million colours
– others exist too:
• 1 bit up to 32 bits
25
EXTENSION – Run Length
Encoding
• A very interesting topic
• Fun to code for …
• NO LONGER REQUIRED for Edexcel
GCSE Computer Science
26
Lossless Compression
Run Length Encoding RLE
RUNs of Text BBBBBBBBBBBBB
28
"RUNs" of Pixels
29
Run Length Encoding - Algorithm
The idea is to replace lengths of repeated letters with a number and the character
repeated.
Just work along the string (Iterate with a loop) ...
AAABBBBCADDEE
IF the next letter is the same THEN add one to a counter ... and carry on to the next
letter.
ELSE the next letter is different to the current letter so add (concatenate) the current
letter and the current count to an output string, put the counter back to 1 and carry on to
the next letter.
AAABBBBCADDEE
A = A so count = 1+1
A = A so count = 2+1
A <> B so write down "A3" set count = 1
B = B so count = 1+1 .........
Final Result:
A3B4C1A1D2E2
This could be read as: “There are 3 As, then 4 Bs, then 1 C, then 1 A ...”
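The steps described above can be sketched in Python. This is one possible version, not a definitive implementation; the function name `rle_encode` and its variable names are assumptions.

```python
def rle_encode(text):
    """Run-length encode text as each letter followed by the length of its run."""
    out = ""
    count = 1
    for i in range(len(text)):
        at_end = (i + 1 == len(text))        # there is no next letter to compare with
        if at_end or text[i] != text[i + 1]:
            out += text[i] + str(count)      # run finished: write the letter and its count
            count = 1                        # start counting the next run
        else:
            count += 1                       # same letter again, so the run grows
    return out


print(rle_encode("AAABBBBCADDEE"))   # A3B4C1A1D2E2
print(rle_encode("BBBBBBBBBBBBB"))   # B13
```

The end-of-string test comes first so that `text[i + 1]` is never read past the last letter.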
30
Run Length Encoding - Algorithm
FUNCTION runLengthEncode(inText) #function name assumed; the original header is not shown
BEGIN FUNCTION
    SET textLength TO LENGTH(inText)
    SET outText TO ""   #encoded result, built up run by run
    SET letCount TO 1   #length of the current run
    #start with first letter of the string and loop along letter by letter
    FOR i FROM 0 TO (textLength - 1) DO #note string index, like lists & arrays, starts at 0
        #Test for the end of the string FIRST so inText[i + 1] is never read "off the end"
        #(assumes OR stops at the first true condition; note <> is not equal)
        IF (i + 1 = textLength) OR (inText[i] <> inText[i + 1]) THEN
            #there is no next letter, or the next letter is different
            #so join letter & count (string concatenation) to outText
            SET outText TO outText + inText[i] + String(letCount)
            SET letCount TO 1 #reset letter counter to 1
        ELSE
            SET letCount TO letCount + 1 #same letter so increment counter
        END IF
    END FOR
    RETURN outText #"pass out" the encoded text and leave the function
END FUNCTION
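Decoding is not covered on the slides. As an assumed extension, the Python sketch below reverses the encoding to show that run length encoding is lossless. The function name `rle_decode` is hypothetical, and the sketch assumes the original text contains no digits, since a digit would be confused with a count.

```python
import re


def rle_decode(encoded):
    """Expand letter-count pairs such as 'A3' back into 'AAA'."""
    # Each pair is one non-digit character followed by one or more digits
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(letter * int(count) for letter, count in pairs)


print(rle_decode("A3B4C1A1D2E2"))   # AAABBBBCADDEE
print(rle_decode("B13"))            # BBBBBBBBBBBBB
```

Getting back exactly the original string is what makes this a lossless method.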
31