25 - Ch03-Data-Storage & Appendix H
Data Storage & Appendix H: Error Detection and Correction
Ting-Yi Chang
Professor
National Changhua University of Education, Taiwan, R.O.C.
Email: tychang@cc.ncue.edu.tw
Tel: 04-7232105 ext.7381
Ch3-Storage- 1
Outline
Data types
Error detection and correction (Appendix H)
Storing numbers
Storing text
Storing audio
Storing images
Storing video
Objectives
Data inside the computer
Error detection and correction (Appendix H)
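Rules for generating Taiwan national ID numbers
1. A Taiwan national ID number has ten characters, written here as N1 N2 N3 N4 N5 N6 N7 N8 N9 N10.
2. N1 is always an uppercase English letter that encodes the place of household registration (e.g., A is Taipei City, B is Taichung City). Each letter maps to a two-digit code:
A=10 B=11 C=12 D=13 E=14 F=15 G=16 H=17 J=18 K=19 L=20 M=21
N=22 P=23 Q=24 R=25 S=26 T=27 U=28 V=29 X=30 Y=31 W=32 Z=33
I=34 O=35
The two digits of this code are written N1-1 (tens digit) and N1-2 (units digit).
3. N2 is the gender field: 1 for male, 2 for female.
4. N3 to N9 are a serial number.
5. N10 is the check digit.
6. The ID number is valid if the following weighted sum leaves a remainder of 0:
(N1-1×1 + N1-2×9 + N2×8 + N3×7 + N4×6 + N5×5 + N6×4 + N7×3 + N8×2 + N9×1 + N10×1) mod 10 = 0
Example:
1. F212345674 is converted to 15 212345674.
2. (1×1)+(5×9)+(2×8)+(1×7)+(2×6)+(3×5)+(4×4)+(5×3)+(6×2)+(7×1)+(4×1) = 150
3. 150 mod 10 = 0, so the number is valid.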
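The check in step 6 can be sketched in Python as follows. This is only an illustration, not an official validator: the names `LETTERS` and `is_valid_tw_id` are made up here, and the letter string assumes the codes run from 10 upward in the order shown (with X=30, Y=31, W=32).

```python
# Letters listed in ascending order of their two-digit code (assumed table):
# index 0 -> 10, index 1 -> 11, ..., index 25 -> 35.
LETTERS = "ABCDEFGHJKLMNPQRSTUVXYWZIO"
WEIGHTS = [1, 9, 8, 7, 6, 5, 4, 3, 2, 1, 1]

def is_valid_tw_id(id_number: str) -> bool:
    """Return True if the weighted sum of the 11 digits is a multiple of 10."""
    if len(id_number) != 10:
        return False
    letter, rest = id_number[0], id_number[1:]
    if letter not in LETTERS or not rest.isdigit():
        return False
    code = LETTERS.index(letter) + 10            # e.g. F -> 15
    digits = [code // 10, code % 10] + [int(c) for c in rest]
    total = sum(d * w for d, w in zip(digits, WEIGHTS))
    return total % 10 == 0

print(is_valid_tw_id("F212345674"))  # the worked example: sum is 150
```

Changing only the last digit of the example makes the sum 151, so the check fails.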
Error Detection and Correction (1/4)
• When data is transferred from one place to another, or moved
from one device to another, the accuracy of the data must be
checked.
• Some applications can tolerate a small level of error
– Random errors in audio or video transmission
• Most applications, such as text, expect a very high level of
accuracy
Error Detection and Correction (2/4)
• Interference
– Caused by crosstalk, external electromagnetic fields, and so on
• Single-bit error vs. burst error (two or more bits)
(See Figure H.1)
Error Detection and Correction (3/4)
• The central concept in correcting errors is redundancy
– Send extra bits with our data
• Two main methods of error correction
– Forward error correction
• The receiver tries to guess the message by using the redundant bits
– Retransmission
• The receiver detects the occurrence of an error and asks the
sender to resend the message
Error Detection and Correction (4/4)
• Redundancy is achieved through various coding schemes.
– The sender adds redundant bits through a process that creates a
relationship between the redundant bits and the actual data bits
– The receiver checks the relationship between the two sets of bits to
detect or correct errors
Coding scheme
• We can divide coding schemes into two broad categories:
– Block coding
– Convolution coding
• This book concentrates on block coding
– Convolution coding is more complex and beyond the scope of this
book
• Block coding
– Divide a message into blocks, each of k bits, called datawords.
– Then add r redundant bits to each block to make the length n = k + r
– The resulting n-bit blocks are called codewords
Example H.2 Error detection
• The sender encodes the dataword 01 as 011 and sends it to the receiver
– If the receiver receives 011 the data is OK
– If the receiver receives 111 the data is corrupted
• However, if two bits are corrupted during transmission and 000 is
received
– The error cannot be detected
– We need a better coding scheme
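The behaviour in Example H.2 matches a single even-parity bit appended to a 2-bit dataword. The sketch below assumes that scheme; the function names are illustrative only.

```python
def encode_even_parity(dataword: str) -> str:
    """Append one bit so that the codeword has an even number of 1s."""
    parity = dataword.count("1") % 2
    return dataword + str(parity)

def looks_valid(codeword: str) -> bool:
    """A received codeword is accepted if its number of 1s is even."""
    return codeword.count("1") % 2 == 0

print(encode_even_parity("01"))  # codeword sent for dataword 01
print(looks_valid("011"))        # received unchanged
print(looks_valid("111"))        # one bit corrupted: rejected
print(looks_valid("000"))        # two bits corrupted: wrongly accepted
```

The last call shows why a single parity bit is not enough: any even number of flipped bits produces another valid codeword.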
Example H.3 Error correction
H.4 Error detection: Cyclic Redundancy Codes (CRC)
- Widely used in networks such as WANs and LANs
- Approach:
– Message: M; polynomial: P (predefined)
– Step 1: Append zeros to M (one fewer than the number of bits in P) and divide by P using modulo-2 division, giving a quotient Q and a remainder R
– Step 2: Transmit M || R (the remainder R is appended to the dataword M)
- Example:
– Message: M = “1001”; polynomial: P = “1011” (x^3 + x + 1)
– Step 1: Compute 1001000 / 1011 = 1010 with remainder 110 (the codeword is assumed to be 7 bits)
– Step 2: Transmit M || R = “1001110” (Q is discarded and R is appended to M)
- The decoder performs the same division on the received codeword. If the remainder is 0, the received data is accepted as correct; otherwise the data is corrupted.
- See Figures H.8 and H.9 on pages 578 and 579 for more details
- Cyclic codes perform well at detecting single-bit
errors, double errors, and burst errors.
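The division in the example can be reproduced with a short modulo-2 (XOR) long division. This is a minimal sketch on bit strings; `mod2_remainder` and `crc_encode` are names chosen here, not part of any library.

```python
def mod2_remainder(bits: str, poly: str) -> str:
    """Remainder of modulo-2 long division of `bits` by `poly`."""
    work = [int(b) for b in bits]
    divisor = [int(b) for b in poly]
    for i in range(len(work) - len(divisor) + 1):
        if work[i] == 1:                      # divisor "goes into" this position
            for j, d in enumerate(divisor):
                work[i + j] ^= d              # subtract = XOR in modulo-2
    r = len(divisor) - 1                      # remainder has one bit fewer than poly
    return "".join(str(b) for b in work[-r:])

def crc_encode(message: str, poly: str) -> str:
    """Append the CRC remainder to the message (M || R)."""
    padded = message + "0" * (len(poly) - 1)
    return message + mod2_remainder(padded, poly)

codeword = crc_encode("1001", "1011")
print(codeword)                               # 1001110
print(mod2_remainder(codeword, "1011"))       # 000 means no error detected
print(mod2_remainder("1011110", "1011"))      # non-zero: corruption detected
```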
H.5 Error detection: Checksum
- Used in networks by several protocols
- Traditionally, the IP protocol has used a 16-bit checksum
- In addition to sending the numbers, we also send their sum
– For example, if the set of numbers is (7, 11, 12, 0, 6), we send (7,
11, 12, 0, 6, 36), where 36 is the sum of the original numbers
- A checksum is not as strong as a CRC in its error-checking
capability.
– If the value of one word is incremented and the value of
another word is decremented by the same amount, the two
errors cannot be detected.
- See Example H.9 and Figure H.10 on page 580 for more
details
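The simple sum described above, and its blind spot, can be sketched as follows. This uses a plain integer sum as in the slide's example, not the 16-bit arithmetic of real protocols; the function names are illustrative.

```python
def add_checksum(numbers):
    """Return the numbers followed by their sum."""
    return list(numbers) + [sum(numbers)]

def verify(packet) -> bool:
    """Accept the packet if the last value equals the sum of the others."""
    *data, checksum = packet
    return sum(data) == checksum

packet = add_checksum([7, 11, 12, 0, 6])
print(packet)                            # [7, 11, 12, 0, 6, 36]
print(verify(packet))                    # True
print(verify([7, 11, 13, 0, 6, 36]))     # False: single error detected
print(verify([8, 10, 12, 0, 6, 36]))     # True: +1 and -1 cancel, error missed
```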
Error Correction: Hamming Code
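Hamming code example:
1. Choose the number of check bits R so that M + R + 1 ≤ 2^R, where M is the number of data bits and R is the number of check bits. For 8 data bits, 9 + R ≤ 2^R gives R = 4, so 4 check bits are needed and the Hamming codeword has M + R = 8 + 4 = 12 bits.
2. Number the bit positions and fill in the data bits. The check bits, in the order C1, C2, C3, ..., are labelled C1, C2, C4, C8, ... and placed in order at bit positions P1, P2, P4, P8, ..., as follows:
Bit position   P12   P11   P10   P9    P8    P7    P6    P5    P4    P3    P2    P1
Position code  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Data bit       D8    D7    D6    D5    -     D4    D3    D2    -     D1    -     -
Check bit      -     -     -     -     C8    -     -     -     C4    -     C2    C1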
3. Take the data 1001,0110 as an example.
Bit position   P12   P11   P10   P9    P8    P7    P6    P5    P4    P3    P2    P1
Position code  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Data bit       1     0     0     1     -     0     1     1     -     0     -     -
Check bit      -     -     -     -     C8    -     -     -     C4    -     C2    C1
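4. For every bit position whose data bit is 1, enter its position code in the table below, then compute even parity down each column Cn.
Encoding       Check bits
Bit position   C8  C4  C2  C1
P1             -   -   -   -
P2             -   -   -   -
P3             -   -   -   -
P4             -   -   -   -
P5             0   1   0   1
P6             0   1   1   0
P7             -   -   -   -
P8             -   -   -   -
P9             1   0   0   1
P10            -   -   -   -
P11            -   -   -   -
P12            1   1   0   0
Even parity    0   1   1   0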
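5. Write the even-parity results back into the corresponding check-bit positions of the position table.
Bit position   P12   P11   P10   P9    P8    P7    P6    P5    P4    P3    P2    P1
Position code  1100  1011  1010  1001  1000  0111  0110  0101  0100  0011  0010  0001
Data bit       1     0     0     1     -     0     1     1     -     0     -     -
Check bit      -     -     -     -     0     -     -     -     1     -     1     0
The resulting Hamming codeword is 1001,0011,1010.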
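Hamming code example: checking a received codeword
1. For every bit position that holds a 1 (data or check bit), enter its position code in the table below, then XOR down each column Cn. A result of 0000 means the codeword is correct; a non-zero result gives the bit position whose value is in error.
Verification   Check bits
Bit position   C8  C4  C2  C1
P1             -   -   -   -
P2             0   0   1   0
P3             -   -   -   -
P4             0   1   0   0
P5             0   1   0   1
P6             0   1   1   0
P7             -   -   -   -
P8             -   -   -   -
P9             1   0   0   1
P10            -   -   -   -
P11            -   -   -   -
P12            1   1   0   0
Even parity    0   0   0   0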
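2. Suppose the codeword arrives with an error at P5, as 1001,0010,1010. It is verified as follows:
Verification   Check bits
Bit position   C8  C4  C2  C1
P1             -   -   -   -
P2             0   0   1   0
P3             -   -   -   -
P4             0   1   0   0
P5             -   -   -   -
P6             0   1   1   0
P7             -   -   -   -
P8             -   -   -   -
P9             1   0   0   1
P10            -   -   -   -
P11            -   -   -   -
P12            1   1   0   0
Even parity    0   1   0   1
The result 0101 is 5 in decimal, so the bit at P5 is in error and is corrected by flipping it.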
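The encoding and checking steps above amount to XOR-ing the position numbers of all bits that are 1. A minimal Python sketch under that reading follows; the function names are illustrative, and bit strings are written from the highest position down to P1, as in the tables.

```python
from functools import reduce

def xor_of_one_positions(word: dict) -> int:
    """XOR together the position numbers of all bits that are 1."""
    return reduce(lambda a, b: a ^ b, (p for p, bit in word.items() if bit), 0)

def hamming_encode(data: str) -> str:
    """Encode data written as D_M ... D1 into a codeword written P_n ... P1."""
    m = len(data)
    r = 0
    while 2 ** r < m + r + 1:            # smallest R with M + R + 1 <= 2^R
        r += 1
    n = m + r
    data_bits = iter(int(b) for b in reversed(data))   # D1 first
    word = {}
    for pos in range(1, n + 1):
        is_check = pos & (pos - 1) == 0  # powers of two hold check bits
        word[pos] = 0 if is_check else next(data_bits)
    parity = xor_of_one_positions(word)  # column-wise even parity
    for i in range(r):
        word[1 << i] = (parity >> i) & 1
    return "".join(str(word[p]) for p in range(n, 0, -1))

def syndrome(codeword: str) -> int:
    """0 if the codeword checks out, otherwise the position of a single-bit error."""
    n = len(codeword)
    word = {n - i: int(b) for i, b in enumerate(codeword)}
    return xor_of_one_positions(word)

print(hamming_encode("10010110"))    # 100100111010
print(syndrome("100100111010"))      # 0
print(syndrome("100100101010"))      # 5, i.e. P5 is wrong
```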
1. Unsigned representation
2. Signed representation
(1) Sign-and-magnitude representation
(2) Two’s complement representation
(3) One’s complement representation
Storing integers
Integers are whole numbers (numbers without a fractional
part). For example, 134 and −125 are integers, whereas
134.23 and −0.235 are not.
An integer can be thought of as a number in which the
position of the decimal point is fixed:
the decimal point is to the right of the least significant
(rightmost) bit.
For this reason, fixed-point representation is used to store
an integer, as shown in Figure 3.4.
In this representation the decimal point is assumed but not
stored.
Storing integers: Unsigned representation
Unsigned integer representation
Example 3.1
Store 7 in an 8-bit memory location using unsigned
representation.
Solution
First change the integer to binary, (111)₂. Add five 0s to make a total of
eight bits, (00000111)₂. The integer is stored in the memory location.
Note that the subscript 2 is used to emphasize that the integer is binary,
but the subscript is not stored in the computer.
Example 3.2
Store 258 in a 16-bit memory location using unsigned representation.
Solution
First change the integer to binary, (100000010)₂. Add seven 0s to make
a total of sixteen bits, (0000000100000010)₂. The integer is stored in
the memory location.
Example 3.3
What is returned from an output device when it retrieves the bit string
00101011 stored in memory as an unsigned integer?
Solution
Using the procedure shown in Chapter 2, the binary integer 00101011
is converted to the unsigned integer 43.
Figure 3.5 shows what happens if we try to store an integer
that is larger than 2^4 − 1 = 15 in a memory location that
can only hold four bits (overflow).
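Storing, retrieving, and overflowing an unsigned integer can be sketched as below. The function names are illustrative, the overflow behaviour (keeping only the rightmost n bits) is an assumption about how the hardware truncates, and the value 20 is an example chosen here rather than one taken from the figure.

```python
def store_unsigned(value: int, n_bits: int) -> str:
    """Return the n-bit pattern for a non-negative integer.

    Values that do not fit are truncated to their rightmost n bits.
    """
    if value < 0:
        raise ValueError("unsigned representation cannot hold negative values")
    return format(value % (1 << n_bits), f"0{n_bits}b")

def retrieve_unsigned(bits: str) -> int:
    """Interpret a bit string as an unsigned integer."""
    return int(bits, 2)

print(store_unsigned(7, 8))            # 00000111
print(store_unsigned(258, 16))         # 0000000100000010
print(retrieve_unsigned("00101011"))   # 43
print(store_unsigned(20, 4))           # 0100: 20 does not fit in four bits
```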
Sign-and-magnitude representation
Example 3.4
Solution
The integer is changed to 7-bit binary. The leftmost bit is set to 0.
The 8-bit number is stored.
Example 3.5
Store -28 in an 8-bit memory location using sign-and-magnitude
representation.
Solution
The integer is changed to 7-bit binary. The leftmost bit is set to 1.
The 8-bit number is stored.
Example 3.6
Retrieve the integer that is stored as 01001101 in sign-and-magnitude
representation.
Solution
Since the leftmost bit is 0, the sign is positive. The rest of the bits
(1001101) are changed to decimal as 77. After adding the sign, the
integer is +77
Example 3.7
Retrieve the integer that is stored as 10100001 in sign-and-magnitude
representation.
Solution
Since the leftmost bit is 1, the sign is negative. The rest of the bits
(0100001) are changed to decimal as 33. After adding the sign, the
integer is −33.
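The sign-and-magnitude examples can be checked with a short sketch. The function names are illustrative; the leftmost bit holds the sign and the remaining n − 1 bits hold the magnitude.

```python
def store_sign_magnitude(value: int, n_bits: int) -> str:
    """Leftmost bit is the sign (0 = +, 1 = -); the rest is the magnitude."""
    magnitude = abs(value)
    if magnitude >= 1 << (n_bits - 1):
        raise OverflowError("magnitude does not fit in n_bits - 1 bits")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{n_bits - 1}b")

def retrieve_sign_magnitude(bits: str) -> int:
    """Decode a sign-and-magnitude bit string."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude

print(store_sign_magnitude(-28, 8))          # 10011100
print(retrieve_sign_magnitude("01001101"))   # 77
print(retrieve_sign_magnitude("10100001"))   # -33
```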
Note: do not confuse one’s complementing with sign-and-magnitude representation.
Example 3.8
The following shows how we take the one’s complement of the
integer 00110110.
One’s Complementing
Example 3.9
The following shows that we get the original integer if we apply the
one’s complement operations twice.
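One’s complementing simply flips every bit, which a one-line sketch makes concrete (the function name is illustrative):

```python
def ones_complement(bits: str) -> str:
    """Flip every bit: 0 becomes 1 and 1 becomes 0."""
    return "".join("1" if b == "0" else "0" for b in bits)

original = "00110110"
once = ones_complement(original)
twice = ones_complement(once)
print(once)                 # 11001001
print(twice == original)    # True: applying it twice restores the integer
```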
Two’s Complementing
• This operation is done in two steps. First, we copy bits from the
right until a 1 is copied; then, we flip the rest of the bits.
• Provided no overflow occurs, two’s complementing any number yields the
value with the opposite sign (e.g., 6 → −6, −3 → 3).
Example 3.11
The following shows that we always get the original integer if we
apply the two’s complement operation twice.
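The copy-then-flip procedure can be written directly on bit strings. This is a minimal sketch; the function name is illustrative.

```python
def twos_complement(bits: str) -> str:
    """Copy bits from the right up to and including the first 1, flip the rest."""
    last_one = bits.rfind("1")
    if last_one == -1:          # all zeros: nothing to flip
        return bits
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    return flipped + bits[last_one:]

six = "00000110"
minus_six = twos_complement(six)
print(minus_six)                          # 11111010
print(twos_complement(minus_six) == six)  # True: twice gives the original
```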
Example 3.13
Store −28 in an 8-bit memory location using two’s complement
representation.
Solution
The integer is negative, so after changing to binary, the computer
applies the two’s complement operation on the integer.
Example 3.14
Retrieve the integer that is stored as 00001101 in memory in two’s
complement format.
Solution
The leftmost bit is 0, so the sign is positive. The integer is changed to
decimal and the sign is added.
Example 3.15
Retrieve the integer that is stored as 11100110 in memory using two’s
complement format.
Solution
The leftmost bit is 1, so the integer is negative. The integer needs to
be two’s complemented before changing to decimal.
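Examples 3.13 to 3.15 can be reproduced with the sketch below. Instead of the copy-and-flip steps, it uses the equivalent arithmetic view (a negative value v is stored as 2^n + v); the function names are illustrative.

```python
def store_twos_complement(value: int, n_bits: int) -> str:
    """Return the n-bit two's complement pattern of a signed integer."""
    low, high = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not low <= value <= high:
        raise OverflowError("value does not fit in n_bits")
    return format(value % (1 << n_bits), f"0{n_bits}b")

def retrieve_twos_complement(bits: str) -> int:
    """Decode an n-bit two's complement pattern."""
    value = int(bits, 2)
    if bits[0] == "1":               # leftmost bit 1 means negative
        value -= 1 << len(bits)
    return value

print(store_twos_complement(-28, 8))          # 11100100
print(retrieve_twos_complement("00001101"))   # 13
print(retrieve_twos_complement("11100110"))   # -26
```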
There is only one zero in two’s complement notation.
Storing reals
Floating-point representation: scientific notation in decimal system
Example 3.18
The following shows the decimal number
7,425,000,000,000,000,000,000.00
in scientific notation (7.425E21, i.e., 7.425 × 10^21).
The three sections are the sign (+), the shifter (21) and the fixed-point
part (7.425). Note that the shifter is the exponent.
Floating-point representation:
scientific notation in binary system
Example 3.19
Show the number
−0.0000000000000232
in scientific notation (floating-point representation).
Solution
The three sections are the sign (-), the shifter (-14) and the fixed-point
part (2.32). Note that the shifter is the exponent.
Floating-point representation
Example 3.20
Show the number
(101001000000000000000000000000000.00)₂
in floating-point representation.
Solution
We use the same idea, keeping only one digit to the left of the decimal
point.
Example 3.21
Show the number
−(0.00000000000000000000000101)₂
in floating-point representation.
Solution
We use the same idea, keeping only one digit to the left of the decimal
point.
Normalization
• To make the fixed part of the representation uniform,
both the scientific method (for the decimal system) and
the floating-point method (for the binary system) use
only one non-zero digit to the left of the decimal
point. This is called normalization.
• In the decimal system this digit can be 1 to 9, while in the
binary system it can only be 1 (the only digits in the binary
system are 0 and 1). In the following, d is a non-zero digit, x
is a digit, and y is either 0 or 1.
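Normalizing a binary number written as a string can be sketched as follows. The function name and the string-based input format are assumptions made for illustration; the sign is handled separately and is not part of the input.

```python
def normalize_binary(number: str):
    """Return (normalized_string, exponent) for an unsigned binary string.

    The result has exactly one non-zero bit to the left of the point.
    """
    int_part, _, frac_part = number.partition(".")
    digits = int_part + frac_part
    first_one = digits.find("1")
    if first_one == -1:
        raise ValueError("zero cannot be normalized")
    # How far the point moves: left shifts are positive, right shifts negative.
    exponent = len(int_part) - 1 - first_one
    fraction = digits[first_one + 1:].rstrip("0")
    return "1." + (fraction or "0"), exponent

print(normalize_binary("101.11"))      # ('1.0111', 2)
print(normalize_binary("0.0000011"))   # ('1.1', -6)
```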
After a binary number is normalized, only the sign, the exponent
(the shifting of the decimal point), and the mantissa are stored.
• Note that the point and the bit 1 to the left of the fixed-
point section are not stored; they are implicit.
• The mantissa is a fractional part that, together with the
sign, is treated like an integer stored in sign-and-
magnitude representation.
Excess system for the exponent part
The exponent, which shows how many bits the decimal point should
be moved to the left or right, is a signed number.
Although this could have been stored using two’s complement
representation, a different representation, called the Excess system,
is used instead.
In the Excess system, both positive and negative integers are
stored as unsigned integers.
To represent a positive or negative integer, a positive integer
(called a bias) is added to each number to shift them
uniformly to the non-negative side. The value of this bias is
2^(m−1) − 1, where m is the size in bits of the memory location that
stores the exponent.
Why is the bias in the Excess system 2^(m−1) − 1?
In an m-bit Excess system, the smallest value is −(2^(m−1) − 1)
(e.g., −7 with 4 bits). To make all values non-negative, the bias
should be 2^(m−1) − 1.
Example 3.22
We can express sixteen integers in a number system with 4-bit
allocation. By adding seven units to each integer in this range, we can
uniformly translate all integers to the right and make all of them
non-negative without changing the relative position of the integers with
respect to each other, as shown in the figure. The new system is
referred to as Excess-7, or biased representation with biasing value of 7.
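The bias formula and the Excess-7 mapping can be checked with a small sketch (function names are illustrative):

```python
def bias(m: int) -> int:
    """Bias of an m-bit Excess system: 2^(m-1) - 1."""
    return 2 ** (m - 1) - 1

def excess_encode(value: int, m: int) -> str:
    """Store a signed integer as the unsigned pattern of value + bias."""
    stored = value + bias(m)
    if not 0 <= stored < 2 ** m:
        raise OverflowError("value is outside the range of this Excess system")
    return format(stored, f"0{m}b")

def excess_decode(bits: str) -> int:
    """Recover the signed integer by subtracting the bias."""
    return int(bits, 2) - bias(len(bits))

print(bias(4), bias(8))                            # 7 127
print(excess_encode(-7, 4), excess_encode(8, 4))   # 0000 1111
print(excess_encode(2, 8))                         # 10000001 (Excess_127)
print(excess_decode("01111001"))                   # -6
```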
IEEE Standard 754
IEEE 754 specifications
Example 3.23 Show the Excess_127 (single precision)
representation of the decimal number 5.75.
Solution
a. The sign is positive, so S = 0.
b. Decimal to binary transformation: 5.75 = (101.11)₂.
c. Normalization: (101.11)₂ = (1.0111)₂ × 2^2.
d. E = 2 + 127 = 129 = (10000001)₂, M = 0111. We need to add
nineteen zeros at the right of M to make it 23 bits.
e. The representation (S, E, M) is shown below:
0 10000001 01110000000000000000000
Example 3.25 Show the Excess_127 (single precision)
representation of the decimal number −0.0234375.
Solution
a. S = 1 (the number is negative).
b. Decimal to binary transformation: 0.0234375 = (0.0000011)₂.
c. Normalization: (0.0000011)₂ = (1.1)₂ × 2^−6.
d. E = −6 + 127 = 121 = (01111001)₂ and M = (1)₂. We need to add
twenty-two zeros at the right of M to make it 23 bits.
e. Representation (S, E, M):
1 01111001 10000000000000000000000
Example 3.26 The bit pattern (11001010000000000111000100001111)₂ is
stored in Excess_127 format. Show the value in decimal.
Solution
a. The first bit represents S, the next eight bits, E and the remaining 23
bits, M.
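The single-precision examples can be cross-checked with Python’s standard `struct` module. The helper names below are illustrative, and `bits_to_float` is a sketch that handles only normalized values (it ignores zero, denormals, infinities, and NaN).

```python
import struct

def float_to_bits(x: float) -> str:
    """32-bit IEEE 754 single-precision pattern of x, as a bit string."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", x))
    return format(as_int, "032b")

def split_fields(bits: str):
    """Split a 32-bit pattern into (S, E, M): 1, 8 and 23 bits."""
    return bits[0], bits[1:9], bits[9:]

def bits_to_float(bits: str) -> float:
    """Decode a normalized single-precision pattern by hand."""
    s, e, m = split_fields(bits)
    exponent = int(e, 2) - 127                # remove the Excess_127 bias
    mantissa = 1 + int(m, 2) / 2 ** 23        # restore the implicit leading 1
    return (-1) ** int(s) * mantissa * 2.0 ** exponent

print(split_fields(float_to_bits(5.75)))
print(split_fields(float_to_bits(-0.0234375)))
print(bits_to_float("11001010000000000111000100001111"))
```

For the pattern in Example 3.26, the fields are S = 1, E = 10010100 (148, so the exponent is 21), and M = 00000000111000100001111, which this sketch evaluates to −2,104,387.75.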
3-3 STORING TEXT (1/3)
STORING TEXT (2/3)
We can represent each symbol with a bit pattern. In other words, text
such as “CATS”, which is made up from four symbols, can be
represented as four n-bit patterns, each pattern defining a single
symbol (Figure 3.14).
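As an illustration, the sketch below turns “CATS” into four bit patterns, assuming n = 8 and the ASCII code values returned by Python’s built-in `ord`; the function name is illustrative.

```python
def to_bit_patterns(text: str, n_bits: int = 8):
    """Return one n-bit pattern per symbol, using each symbol's code value."""
    return [format(ord(symbol), f"0{n_bits}b") for symbol in text]

for symbol, pattern in zip("CATS", to_bit_patterns("CATS")):
    print(symbol, pattern)
```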
STORING TEXT (3/3)
Codes for storing text
ASCII: Uses 7-bit patterns to represent most symbols
used in written English text; later extended to 8 bits.
Unicode: Originally used 16-bit patterns to represent the major
symbols used in languages worldwide; it now uses up to 32 bits
to represent most symbols used in languages worldwide.
ASCII and Unicode are discussed in more detail next.
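ASCII (American Standard Code for Information Interchange)
• Because exchanging information matters, and to unify the encoding of text symbols so that computers of different makes and models can all use the same standardized interchange code, the American national standards body defined the ASCII code as a standard code for data transmission.
• Early on, 7 bits were used to represent the English letters, the digits 0 to 9, and other symbols. Today 8 bits are used (Extended ASCII), which can represent 256 different characters and symbols. It is the most common and most widely used standard code for English across computer systems.
• In contrast to ASCII, the most widely used internal code in Chinese-language systems is Big-5.
• Unicode, developed jointly by several well-known software and hardware vendors, is a newer standard for representing data. Unicode uses 2 or 4 bytes (16 or 32 bits) for each symbol, which allows 65,536 (2^16) or about 4.29 billion (2^32) distinct bit patterns. Besides English, it can include Chinese and Japanese, which have the largest numbers of characters, as well as the writing symbols of countries around the world, so that information can be exchanged across borders. (Windows 98 already supported Unicode.)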
Extended ASCII Table (8bits)
Representing Text – Unicode
http://www.unicode.org/
3-4 STORING AUDIO
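Analog TV signals are either over-the-air or cable. In the early days, before cable TV operators were widespread, most households received the over-the-air stations (TTV, CTV, CTS). Later, more and more households installed cable, whose analog signal offered not only more varied programming but also much better reception than over-the-air broadcasts.
With advances in video compression, the analog picture signal can be digitized into a stream of data (a digital signal), that is, a binary signal made up only of 0s and 1s (similar to the signals processed inside a computer system).
Digital signal processing can remove noise and interference, so the picture becomes clearer and finer, and it also provides stereo (Dolby AC3) sound quality. The signal is then digitally modulated and delivered to the home, giving better picture and sound quality than analog TV.
Because of the nature of digital signals, reception is also easier than with analog signals and does not require lengthy tuning.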
An audio signal
Storing audio: Sampling
Encoding
Today the dominant standard for storing audio is MP3 (short for
MPEG Layer 3).
This standard is a modification of the MPEG (Moving Picture
Experts Group) compression method used for video.
It uses 44,100 samples per second (44.1 kHz) and 16 bits per
sample.
The result is a signal with a bit rate of 705,600 bits (about 86 KB)
per second, which is compressed using a method that
discards information that cannot be detected by the human ear.
This is called lossy compression, as opposed to
lossless compression (compression is discussed further in
Chapter 15).
• Musical CD
– Sampling rate 44,100 samples per second
– Each sample is 16 bits (32 bits for stereo)
– Each second of music recorded in stereo requires 44100*32 =
1,411,200 bits
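The bit rates quoted for the uncompressed MP3 input and for a stereo CD both follow from samples per second × bits per sample × channels. A minimal sketch (the function name is illustrative):

```python
def bits_per_second(sample_rate: int, bits_per_sample: int, channels: int = 1) -> int:
    """Uncompressed audio bit rate."""
    return sample_rate * bits_per_sample * channels

mono = bits_per_second(44_100, 16)                  # one channel
stereo = bits_per_second(44_100, 16, channels=2)    # two channels
print(mono, stereo)                                 # 705600 1411200
print(round(mono / 8 / 1024, 1))                    # about 86.1 KB per second
```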
3-5 STORING IMAGES
Color depth
The number of bits used to represent a pixel, its color
depth, depends on how the pixel’s color is handled by
different encoding techniques.
The perception of color is how our eyes respond to a beam
of light.
Our eyes have different types of photoreceptor cells:
some respond to the three primary colors red,
green and blue (often called RGB), while others merely
respond to the intensity of light.
True-Color
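In everyday use, "resolution" means a width × height expression such as 50 × 50 px. It is commonly used to state the width and height, in pixels, of an image or a screen, for example a 1920 × 1080 px screen. Thirty years ago, screen resolutions were only 640 × 480 px or lower, and individual pixels were clearly visible on the screens of that time. Digital images, however, can be scaled up or down.
Take two 15-inch screens, one with a very old resolution of 640 × 480 px and the other with 1,920 × 1,440 px. Their pixel counts in the X and Y directions each differ by a factor of three. The difference can be appreciated by comparing screenshots of the same game from different eras.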
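Strictly speaking, resolution is the number of pixels per unit length, and the unit of length used is the inch. It can be further divided into ppi and dpi: ppi (pixels per inch) is image resolution, and dpi (dots per inch) is output resolution. A pixel is a picture element of a digital image, whereas a dot is a printed dot in the output.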
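As a rough illustration of pixels per inch, the sketch below compares two 15-inch screens. It assumes that the 15 inches is the diagonal and that pixels are square; the function name is illustrative.

```python
import math

def pixels_per_inch(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixel density along the diagonal, assuming square pixels."""
    return math.hypot(width_px, height_px) / diagonal_inches

old = pixels_per_inch(640, 480, 15)
new = pixels_per_inch(1920, 1440, 15)
print(round(old, 1), round(new, 1))   # about 53.3 and 160.0 ppi
print(round(new / old, 2))            # three times the linear density
```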
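When the iPhone 4 was announced in 2010, Jobs explained that when you hold the phone 10 to 12 inches away, 326 ppi is the limit at which the naked eye can distinguish pixels. In other words, at a viewing distance of 10 to 12 inches, is the human eye really unable to perceive anything finer than 326 ppi?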
Digital Camera
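"2,500 pixels" refers to the total number of pixels. It is commonly used for the pixel count of a photo or of a camera sensor, for example a 2,500-pixel photo or a megapixel camera. "Megapixel" is abbreviated MP, so saying that the iPhone camera is 12 MP means that the effective area of its sensor has about twelve million pixels; without cropping or scaling it can output a photo of 4,032 × 3,024 ≈ 12 million pixels.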
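The pixel counts mentioned above are simply width × height. A minimal sketch (function names are illustrative):

```python
def total_pixels(width_px: int, height_px: int) -> int:
    """Total number of pixels in an image or on a sensor."""
    return width_px * height_px

def megapixels(width_px: int, height_px: int) -> float:
    """Pixel count in millions (MP)."""
    return total_pixels(width_px, height_px) / 1_000_000

print(total_pixels(50, 50))                  # 2500
print(total_pixels(4032, 3024))              # 12192768
print(round(megapixels(4032, 3024), 1))      # 12.2, marketed as 12 MP
```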
Standards for image encoding
3-6 STORING VIDEO
Video is a representation of images (called frames) over
time. A movie consists of a series of frames shown one after
another.
In other words, video is the representation of information
that changes in space and in time.
So, if we know how to store an image inside a computer,
we also know how to store video: each image or frame is
transformed into a set of bit patterns and stored. The
combination of the images then represents the video.