2.data - Representation - UNIT 2-2
One easy, direct method for converting the integer part of a decimal number to binary is to
first write out the binary place values (..., 64, 32, 16, 8, 4, 2, 1) and then proceed as follows:
Step 1: Take the integer part, e.g. 43, and find the largest binary place value that is less than
or equal to it. In this example it is 32. Place a 1 at 32.
Step 2: Subtract the place value from the number. In this case subtract 32 from 43, which leaves 11.
Step 3: Repeat the two steps above with the remainder until you get 0 at Step 2.
Step 4: On getting a 0, put 0 at all other place values. For 43 this gives 101011 (32 + 8 + 2 + 1).
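The four steps above can be sketched in Python roughly as follows. The function name is only illustrative, and the sketch assumes a non-negative integer input.

```python
def to_binary_by_place_values(n: int) -> str:
    """Convert a non-negative integer to binary by subtracting place values."""
    if n == 0:
        return "0"
    # Step 1: find the largest power of two that is <= n.
    place = 1
    while place * 2 <= n:
        place *= 2
    bits = []
    # Steps 2-4: walk down the place values, subtracting where they fit.
    while place >= 1:
        if place <= n:
            bits.append("1")   # this place value is used
            n -= place
        else:
            bits.append("0")   # this place value is not used
        place //= 2
    return "".join(bits)


print(to_binary_by_place_values(43))  # 101011
```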
ALPHANUMERIC REPRESENTATION
A set containing the letters of the alphabet (in both cases), the decimal digits (10 in number)
and special characters (roughly 10-15 in number) consists of at least 70-80 elements.
ASCII: One such standard code for character encoding that is popularly used is ASCII
(American Standard Code for Information Interchange).
This code uses 7 bits to represent 128 characters, which include 32 non-printing control
characters, alphabets in lower and upper case, decimal digits, and other printable
characters that are available on your keyboard. ASCII was extended to 8 bits to represent
256 characters (called Extended ASCII codes).
The major strength of ASCII is that it is quite elegant in the way it represents characters.
Because of the way the characters are laid out, it is easy to write code that converts between
upper and lower case or checks for valid data ranges. In the original ASCII
the 8th bit (the most significant bit) was used for the purpose of error checking as a
check bit.
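As a small illustration of why the ASCII layout is convenient: upper and lower case letters differ only in the bit with value 0x20, so case can be toggled with one XOR. The sketch below also fills the 8th bit with a check bit. Even parity is assumed here purely for illustration, and both function names are hypothetical.

```python
def toggle_case(ch: str) -> str:
    """Flip the case of an ASCII letter; leave other characters unchanged."""
    code = ord(ch)
    is_letter = ord("A") <= code <= ord("Z") or ord("a") <= code <= ord("z")
    # 'A' is 0x41 and 'a' is 0x61: they differ only in bit 0x20.
    return chr(code ^ 0x20) if is_letter else ch


def with_even_parity(ch: str) -> int:
    """Return the 7-bit code with an even-parity check bit in bit 7 (assumed scheme)."""
    code = ord(ch) & 0x7F
    parity = bin(code).count("1") % 2   # 1 if the number of 1 bits is odd
    return code | (parity << 7)


print(toggle_case("a"), toggle_case("Z"))    # A z
print(format(with_even_parity("C"), "08b"))  # 11000011
```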
Developed by: Thilak Reddy, CSE Dept, RGUKT Page 5
EBCDIC: Extended Binary Coded Decimal Interchange Code (EBCDIC) is a character-
encoding format used by IBM mainframes and midrange systems such as the AS/400. It is an
8-bit code and is not compatible with ASCII. It was designed primarily for ease of use with
punched cards.
In other words, to obtain the 1’s complement of a binary number, we only have to change all the
1’s of the number to 0’s and all the 0’s to 1’s. This can be done by complementing each bit of
the binary number.
The 1’s complement of 1010 is 0101.
2’s complement: The 2's complement of a binary number is obtained by adding 1 to the 1's
complement of the number. Examples: the 2's complement of "0111" is "1001", and the 2's
complement of "1100" is "0100".
Adding 1 to the 1’s complement generates the 2’s complement.
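A minimal sketch of both complements on fixed-width bit strings, using the examples above. The function names are illustrative, and any carry out of the most significant bit is assumed to be discarded.

```python
def ones_complement(bits: str) -> str:
    """Flip every bit of a binary string."""
    return "".join("1" if b == "0" else "0" for b in bits)


def twos_complement(bits: str) -> str:
    """Add 1 to the 1's complement, keeping the same width."""
    width = len(bits)
    value = (int(ones_complement(bits), 2) + 1) % (1 << width)  # drop the carry out
    return format(value, f"0{width}b")


print(ones_complement("1010"))  # 0101
print(twos_complement("0111"))  # 1001
print(twos_complement("1100"))  # 0100
```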
Arithmetic addition:
The complexity of arithmetic addition depends on the representation that is
used. Let us discuss this with the help of the following example.
Example:
Add 25 and -30 in binary using 8 bit registers, using:
• Signed magnitude representation
• Signed 1’s complement
• Signed 2’s complement
In signed magnitude representation, to add two numbers of which only one is negative, we
first have to compare the magnitudes of the numbers.
The number with the smaller magnitude is then subtracted from the one with the larger
magnitude, and the result takes the sign of the larger number.
Implementing such a scheme in digital hardware requires a long sequence of
control decisions as well as circuits that can add, compare and subtract numbers.
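To make the comparison concrete, here is a rough sketch of adding 25 and -30 in 8 bits under each of the three representations. All helper names are illustrative, and overflow handling is deliberately left out.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1   # 0xFF, all eight bits
SIGN = 1 << (WIDTH - 1)   # 0x80, the sign bit
MAG = SIGN - 1            # 0x7F, the magnitude bits


def encode_sign_magnitude(n): return (SIGN if n < 0 else 0) | (abs(n) & MAG)
def encode_ones(n): return n & MASK if n >= 0 else ~(-n) & MASK
def encode_twos(n): return n & MASK


def add_sign_magnitude(x, y):
    sx, mx = x & SIGN, x & MAG
    sy, my = y & SIGN, y & MAG
    if sx == sy:                      # same sign: add magnitudes, keep the sign
        return sx | ((mx + my) & MAG)
    if mx == my:                      # equal magnitudes, opposite signs
        return 0
    if mx > my:                       # subtract smaller from larger,
        return sx | (mx - my)         # result takes the sign of the larger
    return sy | (my - mx)


def add_ones(x, y):
    total = x + y
    if total > MASK:                  # end-around carry
        total = (total & MASK) + 1
    return total & MASK


def add_twos(x, y):
    return (x + y) & MASK             # plain addition, carry out discarded


for name, enc, add in [("signed magnitude", encode_sign_magnitude, add_sign_magnitude),
                       ("1's complement", encode_ones, add_ones),
                       ("2's complement", encode_twos, add_twos)]:
    print(f"{name:17} {enc(25):08b} + {enc(-30):08b} = {add(enc(25), enc(-30)):08b}")
```

Each result encodes -5: 10000101 in signed magnitude, 11111010 in 1's complement and 11111011 in 2's complement. Only the signed magnitude version needs the compare-and-subtract logic described above.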
Adder:
An adder is a kind of calculator that adds two binary numbers. By calculator I
don’t mean one with buttons; this one is a circuit that can be integrated with many other
circuits for a wide range of applications. There are two kinds of adders:
1. Half adder
2. Full adder
Half Adder
With the help of half adder, we can design circuits that are capable of performing simple
addition with the help of logic gates.
Let us first take a look at the addition of single bits.
0+0 = 0
0+1 = 1
1+0 = 1
1+1 = 10
These are all the possible single-bit combinations. The result for 1+1 is 10, which needs two
bits. An EXOR gate on its own produces only the sum bit, so if we care about the complete
output, the sum must be rewritten as a 2-bit result.
Thus the above equations can be written as
0+0 = 00
0+1 = 01
1+0 = 01
1+1 = 10
Here the ‘1’ of ‘10’ becomes the carry-out. The result is shown in the truth table below.
‘SUM’ is the normal output and ‘CARRY’ is the carry-out.
INPUTS      OUTPUTS
A    B      SUM    CARRY
0    0      0      0
0    1      1      0
1    0      1      0
1    1      0      1
From the truth table, it is clear that this 1-bit adder can be easily implemented with an
EXOR gate for the output ‘SUM’ and an AND gate for the carry. Take a look at the
implementation below.
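In software terms, the same logic can be sketched as follows: Python's `^` plays the role of the EXOR gate and `&` the AND gate. The function name is illustrative.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (SUM, CARRY) for two input bits."""
    return a ^ b, a & b   # EXOR gives the sum bit, AND gives the carry


for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, s, c)
```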
Full Adder
This type of adder is a little more difficult to implement than a half adder. The main difference
between a half adder and a full adder is that the full adder has three inputs and two outputs.
The first two inputs are A and B and the third input is an input carry designated as CIN. Once
the full adder logic is designed, we can string eight of them together to create a byte-wide
adder, cascading the carry bit from one adder to the next.
The output carry is designated as COUT and the normal output is designated as S. Take a look at
the truth table.
INPUTS           OUTPUTS
A    B    CIN    COUT   S
0    0    0      0      0
0    0    1      0      1
0    1    0      0      1
0    1    1      1      0
1    0    0      0      1
1    0    1      1      0
1    1    0      1      0
1    1    1      1      1
From the above truth table, the full adder logic can be implemented. We can see that the output
S is an EXOR between the input A and the half-adder SUM output with B and CIN inputs. We
must also note that COUT will be true only if at least two of the three inputs are HIGH.
Thus, we can implement a full adder circuit with the help of two half adder circuits. The first
half adder is used to add A and B to produce a partial sum. The second half adder
is used to add CIN to the sum produced by the first half adder to get the final S output. If
either half adder produces a carry, there will be an output carry. Thus, COUT will be an
OR function of the two half-adder CARRY outputs.
Using a ripple carry adder, this addition is carried out as shown by the following logic
diagram:
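As a rough software counterpart, the sketch below builds a full adder from two half adders and an OR, then chains full adders bit by bit the way a ripple carry adder does. The function names and the 8-bit default width are assumptions made for illustration.

```python
def half_adder(a, b):
    return a ^ b, a & b                     # (SUM, CARRY)


def full_adder(a, b, cin):
    """Return (COUT, S), matching the column order of the truth table."""
    partial, carry1 = half_adder(a, b)      # first half adder: A + B
    s, carry2 = half_adder(partial, cin)    # second half adder: partial sum + CIN
    return carry1 | carry2, s               # COUT is the OR of the two carries


def ripple_carry_add(x, y, width=8):
    """Add two unsigned integers one bit at a time; return (sum, carry_out)."""
    carry, result = 0, 0
    for i in range(width):                  # least significant bit first
        carry, s = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


print(full_adder(1, 1, 1))         # (1, 1)
print(ripple_carry_add(25, 226))   # (251, 0)
```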
The multiplier and multiplicand bits are loaded into the two registers Q and M respectively.
A third register A is initially set to zero.
C is a 1-bit register which holds the carry bit resulting from addition.
Now, the control logic reads the bits of the multiplier one at a time. If Q0 is 1, the
multiplicand is added to register A and the result is stored back in register A, with the C bit
used for the carry.
Then all the bits of C, A and Q are shifted to the right by 1 bit, so that the C bit goes to
A_(n-1), A0 goes to Q_(n-1) and Q0 is lost. If Q0 is 0, no addition is performed; only the shift is done.
The process is repeated for each bit of the original multiplier. The resulting 2n-bit
product is contained in the A and Q registers (A holds the upper half).
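The register-level description above can be sketched as follows for unsigned n-bit operands. Variable names mirror the registers C, A, Q and M; the function name is illustrative.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned shift-and-add multiplication with n-bit registers."""
    mask = (1 << n) - 1
    M = multiplicand & mask
    Q = multiplier & mask
    A = 0
    C = 0
    for _ in range(n):
        if Q & 1:                       # Q0 is 1: add the multiplicand to A
            total = A + M
            C = total >> n              # carry out of the n-bit addition
            A = total & mask
        # Shift C, A, Q right by one bit: C -> A(n-1), A0 -> Q(n-1), Q0 is lost.
        Q = (Q >> 1) | ((A & 1) << (n - 1))
        A = (A >> 1) | (C << (n - 1))
        C = 0
    return (A << n) | Q                 # 2n-bit product, A is the upper half


print(shift_add_multiply(13, 11, 4))  # 143
```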
The multiplier and multiplicand are placed in the Q and M registers respectively. There is also
a one-bit register placed logically to the right of the least significant bit Q0 of the Q register
and designated as Q-1.
The result of the multiplication will appear in the A and Q registers.
A and Q-1 are initialized to zero. If the two bits (Q0 and Q-1) are the same (11 or 00), then all
the bits of the A, Q and Q-1 registers are shifted to the right by 1 bit.
If the two bits differ, then the multiplicand is added to or subtracted from the A register,
depending on whether the two bits are 01 or 10 respectively.
Following the addition or subtraction, the arithmetic right shift occurs. When the count
reaches zero, the result resides in AQ in the form of a signed integer
[-2^(n-1)×a_(n-1) + 2^(n-2)×a_(n-2) + … + 2^1×a_1 + 2^0×a_0].
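The procedure above (Booth's algorithm) might be sketched like this. The function name is illustrative, and the sketch assumes the multiplicand is not the most negative n-bit value, since negating it would overflow the n-bit A register.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Signed multiplication of two n-bit 2's complement numbers."""
    mask = (1 << n) - 1
    M = multiplicand & mask
    Q = multiplier & mask
    A = 0
    q_minus1 = 0
    for _ in range(n):
        pair = ((Q & 1) << 1) | q_minus1      # the bits Q0 and Q-1
        if pair == 0b01:
            A = (A + M) & mask                # 01: add the multiplicand
        elif pair == 0b10:
            A = (A - M) & mask                # 10: subtract the multiplicand
        # Arithmetic right shift of A, Q, Q-1 (sign bit of A is preserved).
        q_minus1 = Q & 1
        Q = (Q >> 1) | ((A & 1) << (n - 1))
        A = (A >> 1) | (A & (1 << (n - 1)))
    product = (A << n) | Q
    if product & (1 << (2 * n - 1)):          # interpret AQ as a signed 2n-bit value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(7, 3, 4))    # 21
print(booth_multiply(-7, 3, 4))   # -21
```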
Division is somewhat more complex than multiplication but is based on the same general
principles. The operation involves repetitive shifting and addition or subtraction.
The bits of the dividend are examined from left to right, until the set of bits examined
represents a number greater than or equal to the divisor; this is referred to as the divisor
being able to divide the number.
Until this event occurs, 0s are placed in the quotient from left to right. When the event
occurs, a 1 is placed in the quotient and the divisor is subtracted from the partial
dividend. The result is referred to as a partial remainder.
The division follows a cyclic pattern. At each cycle, additional bits from the dividend are
appended to the partial remainder until the result is greater than or equal to the divisor.
The divisor is subtracted from this number to produce a new partial remainder. The
process continues until all the bits of the dividend are exhausted.
Algorithm:
Step 1: Initialize the A register to zero, load the dividend into Q and the divisor into M, and
set the counter to n, where n is the number of bits in the dividend.
Step 2: Shift A, Q left by one binary position.
Step 3: Subtract M from A, placing the answer back in A. If the sign of A is 1, set Q0 to 0 and
add M back to A (restore A). If the sign of A is 0, set Q0 to 1.
Step 4: Decrease the counter; if counter > 0, repeat the process from Step 2, else stop. The
final remainder will be in A and the quotient will be in Q.
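A minimal sketch of this restoring division algorithm for unsigned operands follows. The function name is illustrative. Because Python integers are unbounded, the "sign of A" test is written as `A < 0`.

```python
def restoring_divide(dividend: int, divisor: int, n: int) -> tuple[int, int]:
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")
    mask = (1 << n) - 1
    A, Q, M = 0, dividend & mask, divisor        # Step 1
    for _ in range(n):                           # counter runs from n down to 1
        # Step 2: shift A,Q left one position (top bit of Q moves into A).
        A = (A << 1) | ((Q >> (n - 1)) & 1)
        Q = (Q << 1) & mask
        # Step 3: trial subtraction.
        A -= M
        if A < 0:
            A += M                               # restore A; Q0 stays 0
        else:
            Q |= 1                               # subtraction succeeded; Q0 = 1
    return Q, A                                  # Step 4: quotient in Q, remainder in A


print(restoring_divide(7, 3, 4))  # (2, 1)
```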
In computers, floating-point numbers are represented in scientific notation with a fraction (F)
and an exponent (E) with a radix of 2, in the form F×2^E. Both E and F can be positive as well as
negative. Modern computers adopt the IEEE 754 standard for representing floating-point numbers.
There are two representation schemes: 32-bit single-precision and 64-bit double-precision.
There are three parts in the floating-point representation:
The sign bit (S) is self-explanatory (0 for positive numbers and 1 for negative numbers).
For the exponent (E), a so-called bias (or excess) is applied so as to represent both
positive and negative exponent. The bias is set at half of the range. For single precision
Normalized Form
In normalized form, the radix point is placed after the first non-zero digit, e.g.,
9.8765D×10^-23D, 1.001011B×2^11B. For a binary number, the leading bit is always 1
and need not be represented explicitly. This saves 1 bit of storage.
In IEEE 754's normalized form:
For single-precision, 1 ≤ E ≤ 254 with excess of 127. Hence, the actual exponent is from
-126 to +127. Negative exponents are used to represent small numbers (< 1.0), while
positive exponents are used to represent large numbers (> 1.0).
N = (-1)^S × 1.F × 2^(E-127)
For double-precision, 1 ≤ E ≤ 2046 with excess of 1023. The actual exponent is from
-1022 to +1023, and
N = (-1)^S × 1.F × 2^(E-1023)
Example:
Suppose that the 32-bit pattern is 1 10000001 01100000000000000000000, with:
S=1
E = 10000001
F = 011 0000 0000 0000 0000 0000
In the normalized form, the actual fraction is normalized with an implicit leading 1 in the form
of 1.F.
Fraction Value: 1.011 0000 0000 0000 0000 0000 = 1 + 1×2^-2 + 1×2^-3 = 1.375
Sign bit S=1
In normalized form, the actual exponent is E-127 (so-called excess-127 or bias-127). This is
because we need to represent both positive and negative exponent. With an 8-bit E, ranging
from 0 to 255, the excess-127 scheme could provide actual exponent of -127 to 128.
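To finish the example numerically: E = 10000001B = 129, so the actual exponent is 129 - 127 = 2 and the value is -1.375 × 2^2 = -5.5. The sketch below applies N = (-1)^S × 1.F × 2^(E-127) to a bit string. The function name is illustrative, and only normalized patterns (1 ≤ E ≤ 254) are handled.

```python
def decode_single_normalized(bits: str) -> float:
    """Decode a 32-bit IEEE 754 pattern given as a string of 0s and 1s."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    s = int(bits[0])            # sign bit
    e = int(bits[1:9], 2)       # 8-bit biased exponent
    f = int(bits[9:], 2)        # 23-bit fraction field
    if not 1 <= e <= 254:
        raise ValueError("not a normalized number")
    significand = 1 + f / 2**23                 # implicit leading 1
    return (-1) ** s * significand * 2.0 ** (e - 127)


print(decode_single_normalized("1 10000001 01100000000000000000000"))  # -5.5
```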
De-Normalized Form
Normalized form has a serious problem: with an implicit leading 1 for the fraction, it cannot
represent the number zero! De-normalized form was devised to represent zero and other
numbers.
For E=0, the numbers are in the de-normalized form. An implicit leading 0 (instead of 1) is used
for the fraction; and the actual exponent is always -126. Hence, the number zero can be
represented with E=0 and F=0 (because 0.0×2^-126=0).
We can also represent very small positive and negative numbers in de-normalized form with
E=0.
For example, suppose S=1, E=0, and F=011 0000 0000 0000 0000 0000. The actual fraction is
0.011B = 1×2^-2 + 1×2^-3 = 0.375D. Since S=1, it is a negative number. With E=0, the actual
exponent is -126. Hence the number is -0.375×2^-126 ≈ -4.4×10^-39, which is an extremely
small negative number (close to zero).
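The de-normalized rule can be checked with a short sketch. It computes 0.F × 2^-126 by hand and compares the result with what Python's standard `struct` module gives for the same bit pattern. The function name is illustrative.

```python
import struct


def decode_denormalized_single(bits: str) -> float:
    """Decode a single-precision pattern whose exponent field E is 0."""
    bits = bits.replace(" ", "")
    s, e, f = int(bits[0]), int(bits[1:9], 2), int(bits[9:], 2)
    if e != 0:
        raise ValueError("exponent field must be all zeros")
    # Implicit leading 0 and a fixed actual exponent of -126.
    return (-1) ** s * (f / 2**23) * 2.0 ** -126


pattern = "1 00000000 01100000000000000000000"
manual = decode_denormalized_single(pattern)

# Cross-check: reinterpret the same 32 bits as a float.
word = int(pattern.replace(" ", ""), 2)
(reinterpreted,) = struct.unpack(">f", struct.pack(">I", word))

print(manual)                   # roughly -4.4e-39
print(manual == reinterpreted)  # True
```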
Example:
What is the decimal value of this single-precision float?
Solution:
Sign = 1 (negative)
Exponent field = (0111 1100)2 = 124
Actual exponent = E - Bias = 124 - 127 = -3
Significand = (1.01000…0000)2 = 1 + 2^-2 = 1.25 (the leading "1." is implicit)
Value in decimal = -1.25 × 2^-3 = -0.15625
Example:
Convert -0.8125 to binary single-precision floating-point representation.
Solution:
Fraction bits can be obtained using repeated multiplication by 2:
0.8125 × 2 = 1.625
0.625 × 2 = 1.25
0.25 × 2 = 0.5
0.5 × 2 = 1.0
Reading the integer parts from top to bottom gives 0.8125 = (0.1101)2.
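The remaining steps are to normalize (0.1101)2 to (1.101)2 × 2^-1 and add the bias, giving an exponent field of -1 + 127 = 126. A rough sketch of the whole conversion follows. Both function names are illustrative, and the sketch assumes a non-zero value with magnitude below 1 and truncates (rather than rounds) any bits beyond the 23-bit field.

```python
def fraction_bits(x: float, max_bits: int = 23) -> str:
    """Binary digits of a fraction 0 <= x < 1 by repeated multiplication by 2."""
    bits = []
    while x and len(bits) < max_bits:
        x *= 2
        bit = int(x)          # the integer part is the next binary digit
        bits.append(str(bit))
        x -= bit
    return "".join(bits)


def encode_single(value: float) -> tuple[str, str, str]:
    """Return (sign, exponent, fraction) bit strings for 0 < |value| < 1."""
    sign = "1" if value < 0 else "0"
    digits = fraction_bits(abs(value), max_bits=64)
    shift = digits.index("1") + 1                   # places to move the radix point
    exponent = -shift                               # actual exponent after normalizing
    fraction = digits[shift:].ljust(23, "0")[:23]   # drop the implicit leading 1
    return sign, format(exponent + 127, "08b"), fraction


print(fraction_bits(0.8125))    # 1101
print(encode_single(-0.8125))   # ('1', '01111110', '10100000000000000000000')
```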
Example:
What is the decimal value of this Double Precision float?
Solution:
Sign = 0 (positive)
Value of exponent = (1000 0000 101)2 - Bias
= 1029 - 1023
= 6
Value of double float = (1.00101010…0)2 × 2^6 (the leading "1." is implicit)
= (1001010.10…0)2
= 74.5
Example:
What is the decimal value of Double precision float?
Solution:
Sign = 1 (negative)
Value of exponent = (0111 1111 000)2 - Bias
= 1016 - 1023 = -7
Value of double float = -(1.10000…000)2 × 2^-7 (the leading "1." is implicit)
= -(0.0000001100000…000)2
= -0.01171875
Example:
Convert -0.8125 to binary double-precision floating-point representation.
Solution:
Fraction bits can be obtained using repeated multiplication by 2:
0.8125 × 2 = 1.625
0.625 × 2 = 1.25
0.25 × 2 = 0.5
0.5 × 2 = 1.0
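Since Python's own `float` is an IEEE 754 double on common platforms, the double-precision examples can be cross-checked by splitting a value into its three fields with the standard `struct` module. The helper name `double_fields` is illustrative.

```python
import struct


def double_fields(value: float) -> tuple[str, str, str]:
    """Return (sign, 11-bit exponent, 52-bit fraction) of a double as bit strings."""
    (word,) = struct.unpack(">Q", struct.pack(">d", value))
    bits = format(word, "064b")
    return bits[0], bits[1:12], bits[12:]


sign, exponent, fraction = double_fields(-0.8125)
print(sign, exponent, fraction[:8] + "...")   # 1 01111111110 10100000...
print(int(exponent, 2) - 1023)                # -1
```

The same helper reproduces the exponent fields 10000000101 for 74.5 and 01111111000 for -0.01171875 used in the two decoding examples.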
Special Values
Zero: Zero cannot be represented in the normalized form, and must be represented in
denormalized form with E=0 and F=0. There are two representations for zero: +0 with
S=0 and -0 with S=1.
Infinity: The value of +infinity (e.g., 1/0) and -infinity (e.g., -1/0) are represented with an
exponent of all 1's (E = 255 for single-precision and E = 2047 for double-precision), F=0,
and S=0 (for +INF) and S=1 (for -INF).
When the exponent bits are all ones and the fraction bits are all 0 then the resulting value
represents infinity.
Not a Number (NaN): NaN denotes a value that cannot be represented as real number
(e.g. 0/0). NaN is represented with Exponent of all 1's (E = 255 for single-precision and E
= 2047 for double-precision) and any non-zero fraction.
When exponent bits are all ones but the fraction value is non zero then the resulting
value is said to be NaN which is short for Not a Number. You get this value when you
perform invalid operations like dividing zero by zero, subtracting infinity from infinity
etc…
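The bit patterns of these special values can be inspected with a short sketch. The helper name is illustrative. Note that plain Python raises `ZeroDivisionError` for `1/0` instead of returning infinity, so `math.inf` is used here, and NaN is produced by subtracting infinity from infinity.

```python
import math
import struct


def single_fields(value: float) -> tuple[str, str, str]:
    """Return (sign, 8-bit exponent, 23-bit fraction) of a single-precision value."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(word, "032b")
    return bits[0], bits[1:9], bits[9:]


nan = math.inf - math.inf          # an invalid operation yields NaN
for name, value in [("+0", 0.0), ("-0", -0.0),
                    ("+INF", math.inf), ("-INF", -math.inf), ("NaN", nan)]:
    s, e, f = single_fields(value)
    print(f"{name:5} S={s} E={e} F={f}")

print(0.0 == -0.0)   # True: the two zeros compare equal
print(nan == nan)    # False: NaN is not equal to anything, including itself
```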
FLOATING-POINT ARITHMETIC
Floating point arithmetic (Addition, Subtraction, Multiplication, Division) :
For addition and subtraction, it is necessary to ensure that both operands have the same
exponent value. This may require shifting the radix point on one of the operands to achieve
alignment. Multiplication and division are more straightforward. A floating-point operation
may produce one of these conditions:
Exponent overflow: A positive exponent exceeds the maximum possible exponent value.
In some systems, this may be designated as +infinity or –infinity
Exponent underflow: A negative exponent is less than the minimum possible exponent
value (e.g., is less than). This means that the number is too small to be represented, and it
may be reported as 0.
Significand underflow: In the process of aligning significands, digits may flow off the
right end of the significand. As we shall discuss, some form of rounding is required.
Significand overflow: The addition of two significands of the same sign may result in a
carry out of the most significant bit. This can be fixed by realignment, as we shall explain.
Practice Problems:
Find the difference of the given two floating-point values (single precision):
+1.0000 0000 1011 0001 0001 101 × 2^-6
-1.0000 0000 0000 0001 0011 010 × 2^-1
Multiplication:
Solution:
1. Add the exponents of the operands
Exponent = -4-2 = -6
2. The biased representation:
B.E = -6+127 (Single precision)
Biased Exponent = 121
3. Sign bit of the product can be computed independently.
Sign bit of product = 1 XOR 0 = 1 (Negative)
4. Multiply the significands:
(Multiplicand) 1.110 1000 0100 0000 1010 0001
(Multiplier)   1.100 0000 0001 0000 0000 0000
(Product)      10.1011100011111011111100110010100001000000000000
5. Normalize the product:
-10.1011100011111011111100110010100001000000000000 × 2^-6
Shift right and increment the exponent because of the carry bit:
= -1.0101110001111101111110011001010000100000000000 × 2^-5
6. Round to nearest even (keep only 23 fraction bits):
1.0101 1100 0111 1101 1111 100 | 1 100… × 2^-5
Round bit = 1, Sticky bit = 1, so round up
Final result = -1.0101 1100 0111 1101 1111 101 × 2^-5
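Steps 4 to 6 can be reproduced with integer arithmetic. The sketch below treats each 24-bit significand (implicit 1 plus 23 fraction bits) as an integer, multiplies, normalizes on a carry, and rounds to nearest even. The function name is illustrative, and the starting exponent -6 is taken from step 1 of the solution.

```python
def normalize_and_round(product: int, exponent: int, frac_bits: int = 23):
    """Normalize a significand product and round it to frac_bits fraction bits.

    `product` is the integer product of two (frac_bits + 1)-bit significands,
    so it carries 2 * frac_bits fraction bits and represents a value in [1, 4).
    """
    shift = 2 * frac_bits
    if product >> (shift + 1):           # value >= 2: shift right, bump exponent
        shift += 1
        exponent += 1
    drop = shift - frac_bits             # number of low bits to discard
    kept = product >> drop
    remainder = product & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    if kept >> (frac_bits + 1):          # rounding overflowed to 10.000...
        kept >>= 1
        exponent += 1
    return kept, exponent


multiplicand = int("111010000100000010100001", 2)   # 1.110 1000 0100 0000 1010 0001
multiplier = int("110000000001000000000000", 2)     # 1.100 0000 0001 0000 0000 0000
product = multiplicand * multiplier
print(format(product, "048b"))   # 101011100011111011111100110010100001000000000000

kept, exponent = normalize_and_round(product, -6)
print(format(kept, "024b"), exponent)   # 101011100011111011111101 -5
```

The first printed bit is the integer part of the significand and the remaining 23 bits are the fraction, matching the final result above.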
Division:
1. Test for 0:
• If the divisor is 0, report an error.
• If the dividend is 0, the result is 0.
2. Subtract the divisor exponent from the dividend exponent.
3. Divide the significands.
4. Normalize the result.