Real Number Representations

Chapter 4
Real Number Representations

IEEE754 Floating Point (FP)
A floating-point (FP) representation is used to represent real numbers.
• The floating-point representation is encoded in a finite number of
bits.
• The IEEE (Institute of Electrical and Electronics Engineers)

developed the Floating-point Standard 754 to represent the real
numbers.
• It was developed in 1985 to standardize computation among the

various computer manufactures.
• The IEEE 754 dictates the precision, accuracy, and arithmetic

operations that must be implemented in conforming processors.
IEEE754 Floating Point (FP)
Aside
A fixed- point (FXP) representation is used to represent signed integer
numbers.
• The fixed-point representation is encoded in a finite number of bits.
• The IEEE developed the Fixed-point Standard 754 to represent the

Integer numbers.
S | 2’s Complement integer

n-1 n-2 0
Single Precision FXP Number: 32- bit

Double Precision FXP Number: 64- bit
Return
IEEE754 Binary Floating- Point (BFP)
The representation of the IEEE 754 BFP number consists of three parts:
BFP Number: S | EXPONENT(E) | FRACTION (F)
n-1 0
Consider a IEEE 754 BFP number X, it will represented by 3 fields:
i) Sign Sx : is a sign bit and indicates whether the FP number X is positive or

negative, (Sx =0 : means X is positive, Sx =1 :means X is negative).
ii) Exponent Ex: Exponent Ex is used to adjust the position of the binary point
(as opposed to a "decimal" point).
- The number of bits of the exponent field (Ex) depends on the format used.
- The exponent is a signed integer value that can be represented by biased (B).
The bias B is given by

B = 2ne -1 – 1 (4.1)
Where 𝒏𝐞 : is the number of exponent bits in FP format.
Note: Using the biased B is very important to make all exponents Ex in
the BFP representation, to be positive number.
iii) Magnitude Mx: IEEE754 BFP standard also calls the Magnitude
( Mx) to be a Normalized Significand (or Mantissa)
What is the meaning of Normalized Significand Mx ?

It means that the biased exponent Ex is chosen such that the highest
order bit (Integer bit) in the significand (Mx ) is a 1 (except for zero value).
Thus, the normalized significand is represented by

Mx = 1.F with or 𝟏 ≤ 𝑴𝑿 < 𝟐 (4.2)
where
F: is the fraction of the real number and it consists of (f- bits). The
number of bits of F depends on the format used.
F = 𝑓−1 𝑓−2 𝑓−3 … … . . 𝑓−𝑚
Thus the normalized mantissa is
𝑴𝒙 = 𝟏. 𝑭 = 𝟏. 𝒇−𝟏 𝒇−𝟐 𝒇−𝟑 … … . . 𝒇−𝒎 (4.3)
Note: The most significant 1 (integer bit) is hidden bit (i.e. this integer
bit (1) is not stored in IEEE754 BFP registers)
Normalized Representation of IEEE754 BFP Number
S|EXPONENT| FRACTION
𝑆𝑋 𝐸𝑋 𝑭𝒙
The three fields are packed into one word with the order of fields:
Sx , Ex , and 𝑭𝒙 , such that:
𝑋 = (−1)𝑺𝒙 . (1. 𝐹𝑥 ). 2𝑬𝒙 − 𝑩 Normalized Number (4.4)
let 𝒆𝒙 = 𝑬𝒙 − 𝐵 𝒆𝒙 : the unbiased exponent

𝑬𝑿 : the biased exponent
B : Bias (constant value. It is value depends on
the type of standard IEEE754 format to
represent BFP number.
Normalized Number
𝑋 = (−1)𝑺𝒙 . (1. 𝐹). 2𝒆𝒙 (4.5)

IEEE754 Binary Floating Point (BFP) Format
IEEE754 storage format specifies how a BFP number is stored in a
memory and in the registers of BFP unit.
• The IEEE 754 BFP standard defines two basics formats:

a) Single Precision (32- bit)
b) Double Precision (64- bit)
Extended formats for each of these two basics formats are also used.
Figure (4.1) shows the data formats supported by the IEEE 754.
31 30 …. 23 22 0
43 42 31 30 0
63 62 …. 52 51 ……………………… 0
79 78 …. … .. 63 62 ……………………… 0
Fig.(4.1) Data formats supported by the IEEE 754 FP

standard representation.
IEEE754 BFP Formats
SingleShort
Precision Format:
(32-bit) (32- bit)
format
8 bits, 23 bits for fractional part

bias = 127, (plus hidden 1 in integer part)
–126 to 127
Sign Exponent Significand

11 bits,
bias = 1023, 52 bits for fractional part
–1022 to 1023 (plus hidden 1 in integer part)
LongPrecision
Double (64-bit) Format:
format (64-bit)
Special Values
These are the values that are not representable in BFP system, but are
useful for representing ±∞ and Not a Number (NaN).
NaN: is a special value, it is useful for representing undefined results,

such as (0/0) and the square root of negative number, or when variables
are uninitialized.
Example 1: If the biased exponent of X is:

𝑬𝒙 = 𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏 (for Single precision format: 𝑬𝒙 (𝟖 − 𝒃𝒊𝒕) )
𝑬𝒙 = 𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏 (for Double precision format: 𝑬𝒙 (𝟏𝟏 − 𝒃𝒊𝒕) )
Then X is NaN
Special Values
Example 2: Suppose Y is represented in single precision BFP format.
* If all the bits of the biased exponent 𝑬𝒀 are equal 1 and all the fraction
bits (𝑭) are equal 0;
then the number Y will be either −∞ or +∞ depending on the sign 𝑺𝒀 :

Biased Exponent is:
𝑬𝒀 = 𝟏𝟏𝟏𝟏𝟏𝟏𝟏𝟏 (𝑬𝒀 : (𝟖 − 𝒃𝒊𝒕) ) ≡ (𝟐𝟓𝟓)𝟏𝟎
Fraction
𝑭 = 𝟎𝟎𝟎𝟎𝟎𝟎𝟎𝟎 … 𝟎𝟎𝟎𝟎00 (𝑭: (𝟐𝟑 − 𝒃𝒊𝒕) )
Then Y : ±∞
Features of IEEE754 BFP Floating-Point Formats
(B)
X
Exceptions
Five types of exceptions are defined in IEEE 754 BFP Standard. By
default, these exceptions set flags and computations continue. The
exceptions are:
1- Overflow (exponent): occurs when the result is too large to be
represented.
2- Underflow ((exponent): occurs when the nonzero magnitude of the
result is too small to be represented.
3- Division by Zero.
4- Inexact: occurs when infinite- precision result different from FP
number.
5- Invalid: set when a NaN result is produced.

Real Number Representations

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Real Number Representations

Uploaded by

Copyright:

Available Formats

Chapter 4

Real Number Representations

• The IEEE (Institute of Electrical and Electronics Engineers)

• It was developed in 1985 to standardize computation among the

• The IEEE 754 dictates the precision, accuracy, and arithmetic

• The IEEE developed the Fixed-point Standard 754 to represent the

S | 2’s Complement integer

Single Precision FXP Number: 32- bit

i) Sign Sx : is a sign bit and indicates whether the FP number X is positive or

The bias B is given by

What is the meaning of Normalized Significand Mx ?

Thus, the normalized significand is represented by

𝑴𝒙 = 𝟏. 𝑭 = 𝟏. 𝒇−𝟏 𝒇−𝟐 𝒇−𝟑 … … . . 𝒇−𝒎 (4.3)

𝑋 = (−1)𝑺𝒙 . (1. 𝐹𝑥 ). 2𝑬𝒙 − 𝑩 Normalized Number (4.4)

let 𝒆𝒙 = 𝑬𝒙 − 𝐵 𝒆𝒙 : the unbiased exponent

𝑋 = (−1)𝑺𝒙 . (1. 𝐹). 2𝒆𝒙 (4.5)

• The IEEE 754 BFP standard defines two basics formats:

Fig.(4.1) Data formats supported by the IEEE 754 FP

8 bits, 23 bits for fractional part

Sign Exponent Significand

NaN: is a special value, it is useful for representing undefined results,

Example 1: If the biased exponent of X is:

Example 2: Suppose Y is represented in single precision BFP format.

then the number Y will be either −∞ or +∞ depending on the sign 𝑺𝒀 :

You might also like