
Primitive Data Types

Dr. S. S. Shehaby

Primitive Data Types
• A data type is a classification of data that determines which kind of information a value can store.
• Primitive data types are predefined types that are supported by the programming language; operations on primitive data types are mostly implemented by hardware.
• Examples:
  • Character (character, char); some languages, like C, treat characters as integers.
  • Integer (integer, int, short, long, byte) with a variety of precisions.
  • Floating-point number (float, double, real, double precision).
  • Fixed-point number (fixed) with a variety of precisions and a programmer-selected scale.
  • Boolean: the logical values true and false.
  • Reference (also called a pointer or handle): a small value referring to another object's address in memory, possibly a much larger one.
• Non-primitive data types are not defined by the programming language, but are instead created by the programmer.

Integer Representation

Representation               Notation and alphabet
IIII IIII IIII IIII IIII II  Sticks; unary code (analog representation)
Twenty Seven                 Alphabet: 26 letters
XXVII                        Alphabet: {I, V, X, L, C, ...}
27                           Arabic decimal; alphabet {0, 1, 2, ..., 9}
11011                        Arabic radix 2; alphabet {0, 1}

Integers (Radix b)
• k digits, radix b: $x_{k-1}\, x_{k-2} \dots x_1\, x_0$
• Min: 0 0 ... 0 0
• Max: (b-1)(b-1) ... (b-1)(b-1)
  • b = 10: 9 9 ... 9 9 = $10^k - 1$
  • any b: $b^k - 1$
• Range = $[0,\ b^k - 1]$, cardinality $|Range| = b^k$; if b = 2: $2^k$

Signed Integers (Sign Included)
• k digits, radix b: one position holds the sign and the remaining k-1 digits hold the magnitude: $\pm\, x_{k-2}\, x_{k-3} \dots x_1\, x_0$
• Min: $-(b-1) \dots (b-1)(b-1) = -(b^{k-1} - 1)$
• Max: $+(b-1) \dots (b-1)(b-1) = b^{k-1} - 1$
  • b = 10: + 9 ... 9 9 = $10^{k-1} - 1$
  • any b: $b^{k-1} - 1$
• Range = $[-(b^{k-1} - 1),\ b^{k-1} - 1]$
• Cardinality: $|Range| = 2\, b^{k-1} - 1$ (zero is counted once); if b = 2: $|Range| = 2^k - 1$

Complement Operations for m-Digit, Base-b Numbers
• The radix complement of an m-digit number X in base b is: $RC(X) = (b^m - X) \bmod b^m$.
  • 10's complement of 0123 = (10000 - 123) mod 10000 = 9877
  • 10's complement of 0000 = (10000 - 0000) mod 10000 = 0
  • 2's complement of 0111 = (10000 - 0111) mod 10000 = 1001 (all numbers in binary)
• $RC(X) + X = b^m \equiv 0 \pmod{b^m}$
• In arithmetic modulo $b^m$, RC(X) represents -X.
• To simplify the computation: $RC(X) = (1 + DC(X)) \bmod b^m$, where DC(X) is the diminished complement of X, defined as $DC(X) = (b^m - 1) - X$.
  • 9's complement of 0123 = 9999 - 123 = 9876
  • 9's complement of 0000 = 9999 - 0000 = 9999
  • 1's complement of 0111 = 1111 - 0111 = 1000 (binary)

Complement Operations for m-Digit, Base-b Numbers
• 9's complement of 1256 = 9999 - 1256 = 8743
• 10's complement = 8744 (m = 4)
• 8744 + 1256 = 10000 = 0 (m = 4)
• In general: $x + x^c = b^m \equiv 0 \pmod{b^m}$

4-Digit Binary Numbers (Two's Complement)

Decimal  Binary  Negative  Two's complement
0        0000
1        0001    (-1)      1111
2        0010    (-2)      1110
3        0011    (-3)      1101
4        0100    (-4)      1100
5        0101    (-5)      1011
6        0110    (-6)      1010
7        0111    (-7)      1001
8                (-8)      1000

n Binary Digits, Two's Complement
• Range: $[-2^{n-1},\ 2^{n-1} - 1]$
• $|Range| = 2^n$
• Subtraction can be performed as the addition of one number and the two's complement of the other.
• Zero has a unique representation (the two's complement of 0 is 0, so there is no separate -0).

Data Types Represented as Integers
• Characters: encodings such as ASCII, UTF-8, UTF-16, etc.
• Enumerations, like [Saturday, Sunday, ..., Friday]
• Sampled signals (sound, sensor data, ...) after quantization
• Color pixels in images
• And others ...

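Some Coding

    #include <stdio.h>

    /* Print the 8 bits of x, most significant bit first. */
    void printByte(unsigned char x){
        unsigned char i, mask = 0x80;   /* 1<<7, 128 */
        for (i = 0; i < 8; i++, mask >>= 1)
            if (mask & x) printf("1"); else printf("0");
        printf(" ");
    }

    /* Print n bytes starting at pt, from the highest address down to the
       lowest: first in hex, then in binary. On a little-endian machine
       this shows the most significant byte first. */
    void printBytesBigEndian(unsigned char *pt, int n){
        n--;
        int i = n;
        while (i >= 0) printf("%02x ", *(pt + i--));
        printf(": ");
        while (n >= 0) printByte(*(pt + n--));
        printf("\n");
    }
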
Some coding

    int i, x = 256 + 7;
    printf("integers %d (%x) %d (%x):\n", x, x, -x, -x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    x = -x;
    printBytesBigEndian((unsigned char *) &x, sizeof(x));

    printf("Min Max !\n");
    unsigned char *pt = (unsigned char *) &x;
    /* 0x7fffffff: bytes are written lowest address first */
    *pt++ = 0xff; *pt++ = 0xff; *pt++ = 0xff; *pt++ = 0x7f;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    /* 0xffffffff */
    pt--; *pt = 0xff;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    /* 0x80000000 */
    pt = (unsigned char *) &x;
    *pt++ = 0; *pt++ = 0; *pt++ = 0; *pt++ = 0x80;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));

Some coding

    int i, x = 256 + 7;
    printf("integers %d (%x) %d (%x):\n", x, x, -x, -x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    x = -x;
    printBytesBigEndian((unsigned char *) &x, sizeof(x));

Output (byte addresses, left to right: pt+3, pt+2, pt+1, pt = &x):

    integers 263 (107) -263 (fffffef9):
    00 00 01 07 : 00000000 00000000 00000001 00000111
    ff ff fe f9 : 11111111 11111111 11111110 11111001

Some coding

    printf("Min Max !\n");
    unsigned char *pt = (unsigned char *) &x;
    *pt++ = 0xff; *pt++ = 0xff; *pt++ = 0xff; *pt++ = 0x7f;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    pt--; *pt = 0xff;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));
    pt = (unsigned char *) &x;
    *pt++ = 0; *pt++ = 0; *pt++ = 0; *pt++ = 0x80;
    printf("%d (%x):", x, x);
    printBytesBigEndian((unsigned char *) &x, sizeof(x));

Output (byte addresses, left to right: pt+3, pt+2, pt+1, pt = &x):

    Min Max !
    2147483647 (7fffffff):7f ff ff ff : 01111111 11111111 11111111 11111111
    -1 (ffffffff):ff ff ff ff : 11111111 11111111 11111111 11111111
    -2147483648 (80000000):80 00 00 00 : 10000000 00000000 00000000 00000000

Real Numbers: Fixed Point

Layout: 1 (sign) | n digits (integer part) | m digits (fractional part)

• The number is represented as 3 fields: s for the sign, n digits for the integer part, and m digits for the fraction.
• Total storage: n + m + 1
• Max number = $b^n - b^{-m}$
• Range: $[-(b^n - b^{-m}),\ b^n - b^{-m}]$
• Precision: m + n digits (radix b)
• Maximum chopping error: $b^{-m}$ (constant)
• RCE (relative chopping error) of a number X = error / X
• Example with b = 10: x = 3333.333 (n = 4, m = 3)
  • precision 7, range [-9999.999, 9999.999], chopping error < 0.001
• Max relative chopping error of x = $b^{-m} / x$

[Figure: max chopping error and max RCE versus x]

Floating Point
• The number is represented as 3 fields: s is the sign, e is the exponent, and f is the fraction (significand or mantissa).
• Value = $(-1)^s \times f \times 2^e$

Floating Point Normalization
• Decimal: $222\ (\times 10^0) = 22.2 \times 10^1 = 0.222 \times 10^3$
• Standard form: 1 ≤ fraction < base ($2.22 \times 10^2$)
• Binary: 1011 -> $1.011 \times 2^3$ is the standard form
• 0.1011 -> $1.011 \times 2^{-1}$ is the standard form
• So if n bits are physically saved, they logically represent (n+1) bits, because for radix 2 the normalized fraction always starts with "1."

Max Exp = $2^{m_e - 1} - 1$          Max Fraction = $2 - 2^{-m_f}$
Min Exp = $-(\text{Max Exp} - 1)$    Min Fraction = 1.0

The IEEE Floating-Point Standard
IEEE 754-2008 Standard (supersedes IEEE 754-1985). It also includes half- and quad-word binary formats, plus some decimal formats.

Format          Sign  Exponent                                   Significand
Short (32-bit)  s     8 bits, bias = 127, range -126 to 127      23 bits for fractional part (plus hidden 1 in integer part)
Long (64-bit)   s     11 bits, bias = 1023, range -1022 to 1023  52 bits for fractional part (plus hidden 1 in integer part)

[Figure: the IEEE standard floating-point number representation formats]

The representation must support:
• ZERO
• NAN
• INFINITY (why [-126, 127]?)

Extended precision

Explain

    float y;   //=1.75;
    y = 1.75/2;
    for (i = 0; i < 5; i++, y *= 2.0) {
        printf("float %8.3f ::: ", y);
        printBytesBigEndian((unsigned char *) &y, sizeof(y));
    }

First line of output (byte addresses, left to right: pt+3, pt+2, pt+1, pt = &y):

    float 0.875 ::: 3f 60 00 00 : 00111111 01100000 00000000 00000000

• Value: $0.875 = 1.75/2 = 1.75 \times 2^{-1}$
• Sign: + (sign bit 0)
• Exponent field = $(01111110)_2 = 126$, so the exponent is 126 - 127 = -1
• Mantissa = $1.1100000\ldots_2 = 1.75$ (normalized!)

Explain
float 0.875 ::: 3f 60 00 00 : 00111111 01100000 00000000 00000000
float 1.750 ::: 3f e0 00 00 : 00111111 11100000 00000000 00000000
float 3.500 ::: 40 60 00 00 : 01000000 01100000 00000000 00000000
float 7.000 ::: 40 e0 00 00 : 01000000 11100000 00000000 00000000
float 14.000 ::: 41 60 00 00 : 01000001 01100000 00000000 00000000

Explain: Max/Min

    float y;
    unsigned char *pt = (unsigned char *) &y;
    /* 0x7f7fffff: exponent field 254, all fraction bits set */
    *pt++ = 0xff; *pt++ = 0xff; *pt++ = 0x7f; *pt = 0x7f;
    printf("ABS MAX: 2*powf(2,127)=%e %e\n", 2*powf(2,127), y);
    printBytesBigEndian((unsigned char *) &y, sizeof(y));

    y = 1*powf(2,-126);
    printf("ABS MIN: 1*powf(2,-126)=%e\n", y);
    printBytesBigEndian((unsigned char *) &y, sizeof(y));

Output (byte addresses, left to right: pt+3, pt+2, pt+1, pt = &y):

    ABS MAX: 2*powf(2,127)=3.402824e+038 3.402824e+038
    7f 7f ff ff : 01111111 01111111 11111111 11111111
    ABS MIN: 1*powf(2,-126)=1.175494e-038
    00 80 00 00 : 00000000 10000000 00000000 00000000

• AbsMax: exponent field 254, so the exponent is 254 - 127 = 127
• AbsMin: exponent field 1, so the exponent is 1 - 127 = -126
• Number line: -AbsMax ... -AbsMin, 0, AbsMin ... AbsMax
• Questions: zero representation? NAN? INF?

FP Overflow & Underflow
• Fixed-sized representation leads to limitations.
• Overflow: large positive exponent. Unlike integer arithmetic, overflow gives an imprecise result (±∞), not an inaccurate result.
• Underflow: large negative exponent. The result is rounded to zero.

[Figure: number line, left to right: negative overflow (round to -∞) | expressible negative values | negative underflow (round to zero) | zero | positive underflow (round to zero) | expressible positive values | positive overflow (round to +∞)]

Machine Epsilon
• Assuming x is represented as a floating-point number with the fraction saved in n bits, it may be proven that:

  Max relative chopping error = $\frac{|x - fl_{Chopping}(x)|}{|x|} \le Radix^{-n} = eps$

• eps is known as the machine epsilon.
• epsilon is the smallest number such that 1 + eps > 1.
• Algorithm to compute machine epsilon:

    epsilon = 1;
    while (1 + 0.5*epsilon > 1)
        epsilon = epsilon / 2;

[Figure: chopping error and max RCE versus x]

Explain (on my MinGW!)

    // Machine Epsilon
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    int main()
    {
        double epsilon = 1.0;
        while ((1 + 0.5*epsilon) > 1)
            epsilon /= 2;
        printf("Machine epsilon=%e\n", epsilon);
        double e = pow(2, -52);
        printf("Machine epsilon=%e\n", e);
    }

Output:

    Machine epsilon=2.220446e-016
    Machine epsilon=2.220446e-016

A More Elegant Way (ignore if you want)

    void eps_float(){
        typedef union {
            unsigned int i;
            float f;
        } myFloat;
        printf("----float: (bits) 1 sign + 8 exp +23 mantissa=%d bytes\n", (int) sizeof(float));
        myFloat m; m.f = 1.0;
        m.i++;                    /* next representable float after 1.0 */
        float x = m.f - 1.0;
        printf("eps_float=%e\n", x);
        printf("log2(eps)=%f\n\n", log2(x));
    }

    eps_float();

Output:

    ----float: (bits) 1 sign + 8 exp +23 mantissa=4 bytes
    eps_float=1.192093e-007
    log2(eps)=-23.000000

A More Elegant Way (ignore if you want)

    void eps_double(){
        typedef union {
            long long i;
            double d;
        } myDouble;
        printf("----double: (bits) 1 sign + 11 exp +52 mantissa=%d bytes\n", (int) sizeof(double));
        myDouble m; m.d = 1.0;
        m.i++;                    /* next representable double after 1.0 */
        double x = m.d - 1.0;
        printf("eps_double=%le\n", x);
        printf("log2(eps)=%f\n\n", log2(x));
    }

    eps_double();

Output:

    ----double: (bits) 1 sign + 11 exp +52 mantissa=8 bytes
    eps_double=2.220446e-016
    log2(eps)=-52.000000

A More Elegant Way (ignore if you want)

    void eps_extended(){
        typedef union {
            long long i;
            long double dd;
        } myDouble;
        printf("----long double: (bits) 1 sign + 15 exp +63 mantissa=%d bytes??? (really 10)!!\n", (int) sizeof(long double));
        myDouble m; m.dd = (long double) 1.0;
        m.i++;
        long double x = m.dd - 1.0;
        printf("eps_exteded=%LE ????\n", x);
        printf("!!! "); printBytesBigEndian((unsigned char *) &m.dd, 10);
        printf("log2(eps)=%f\n", log2(x));
    }

    eps_extended();

Output:

    ----long double: (bits) 1 sign + 15 exp +63 mantissa=12 bytes??? (really 10)!!
    eps_exteded=-0.000000E+000 ????
    !!! 3f ff 80 00 00 00 00 00 00 01 : 00111111 11111111 10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
    log2(eps)=-63.000000

Danger of adding/subtracting a small number with/from a large number
Assuming a 4-decimal-digit mantissa, 8001 + 0.3 will be saved as:
    $8.001 \times 10^{3} + 3.000 \times 10^{-1}$
To add (or subtract), the exponents are aligned and larger registers are used:
    $= 8.001 \times 10^{3} + 0.0003 \times 10^{3}$
    $= 8.0013 \times 10^{3}$
    ----> (rounding)
    $= 8.001 \times 10^{3}$
Possible workarounds:
1) Sort the numbers by magnitude (if they have the same sign) and add the numbers in increasing order.
2) Reformulate the formula algebraically.
Example: instead of 1000 + 999 + 998 + ... + 1, compute 1 + 2 + 3 + ... + 1000.

Associativity does not necessarily hold for floating-point addition (or multiplication)

$$a = 8.567 \times 10^{-1}, \quad b = 1.325 \times 10^{3}, \quad c = -1.325 \times 10^{3}$$
$$a + (b + c) = 8.567 \times 10^{-1}$$
$$(a + b) + c = 1.000 \times 10^{0}$$

The two answers are NOT the same!

Note: in this example, if we simply sort the numbers by magnitude and add them in increasing order, we actually get the worse answer!

A better approach is to analyze the problem algebraically.

Subtraction of two close numbers (Catastrophic Cancellation)

$$3.641 \times 10^{1} - 3.640 \times 10^{1} = 0.001 \times 10^{1}$$

The result will be normalized into $1.0 \times 10^{-2}$.

However, note that the zeros added to the end of the mantissa are not significant.

Note: $1.0 \times 10^{-2}$ implies the error is about ± 0.000999 ≈ 0.001, with a relative chopping error ≈ 0.001 / 0.001 = 100%!

Subtractive Cancellation – Subtraction of two very close numbers

$$x_T = 5.764 \pm \tfrac{1}{2} \times 10^{-4}$$
$$y_T = 5.763 \pm \tfrac{1}{2} \times 10^{-4}$$
$$x_T - y_T = 0.001 \pm 0.0001$$

The error bound is just as large as the estimation of the result!

Subtraction of nearly equal numbers is a major cause of errors!

Avoid subtractive cancellation whenever possible.

Avoiding Subtractive Cancellations

Example 1: when x is large, compute
$$f(x) = \sqrt{x+1} - \sqrt{x}$$
Is there a way to reduce the errors, assuming that we are using the same number of bits to represent numbers?

Answer: one possible solution is via rationalization:
$$f(x) = \left(\sqrt{x+1} - \sqrt{x}\right) \frac{\sqrt{x+1} + \sqrt{x}}{\sqrt{x+1} + \sqrt{x}} = \frac{1}{\sqrt{x+1} + \sqrt{x}}$$

Subtraction of nearly equal numbers
Example 2: compute the roots of $ax^2 + bx + c = 0$ using
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \quad \text{when } b^2 \gg 4ac$$

Solve $x^2 - 26x + 1 = 0$:
$$x_T^{(1)} = \frac{26 + \sqrt{26^2 - 4}}{2} = 13 + \sqrt{168}$$
$$x_T^{(2)} = \frac{26 - \sqrt{26^2 - 4}}{2} = 13 - \sqrt{168}$$

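Example 2 (continued)
Assume a 5-decimal-digit mantissa, $\sqrt{168} \approx 12.961$:
$$x_A^{(1)} = 25.961, \qquad x_A^{(2)} = 13.000 - 12.961 = 0.039$$
Since $|E_{x^{(1)}}| = |E_{x^{(2)}}| \le 0.0005$ and $x_T^{(1)} \approx 25.961$, $x_T^{(2)} \approx 0.0385186$:
$$|\varepsilon_{x^{(1)}}| \le \frac{0.0005}{25.961} \approx 1.9 \times 10^{-5}, \qquad |\varepsilon_{x^{(2)}}| \le \frac{0.0005}{0.0385186} \approx 1.3 \times 10^{-2}$$
$|\varepsilon_{x^{(2)}}| \gg |\varepsilon_{x^{(1)}}|$ implies that one solution is more accurate than the other one.
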
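Example 2 (continued)
Alternatively, a better solution is
$$x_A^{(2)} = 13 - \sqrt{168} = \left(13 - \sqrt{168}\right) \frac{13 + \sqrt{168}}{13 + \sqrt{168}} = \frac{1}{13 + \sqrt{168}} \approx \frac{1}{25.961} \approx 0.038519$$
with $\varepsilon_{x^{(2)}} \approx \varepsilon_{1/25.961} \approx \varepsilon_{x^{(1)}}$,

i.e., instead of computing
$$x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$
we use
$$x = \frac{4ac}{2a\left(-b + \sqrt{b^2 - 4ac}\right)} = \frac{2c}{-b + \sqrt{b^2 - 4ac}}$$
as the solution for the second root.
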
Example 3 Explain: Which is Correct?

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        float xx = 0.000000001;
        printf("%e\n", 1 - cos(xx));
        printf("%e\n", sin(xx)*sin(xx) / (1 + cos(xx)));
    }

Output:

    0.000000e+000
    5.000000e-019

Abstract Data Type
• A set of data values and associated operations that are precisely specified independent of any particular implementation.
• An abstract data type (ADT) is a type (or class) for objects whose behavior is defined by a set of values and a set of operations.
• A kind of data abstraction where a type's internal form is hidden behind a set of access functions. Values of the type are created and inspected only by calls to the access functions. This allows the implementation of the type to be changed without requiring any changes outside the module in which it is defined.

Lists
• A list or sequence (ordered, indexed) is an abstract data type that represents a countable number of ordered values, where the same value may occur more than once.
• Lists may be homogeneous or heterogeneous.
• Basic operations:
  • Create, delete, test full/empty
  • Get head and tail, and hence get an element by index (John McCarthy, 1958)
  • head([1,2,3]) = 1, tail([1,2,3]) = [2,3]; the tail is always a list, which may be [].

Storage Model for Lists
• Contiguous: if homogeneous -> arrays
• Non-contiguous: if homogeneous or heterogeneous
• A string is an array or list of characters (variable length)
  • Either delimited: char * -> 'a','b','c','d','\0'
  • Or length-included: string * -> 4,'a','b','c','d'

The Array Abstract Data Type
• An array is a container which holds a fixed number of logically contiguous, homogeneous items. Associated terms:
  • Element: each item stored in an array is called an element.
  • Index: each location of an element in an array has a numerical index, which is used to identify the element.
• The basic function on arrays is to get the address of element[index].
• If SA is the start address of <type> array[] of N elements, the address of array[i] in bytes is:
    SA + sizeof(<type>)*i
  In C syntax, SA = array = &array[0], and pointer arithmetic scales by the element size, so the address of array[i] is array + i.
  So: *(array+i) = array[i]

Addressing the ith element in N-dimensional arrays
• <type> array[n]: only 2 values are maintained: a pointer to the start (named array) and sizeof(type).
• Address(array[i]) = S + sizeof(type)*i  [= array + i in C]
• The address of a[i][j] in a[N][M] is S + s*(i*M + j), where s = sizeof(type).
  • In C: &a[0][0] + i*M + j
• For an array A[N1][N2]…[Nd] with dimensions N1*N2*N3…*Nd, where Nk (k = 1...d) is the size of dimension k, the index of A[n1][n2]…[nd] is:
  (row-major)    $\sum_{k=1}^{d} n_k \prod_{l=k+1}^{d} N_l$
  (column-major) $\sum_{k=1}^{d} n_k \prod_{l=1}^{k-1} N_l$

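Rectangular vs. Jagged matrices
• int ar[10][5]: rectangular, one block of 10*5*4 bytes
  • address of ar[0][0] = ar
  • address of ar[2][3] = ar + 2*5 + 3 in units of int, i.e. ar + (2*5+3)*4 in bytes
• int *ar[10]: jagged, 10 addresses (40 bytes), each pointing to a separately stored row
  • ar[0] -> ar[0][0], ar[0][1], …
  • ar[2] -> ar[2][0], ar[2][1], …
  • address of ar[2][3] = ar[2] + 3
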
Explain

    int a[][3] = {{1,2,3},{4,5,6},{7,8,9}};
    cout << a[0][4] << endl;
    int *pt;
    pt = (int *)a;
    cout << pt << " " << a << endl;
    cout << *pt << " " << *a << endl;
    cout << *(*a) << endl;
    cout << *(pt+2*3+1) << endl;
    char *ptc; int *pti;
    ptc = (char *)pt;
    pti = (int *)(ptc + (2*3+1)*sizeof(int));
    cout << *(pti) << endl;
    int r1[] = {1,2,3,4};
    int r2[] = {5,6};
    int *ar[] = {r1,r2};   // Jagged
    cout << ar[0][4] << endl;

Output:

    5
    0x28fedc 0x28fedc
    1 0x28fedc
    1
    8
    8
    4273648

Explain Code

    short x[3][3] = {{1,2,0},{3,4,0},{5,6,0}};
    short *y[3];
    y[0] = malloc(3*sizeof(short));   // Jagged array
    y[1] = malloc(2*sizeof(short));
    y[2] = malloc(5*sizeof(short));
    y[2][4] = 777;
    printf("%d\n", y[2][4]);
    printf("%d %d", (int) sizeof(x), (int) sizeof(y));

Output:

    777
    18 12

Thanks
