Professional Documents
Culture Documents
Computer Architecture: Nguyễn Trí Thành
Computer Architecture: Nguyễn Trí Thành
12/10/2010 1
More on Arithmetic for
Computers
12/10/2010 2
Arithmetic for Computers
MIPS instructions for Integers
Operations on floating-point real numbers
Addition and subtraction
Multiplication and division
Dealing with overflow
12/10/2010 3
MIPS Multiplication
Two 32-bit registers for product
HI: most-significant 32 bits
LO: least-significant 32-bits
Instructions
mult rs, rt / multu rs, rt
64-bit product in HI/LO
mfhi rd / mflo rd
Move from HI/LO to rd
Initially dividend
12/10/2010 6
Optimized Divider
12/10/2010 8
MIPS Division
Use HI/LO registers for result
HI: 32-bit remainder
LO: 32-bit quotient
Instructions
div rs, rt / divu rs, rt
No overflow or divide-by-0 checking
Software must perform checks if required
Use mfhi, mflo to access result
12/10/2010 9
Real numbers
Decimal real numbers
13.234 = 1*101+3*100+2*10-1+3*10-2+4*10-3
Binary real numbers
101.111 = 1*22+1*20+1*2-1+1*2-2+1*2-3=5.875
Decimal to binary 0.68*2=1.36 1
4.68=100.? 0.36*2=0.72 0
100.1010111 0.72*2=1.44 1
Just approximately 0.44*2=0.88 0
0.88*2=1.76 1
4.6796875
0.76*2=1.52 1
12/10/2010
0.52*2=1.04 1 10
…
Fixed point real numbers
Representation
A number of bits is used to represent the integral
part
The rest represents the fraction value
The hardware is less costly
The precision is not high
Suitable for some special-purpose embedded
processors
12/10/2010 11
Floating Point
Representation for non-integral numbers
Including very small and very large numbers
Like scientific notation
–2.34 × 1056 normalized
+0.002 × 10–4
not normalized
+987.02 × 109
In binary
±1.xxxxxxx2 × 2yyyy
Types float and double in C
12/10/2010 12
Floating Point Standard
Defined by IEEE Std 754-1985
Developed in response to divergence of
representations
Portability issues for scientific code
Now almost universally adopted
Two representations
Single precision (32-bit)
Double precision (64-bit)
12/10/2010 13
IEEE Floating-Point Format
single: 8 bits single: 23 bits
double: 11 bits double: 52 bits
S Exponent Fraction
12/10/2010 17
Floating-Point Example
Represent –0.75
–0.75 = (–1)1 × 1.12 × 2–1
S=1
Fraction = 1000…002
Exponent = –1 + Bias
Single: –1 + 127 = 126 = 011111102
Double: –1 + 1023 = 1022 = 011111111102
Single: 1011111101000…00
Double: 1011111111101000…00
12/10/2010 18
Floating-Point Example
What number is represented by the single-
precision float
11000000101000…00
S=1
Fraction = 01000…002
Fxponent = 100000012 = 129
x = (–1)1 × (1 + 012) × 2(129 – 127)
= (–1) × 1.25 × 22
= –5.0
12/10/2010 19
Denormal Numbers
Exponent = 000...0 ⇒ hidden bit is 0
S −Bias
x = ( −1) × (0 + Fraction) × 2
Smaller than normal numbers
allow for gradual underflow, with diminishing
precision
12/10/2010 25
FP Adder Hardware
Step 1
Step 2
Step 3
Step 4
12/10/2010 26
Floating-Point Multiplication
Consider a 4-digit decimal example
1.110 × 1010 × 9.200 × 10–5
1. Add exponents
For biased exponents, subtract bias from sum
New exponent = 10 + –5 = 5
2. Multiply significands
1.110 × 9.200 = 10.212 ⇒ 10.212 × 105
3. Normalize result & check for over/underflow
1.0212 × 106
4. Round and renormalize if necessary
1.021 × 106
5. Determine sign of result from signs of operands
+1.021 × 106
12/10/2010 27
Floating-Point Multiplication
Now consider a 4-digit binary example
1.0002 × 2–1 × –1.1102 × 2–2 (0.5 × –0.4375)
1. Add exponents
Unbiased: –1 + –2 = –3
Biased: (–1 + 127) + (–2 + 127) = –3 + 254 – 127 = –3 + 127
2. Multiply significands
1.0002 × 1.1102 = 1.1102 ⇒ 1.1102 × 2–3
3. Normalize result & check for over/underflow
1.1102 × 2–3 (no change) with no over/underflow
4. Round and renormalize if necessary
1.1102 × 2–3 (no change)
5. Determine sign: +ve × –ve ⇒ –ve
–1.1102 × 2–3 = –0.21875
12/10/2010 28
FP Arithmetic Hardware
FP multiplier is of similar complexity to FP
adder
But uses a multiplier for significands instead of an
adder
FP arithmetic hardware usually does
Addition, subtraction, multiplication, division,
reciprocal, square-root
FP ↔ integer conversion
Operations usually takes several cycles
Can be pipelined
12/10/2010 29
FP Instructions in MIPS
FP hardware is coprocessor 1
Adjunct processor that extends the ISA
Separate FP registers
32 single-precision: $f0, $f1, … $f31
Paired for double-precision: $f0/$f1, $f2/$f3, …
Release 2 of MIPs ISA supports 32 × 64-bit FP reg’s
Double-precision arithmetic
add.d, sub.d, mul.d, div.d
e.g., mul.d $f4, $f4, $f6
12/10/2010 32
FP Example: Array Multiplication
X=X+Y×Z
All 32 × 32 matrices, 64-bit double-precision elements
C code:
void mm (double x[][],
double y[][], double z[][]) {
int i, j, k;
for (i = 0; i! = 32; i = i + 1)
for (j = 0; j! = 32; j = j + 1)
for (k = 0; k! = 32; k = k + 1)
x[i][j] = x[i][j]
+ y[i][k] * z[k][j];
}
Addresses of x, y, z in $a0, $a1, $a2, and
i, j, k in $s0, $s1, $s2
12/10/2010 33
FP Example: Array Multiplication
MIPS code:
12/10/2010 35
Accurate Arithmetic
IEEE Std 754 specifies additional rounding control
Extra bits of precision (guard, round, sticky)
Choice of rounding modes
Allows programmer to fine-tune numerical behavior of a
computation
Not all FP units implement all options
Most programming languages and FP libraries just use
defaults
Trade-off between hardware complexity,
performance, and market requirements
12/10/2010 36
Interpretation of Data
Bits have no inherent meaning
Interpretation depends on the instructions applied
Computer representations of numbers
Finite range and precision
Need to account for this in programs
12/10/2010 37
Associativity
Parallel programs may interleave operations
in unexpected orders
Assumptions of associativity may fail
(x+y)+z x+(y+z)
x -1.50E+38 -1.50E+38
y 1.50E+38 0.00E+00
z 1.0 1.0 1.50E+38
1.00E+00 0.00E+00
12/10/2010 39
x86 FP Instructions
Data transfer Arithmetic Compare Transcendental
FILD mem/ST(i) FIADDP mem/ST(i) FICOMP FPATAN
FISTP mem/ST(i) FISUBRP mem/ST(i) FIUCOMP F2XMI
FLDPI FIMULP mem/ST(i) FSTSW AX/mem FCOS
FLD1 FIDIVRP mem/ST(i) FPTAN
FLDZ FSQRT FPREM
FABS FPSIN
FRNDINT FYL2X
Optional variations
I: integer operand
P: pop operand from stack
R: reverse operand order
But not all combinations allowed
12/10/2010 40
Streaming SIMD Extension 2
(SSE2)
Adds 4 × 128-bit registers
Extended to 8 registers in AMD64/EM64T
Can be used for multiple FP operands
2 × 64-bit double precision
4 × 32-bit double precision
Instructions operate on them simultaneously
Single-Instruction Multiple-Data
SSE3 (version 3) is now available
12/10/2010 41
SSE3 introduction
12/10/2010 42
SSE3 instructions
12/10/2010 43
SSE3 instructions (cont’d)
len: .double 23.45
result: .double 0.0
arr: .double 3.1,2.3,3.4,4.5,5.6
...
movsd len,%xmm0
movsd %xmm0,result
movsd arr(,1,8),%xmm1
12/10/2010 44
SSE3 instructions (cont’d)
12/10/2010 45
SSE3 instructions (cont’d)
len: .double 23.45
result: .double 0.0
arr: .double 3.1,2.3,3.4,4.5,5.6
...
movsd len,%xmm0
movsd arr(,1,8),%xmm1
add %xmm1,%xmm0
movsd %xmm0,result
12/10/2010 46
Exercises
Write a program to add two double numbers
and print the result on screen
Write a program to multiply two double
numbers and print the result on screen
Write a program to print the maximum
number of the two double numbers
Write a program to sum the elements of a
double array and print the result on screen
12/10/2010 47
Exercises (cont’d)
Write a program to solve the equation ax+b=0
Write a program to solve the equation ax2+bx+c=0
Write a program to print the first of n numbers of a
geometric sequence with a given value of a and r
Write a program to print the first of n number in an
arithmetic sequence with a given value of d and u
12/10/2010 48
Numeric types and conversions
12/10/2010 49
Linux 64bit C data model
LP64 16 32 64 64
12/10/2010 50
Numeric types and conversions
(cont’d)
Celcius to fahrenheit
double cel2fahr(double temp)
{
return 1.8 * temp + 32.0;
}
convert the above function into an assembly
procedure
12/10/2010 51
Exercises
void proc(long a1, long *a1p,int a2, int *a2p,
short a3, short *a3p,char a4, char *a4p)
{
*a1p += a1;
*a2p += a2;
*a3p += a3;
*a4p += a4;
}
Convert the above function into an assembly
procedure
12/10/2010 52
Exercises (cont’d)
double fcvt(int i, float *fp, double *dp, long *lp)
{
float f = *fp; double d = *dp; long l = *lp;
*lp = (long) d;
*fp = (float) i;
*dp = (double) l;
return (double) f;
}
Convert the above function into an assembly
procedure
12/10/2010 53
Exercises (cont’d)
double funct(double a,
float x, double b, int i)
{
return a*x - b/i;
}
Convert the above function into an
assembly procedure
12/10/2010 54
End of chapter
Happy coding!
Any questions?
12/10/2010 55