Download as pdf or txt
Download as pdf or txt
You are on page 1of 43

Carnegie Mellon

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1


Carnegie Mellon

Bits, Bytes and Integers – Part 1


15-213/18-213/14-513/15-513: Introduction to Computer Systems
2nd Lecture, May 26, 2021

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 2


Carnegie Mellon

Announcements
 Linux Boot Camp Thursday, 12:20 EDT

 Lab 0 will be available via course web page and Autolab.


▪ Release estimated at 5/26, 1pm
▪ Due Fri June 3 (or 4), 11:59pm
▪ No grace days
▪ No late submissions
▪ Just do it!

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 3


Carnegie Mellon

Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 4


Carnegie Mellon

Everything is bits
 Each bit is 0 or 1
 By encoding/interpreting sets of bits in various ways
▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
 Why bits? Electronic Implementation

An Amazing & Successful Abstraction.

(which we won’t dig into in 213)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 5


Carnegie Mellon

Everything is bits
 Each bit is 0 or 1
 By encoding/interpreting sets of bits in various ways
▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
 Why bits? Electronic Implementation
▪ Easy to store with bistable elements
▪ Reliably transmitted on noisy and inaccurate wires
0 1 0

1.1V
0.9V

0.2V
0.0V
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 6
Carnegie Mellon

For example, can count in binary


 Base 2 Number Representation
▪ Represent 1521310 as 111011011011012
▪ Represent 1.2010 as 1.0011001100110011[0011]…2
▪ Represent 1.5213 X 104 as 1.11011011011012 X 213

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 7


Carnegie Mellon

Encoding Byte Values


0 0 0000
 Byte = 8 bits 1 1 0001
▪ Binary 000000002 to 111111112 2 2 0010
3 3 0011
▪ Decimal: 010 to 25510 4 4 0100
▪ Hexadecimal 0016 to FF16 5 5 0101
6 6 0110
▪ Base 16 number representation 7 7 0111
8 8 1000
▪ Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ 9 9 1001
▪ Write FA1D37B16 in C as A 10 1010
B 11 1011
– 0xFA1D37B C 12 1100
– 0xfa1d37b D 13 1101
E 14 1110
F 15 1111

15213: 0011 1011 0110 1101


3 B 6 D
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 8
Carnegie Mellon

Example Data Representations


C Data Type Typical 32-bit Typical 64-bit x86-64

char 1 1 1
short 2 2 2
int 4 4 4
long 4 8 8
float 4 4 4
double 8 8 8

pointer 4 8 8

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 9


Carnegie Mellon

Example Data Representations


C Data Type Typical 32-bit Typical 64-bit x86-64

char 1 1 1
short 2 2 2
int 4 4 4
long 4 8 8
float 4 4 4
double 8 8 8

pointer 4 8 8

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 10


Carnegie Mellon

Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 11


Carnegie Mellon

Boolean Algebra
 Developed by George Boole in 19th Century
▪ Algebraic representation of logic
▪ Encode “True” as 1 and “False” as 0
And Or
◼ A&B = 1 when both A=1 and B=1 ◼ A|B = 1 when either A=1 or B=1

Not Exclusive-Or (Xor)


◼ ~A = 1 when A=0 ◼ A^B = 1 when either A=1 or B=1, but not both

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 12


Carnegie Mellon

General Boolean Algebras


 Operate on Bit Vectors
▪ Operations applied bitwise
01101001 01101001 01101001
& 01010101 | 01010101 ^ 01010101 ~ 01010101
01000001
01000001 01111101
01111101 00111100
00111100 10101010
10101010
 All of the Properties of Boolean Algebra Apply

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 13


Carnegie Mellon

Example: Representing & Manipulating Sets


 Representation
▪ Width w bit vector represents subsets of {0, …, w–1}
▪ aj = 1 if j ∈ A

▪ 01101001 { 0, 3, 5, 6 }
▪ 76543210

▪ 01010101 { 0, 2, 4, 6 }
▪ 76543210
 Operations
▪ & Intersection 01000001 { 0, 6 }
▪ | Union 01111101 { 0, 2, 3, 4, 5, 6 }
▪ ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
▪ ~ Complement 10101010 { 1, 3, 5, 7 }
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 14
Carnegie Mellon

Bit-Level Operations in C
 Operations &, |, ~, ^ Available in C
0 0 0000
▪ Apply to any “integral” data type 1 1 0001
▪ long, int, short, char, unsigned 2 2 0010
3 3 0011
▪ View arguments as bit vectors 4 4 0100
5 5 0101
▪ Arguments applied bit-wise 6 6 0110
 Examples (Char data type) 7 7 0111
8 8 1000
▪ ~0x41 → 0xBE 9 9 1001
~010000012 → 101111102
▪ A 10 1010
B 11 1011
▪ ~0x00 → 0xFF C 12 1100
▪ ~000000002 → 111111112 D 13 1101
E 14 1110
▪ 0x69 & 0x55 → 0x41 F 15 1111
▪ 011010012 & 010101012 → 010000012
▪ 0x69 | 0x55 → 0x7D
▪ 011010012 | 010101012 → 011111012

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 15


Carnegie Mellon

Bit-Level Operations in C
 Operations &, |, ~, ^ Available in C
0 0 0000
▪ Apply to any “integral” data type 1 1 0001
▪ long, int, short, char, unsigned 2 2 0010
3 3 0011
▪ View arguments as bit vectors 4 4 0100
5 5 0101
▪ Arguments applied bit-wise 6 6 0110
 Examples (Char data type) 7 7 0111
8 8 1000
▪ ~0x41 → 0xBE 9 9 1001
~0100 000122→
~01000001
▪ →10111110
1011 11102 2 A 10 1010
B 11 1011
▪ ~0x00 → 0xFF C 12 1100
▪ ~0000
~00000000
000022→→11111111
1111 11112 2 D 13 1101
E 14 1110
▪ 0x69 & 0x55 → 0x41 F 15 1111
▪ 0110
01101001
100122&&01010101
0101 01012 2→→01000001
0100 0001
2 2

▪ 0x69 | 0x55 → 0x7D


▪ 0110
01101001
100122||01010101
0101 01012 2→→01111101
0111 1101
2 2

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 16


Carnegie Mellon

Contrast: Logic Operations in C


 Contrast to Bit-Level Operators
▪ Logic Operations: &&, ||, !
▪ View 0 as “False”
▪ Anything nonzero as “True”
▪ Always return 0 or 1
▪ Early termination
 Examples (char data type)
Watch out for && vs. & (and || vs. |)…
▪ !0x41 → 0x00 Super common C programming pitfall!
▪ !0x00 → 0x01
▪ !!0x41→ 0x01

▪ 0x69 && 0x55 → 0x01


▪ 0x69 || 0x55 → 0x01
▪ p && *p (avoids null pointer access)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 17


Carnegie Mellon

Shift Operations
 Left Shift: x << y Argument x 01100010
▪ Shift bit-vector x left y positions << 3 00010000
– Throw away extra bits on left
Log. >> 2 00011000
▪ Fill with 0’s on right
Arith. >> 2 00011000
 Right Shift: x >> y
▪ Shift bit-vector x right y positions
Throw away extra bits on right
▪ Argument x 10100010

▪ Logical shift << 3 00010000


▪ Fill with 0’s on left Log. >> 2 00101000
▪ Arithmetic shift
Arith. >> 2 11101000
▪ Replicate most significant bit on left

 Undefined Behavior
▪ Shift amount < 0 or ≥ word size
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 18
Carnegie Mellon

Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings
 Summary

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 19


Carnegie Mellon

Encoding Integers
Unsigned Two’s Complement
w−1 w−2
B2U(X ) =  xi 2 i
B2T(X) = − xw−1 2 w−1
+  xi 2 i
i=0 i=0

short int x = 15213;


short int y = -15213; Sign Bit
 C does not mandate using two’s complement
▪ But, most machines do, and we will assume so
 C short 2 bytes long
Decimal Hex Binary
x 15213 3B 6D 00111011 01101101
y -15213 C4 93 11000100 10010011

 Sign Bit
▪ For 2’s complement, most significant bit indicates sign
▪ 0 for nonnegative
▪ 1 for negative 20
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Carnegie Mellon

Two-complement: Simple Example

-16 8 4 2 1
10 = 0 1 0 1 0 8+2 = 10

-16 8 4 2 1
-10 = 1 0 1 1 0 -16+4+2 = -10

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 21


Carnegie Mellon

Two-complement Encoding Example (Cont.)


x = 15213: 00111011 01101101
y = -15213: 11000100 10010011
Weight 15213 -15213
1 1 1 1 1
2 0 0 1 2
4 1 4 0 0
8 1 8 0 0
16 0 0 1 16
32 1 32 0 0
64 1 64 0 0
128 0 0 1 128
256 1 256 0 0
512 1 512 0 0
1024 0 0 1 1024
2048 1 2048 0 0
4096 1 4096 0 0
8192 1 8192 0 0
16384 0 0 1 16384
-32768 0 0 1 -32768
Sum 15213 -15213
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 22
Carnegie Mellon

Numeric Ranges
 Unsigned Values
 Two’s Complement Values
▪ UMin = 0
▪ TMin = –2w–1
000…0
100…0
▪ UMax = 2w –1
▪ TMax = 2w–1 – 1
111…1
011…1
▪ Minus 1
111…1

Values for W = 16
Decimal Hex Binary
UMax 65535 FF FF 11111111 11111111
TMax 32767 7F FF 01111111 11111111
TMin -32768 80 00 10000000 00000000
-1 -1 FF FF 11111111 11111111
0 0 00 00 00000000 00000000

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 23


Carnegie Mellon

Values for Different Word Sizes


W
8 16 32 64
UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615
TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807
TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808

 Observations  C Programming
▪ |TMin | = TMax + 1 ▪ #include <limits.h>
▪ Asymmetric range ▪ Declares constants, e.g.,
▪ UMax = 2 * TMax + 1 ▪ ULONG_MAX
▪ Question: abs(TMin)? ▪ LONG_MAX
▪ LONG_MIN
▪ Values platform specific

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 24


Carnegie Mellon

Unsigned & Signed Numeric Values


X B2U(X) B2T(X)  Equivalence
0000 0 0 ▪ Same encodings for nonnegative
0001 1 1 values
0010 2 2
 Uniqueness
0011 3 3
0100 4 4 ▪ Every bit pattern represents
0101 5 5 unique integer value
0110 6 6 ▪ Each representable integer has
0111 7 7 unique bit encoding
1000 8 –8   Can Invert Mappings
1001 9 –7
1010 10 –6
▪ U2B(x) = B2U-1(x)
1011 11 –5 ▪Bit pattern for unsigned
1100 12 –4 integer
1101 13 –3 ▪ T2B(x) = B2T-1(x)
1110 14 –2 ▪ Bit pattern for two’s comp
1111 15 –1 integer
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 25
Carnegie Mellon

Quiz Time!

Check out:

https://canvas.cmu.edu/courses/23122/quizzes/61
559

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 26


Carnegie Mellon

Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 27


Carnegie Mellon

Mapping Between Signed & Unsigned

Two’s Complement Unsigned


T2U
x T2B B2U ux
X
Maintain Same Bit Pattern

Unsigned U2T Two’s Complement


ux U2B B2T x
X
Maintain Same Bit Pattern

 Mappings between unsigned and two’s complement numbers:


Keep bit representations and reinterpret
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 28
Carnegie Mellon

Mapping Signed  Unsigned


Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
T2U
0110 6 6
0111 7 U2T 7
1000 -8 8
1001 -7 9
1010 -6 10
1011 -5 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 29
Carnegie Mellon

Mapping Signed  Unsigned


Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4
= 4
0101 5 5
0110 6 6
0111 7 7
1000 -8 8
1001 -7 9
1010 -6 10
1011 -5
+/- 16 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 30
Carnegie Mellon

Relation between Signed & Unsigned

Two’s Complement Unsigned


T2U
x T2B B2U ux
X
Maintain Same Bit Pattern

w–1 0
ux + + + ••• +++
x - ++ ••• +++

Large negative weight


becomes
Large positive weight
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 31
Carnegie Mellon

Conversion Visualized
 2’s Comp. → Unsigned
▪ Ordering Inversion UMax
▪ Negative → Big Positive
UMax – 1

TMax + 1 Unsigned
TMax TMax Range

2’s Complement
0 0
Range
–1
–2

TMin
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 32
Carnegie Mellon

Signed vs. Unsigned in C


 Constants
▪ By default are considered to be signed integers
▪ Unsigned if have “U” as suffix
0U, 4294967259U
 Casting
▪ Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;

▪ Implicit casting also occurs via assignments and procedure calls


tx = ux; int fun(unsigned u);
uy = ty; uy = fun(tx);

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 33


Carnegie Mellon

Casting Surprises
 Expression Evaluation
▪ If there is a mix of unsigned and signed in single expression,
signed values implicitly cast to unsigned
▪ Including comparison operations <, >, ==, <=, >=
▪ Examples for W = 32: TMIN = -2,147,483,648 , TMAX = 2,147,483,647
 Constant1 Constant2 Relation Evaluation
0 0 0U
0U == unsigned
-1 -1 00 < signed
-1 -1 0U
0U > unsigned
2147483647
2147483647 -2147483647-1
-2147483648 > signed
2147483647U
2147483647U -2147483647-1
-2147483648 < unsigned
-1 -1 -2
-2 > signed
(unsigned)-1
(unsigned) -1 -2
-2 > unsigned
2147483647
2147483647 2147483648U
2147483648U < unsigned
2147483647
2147483647 (int)2147483648U
(int) 2147483648U > signed
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 34
Carnegie Mellon

Summary
Casting Signed ↔ Unsigned: Basic Rules
 Bit pattern is maintained
 But reinterpreted
 Can have unexpected effects: adding or subtracting 2w

 Expression containing signed and unsigned int


▪ int is cast to unsigned!!

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 36


Carnegie Mellon

Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 37


Carnegie Mellon

Sign Extension
 Task:
▪ Given w-bit signed integer x
▪ Convert it to w+k-bit integer with same value
 Rule:
▪ Make k copies of sign bit:
▪ X  = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0

k copies of MSB w
X •••

•••

X ••• •••
k
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
w 38
Carnegie Mellon

Sign Extension: Simple Example

Positive number Negative number

-16 8 4 2 1 -16 8 4 2 1

10 = 0 1 0 1 0 -10 = 1 0 1 1 0

-32 16 8 4 2 1 -32 16 8 4 2 1

10 = 0 0 1 0 1 0 -10 = 1 1 0 1 1 0

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 39


Carnegie Mellon

Larger Sign Extension Example


short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;

Decimal Hex Binary


x 15213 3B 6D 00111011 01101101
ix 15213 00 00 3B 6D 00000000 00000000 00111011 01101101
y -15213 C4 93 11000100 10010011
iy -15213 FF FF C4 93 11111111 11111111 11000100 10010011

 Converting from smaller to larger integer data type


 C automatically performs sign extension

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 40


Carnegie Mellon

Truncation
 Task:
▪ Given k+w-bit signed or unsigned integer X
▪ Convert it to w-bit integer X’ with same value for “small enough” X
 Rule:
▪ Drop top k bits:
▪ X  = xw–1 , xw–2 ,…, x0

k w
X ••• •••

•••

X •••
w 41
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Carnegie Mellon

Truncation: Simple Example


No sign change Sign change
-16 8 4 2 1 -16 8 4 2 1

2 = 0 0 0 1 0 10 = 0 1 0 1 0

-8 4 2 1 -8 4 2 1

2 = 0 0 1 0 -6 = 1 0 1 0
2 mod 16 = 2 10 mod 16 = 10U mod 16 = 10U = -6

-16 8 4 2 1 -16 8 4 2 1

-6 = 1 1 0 1 0 -10 = 1 0 1 1 0

-8 4 2 1 -8 4 2 1

-6 = 1 0 1 0 6 = 0 1 1 0
-6 mod 16 = 26U mod 16 = 10U = -6 -10 mod 16 = 22U mod 16 = 6U = 6

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 42


Carnegie Mellon

Summary:
Expanding, Truncating: Basic Rules
 Expanding (e.g., short int to int)
▪ Unsigned: zeros added
▪ Signed: sign extension
▪ Both yield expected result

 Truncating (e.g., unsigned to unsigned short)


▪ Unsigned/signed: bits are truncated
▪ Result reinterpreted
▪ Unsigned: mod operation
▪ Signed: similar to mod
▪ For small (in magnitude) numbers yields expected behavior

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 43


Carnegie Mellon

Summary of Today: Bits, Bytes, and Integers


 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
 Representations in memory, pointers, strings
 Summary

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 44

You might also like