02 Bits Ints Part1

Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1

Carnegie Mellon
Bits, Bytes and Integers – Part 1

15-213/18-213/14-513/15-513: Introduction to Computer Systems
2nd Lecture, May 26, 2021

Carnegie Mellon
Announcements
 Linux Boot Camp Thursday, 12:20 EDT
 Lab 0 will be available via course web page and Autolab.

▪ Release estimated at 5/26, 1pm
▪ Due Fri June 3 (or 4), 11:59pm
▪ No grace days
▪ No late submissions
▪ Just do it!

Carnegie Mellon
Today: Bits, Bytes, and Integers

 Representing information as bits
 Bit-level manipulations
 Integers
▪ Representation: unsigned and signed
▪ Conversion, casting
▪ Expanding, truncating
▪ Addition, negation, multiplication, shifting
▪ Summary
 Representations in memory, pointers, strings

Carnegie Mellon
Everything is bits
 Each bit is 0 or 1
 By encoding/interpreting sets of bits in various ways
▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
 Why bits? Electronic Implementation
An Amazing & Successful Abstraction.
(which we won’t dig into in 213)

Carnegie Mellon
Everything is bits
 Each bit is 0 or 1
 By encoding/interpreting sets of bits in various ways
▪ Computers determine what to do (instructions)
▪ … and represent and manipulate numbers, sets, strings, etc…
 Why bits? Electronic Implementation
▪ Easy to store with bistable elements
▪ Reliably transmitted on noisy and inaccurate wires
0 1 0
1.1V
0.9V
0.2V
0.0V
Carnegie Mellon
For example, can count in binary

 Base 2 Number Representation
▪ Represent 1521310 as 111011011011012
▪ Represent 1.2010 as 1.0011001100110011[0011]…2
▪ Represent 1.5213 X 104 as 1.11011011011012 X 213

Carnegie Mellon
Encoding Byte Values

0 0 0000
 Byte = 8 bits 1 1 0001
▪ Binary 000000002 to 111111112 2 2 0010
3 3 0011
▪ Decimal: 010 to 25510 4 4 0100
▪ Hexadecimal 0016 to FF16 5 5 0101
6 6 0110
▪ Base 16 number representation 7 7 0111
8 8 1000
▪ Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ 9 9 1001
▪ Write FA1D37B16 in C as A 10 1010
B 11 1011
– 0xFA1D37B C 12 1100
– 0xfa1d37b D 13 1101
E 14 1110
F 15 1111
15213: 0011 1011 0110 1101

3 B 6 D
Carnegie Mellon
Example Data Representations

C Data Type Typical 32-bit Typical 64-bit x86-64
char 1 1 1
short 2 2 2
int 4 4 4
long 4 8 8
float 4 4 4
double 8 8 8
pointer 4 8 8

Carnegie Mellon
Example Data Representations

C Data Type Typical 32-bit Typical 64-bit x86-64
char 1 1 1
short 2 2 2
int 4 4 4
long 4 8 8
float 4 4 4
double 8 8 8
pointer 4 8 8

Carnegie Mellon

 Integers
▪ Summary

Carnegie Mellon
Boolean Algebra
 Developed by George Boole in 19th Century
▪ Algebraic representation of logic
▪ Encode “True” as 1 and “False” as 0
And Or
◼ A&B = 1 when both A=1 and B=1 ◼ A|B = 1 when either A=1 or B=1
Not Exclusive-Or (Xor)

◼ ~A = 1 when A=0 ◼ A^B = 1 when either A=1 or B=1, but not both

Carnegie Mellon
General Boolean Algebras

 Operate on Bit Vectors
▪ Operations applied bitwise
01101001 01101001 01101001
& 01010101 | 01010101 ^ 01010101 ~ 01010101
01000001
01000001 01111101
01111101 00111100
00111100 10101010
10101010
 All of the Properties of Boolean Algebra Apply

Carnegie Mellon
Example: Representing & Manipulating Sets

 Representation
▪ Width w bit vector represents subsets of {0, …, w–1}
▪ aj = 1 if j ∈ A
▪ 01101001 { 0, 3, 5, 6 }
▪ 76543210
▪ 01010101 { 0, 2, 4, 6 }
▪ 76543210
 Operations
▪ & Intersection 01000001 { 0, 6 }
▪ | Union 01111101 { 0, 2, 3, 4, 5, 6 }
▪ ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
▪ ~ Complement 10101010 { 1, 3, 5, 7 }
Carnegie Mellon
Bit-Level Operations in C
 Operations &, |, ~, ^ Available in C
0 0 0000
▪ Apply to any “integral” data type 1 1 0001
▪ long, int, short, char, unsigned 2 2 0010
3 3 0011
▪ View arguments as bit vectors 4 4 0100
5 5 0101
▪ Arguments applied bit-wise 6 6 0110
 Examples (Char data type) 7 7 0111
8 8 1000
▪ ~0x41 → 0xBE 9 9 1001
~010000012 → 101111102
▪ A 10 1010
B 11 1011
▪ ~0x00 → 0xFF C 12 1100
▪ ~000000002 → 111111112 D 13 1101
E 14 1110
▪ 0x69 & 0x55 → 0x41 F 15 1111
▪ 011010012 & 010101012 → 010000012
▪ 0x69 | 0x55 → 0x7D
▪ 011010012 | 010101012 → 011111012

Carnegie Mellon
Bit-Level Operations in C
 Operations &, |, ~, ^ Available in C
0 0 0000
▪ Apply to any “integral” data type 1 1 0001
▪ long, int, short, char, unsigned 2 2 0010
3 3 0011
▪ View arguments as bit vectors 4 4 0100
5 5 0101
▪ Arguments applied bit-wise 6 6 0110
 Examples (Char data type) 7 7 0111
8 8 1000
▪ ~0x41 → 0xBE 9 9 1001
~0100 000122→
~01000001
▪ →10111110
1011 11102 2 A 10 1010
B 11 1011
▪ ~0x00 → 0xFF C 12 1100
▪ ~0000
~00000000
000022→→11111111
1111 11112 2 D 13 1101
E 14 1110
▪ 0x69 & 0x55 → 0x41 F 15 1111
▪ 0110
01101001
100122&&01010101
0101 01012 2→→01000001
0100 0001
2 2
▪ 0x69 | 0x55 → 0x7D

▪ 0110
01101001
100122||01010101
0101 01012 2→→01111101
0111 1101
2 2

Carnegie Mellon
Contrast: Logic Operations in C

 Contrast to Bit-Level Operators
▪ Logic Operations: &&, ||, !
▪ View 0 as “False”
▪ Anything nonzero as “True”
▪ Always return 0 or 1
▪ Early termination
 Examples (char data type)
Watch out for && vs. & (and || vs. |)…
▪ !0x41 → 0x00 Super common C programming pitfall!
▪ !0x00 → 0x01
▪ !!0x41→ 0x01
▪ 0x69 && 0x55 → 0x01

▪ 0x69 || 0x55 → 0x01
▪ p && *p (avoids null pointer access)

Carnegie Mellon
Shift Operations
 Left Shift: x << y Argument x 01100010
▪ Shift bit-vector x left y positions << 3 00010000
– Throw away extra bits on left
Log. >> 2 00011000
▪ Fill with 0’s on right
Arith. >> 2 00011000
 Right Shift: x >> y
▪ Shift bit-vector x right y positions
Throw away extra bits on right
▪ Argument x 10100010
▪ Logical shift << 3 00010000

▪ Fill with 0’s on left Log. >> 2 00101000
▪ Arithmetic shift
Arith. >> 2 11101000
▪ Replicate most significant bit on left
 Undefined Behavior
▪ Shift amount < 0 or ≥ word size
Carnegie Mellon

 Integers
▪ Summary
 Summary

Carnegie Mellon
Encoding Integers
Unsigned Two’s Complement
w−1 w−2
B2U(X ) =  xi 2 i
B2T(X) = − xw−1 2 w−1
+  xi 2 i
i=0 i=0
short int x = 15213;

short int y = -15213; Sign Bit
 C does not mandate using two’s complement
▪ But, most machines do, and we will assume so
 C short 2 bytes long
Decimal Hex Binary
x 15213 3B 6D 00111011 01101101
y -15213 C4 93 11000100 10010011
 Sign Bit
▪ For 2’s complement, most significant bit indicates sign
▪ 0 for nonnegative
▪ 1 for negative 20
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Carnegie Mellon
Two-complement: Simple Example
-16 8 4 2 1
10 = 0 1 0 1 0 8+2 = 10
-16 8 4 2 1
-10 = 1 0 1 1 0 -16+4+2 = -10

Carnegie Mellon
Two-complement Encoding Example (Cont.)

x = 15213: 00111011 01101101
y = -15213: 11000100 10010011
Weight 15213 -15213
1 1 1 1 1
2 0 0 1 2
4 1 4 0 0
8 1 8 0 0
16 0 0 1 16
32 1 32 0 0
64 1 64 0 0
128 0 0 1 128
256 1 256 0 0
512 1 512 0 0
1024 0 0 1 1024
2048 1 2048 0 0
4096 1 4096 0 0
8192 1 8192 0 0
16384 0 0 1 16384
-32768 0 0 1 -32768
Sum 15213 -15213
Carnegie Mellon
Numeric Ranges
 Unsigned Values
 Two’s Complement Values
▪ UMin = 0
▪ TMin = –2w–1
000…0
100…0
▪ UMax = 2w –1
▪ TMax = 2w–1 – 1
111…1
011…1
▪ Minus 1
111…1
Values for W = 16
Decimal Hex Binary
UMax 65535 FF FF 11111111 11111111
TMax 32767 7F FF 01111111 11111111
TMin -32768 80 00 10000000 00000000
-1 -1 FF FF 11111111 11111111
0 0 00 00 00000000 00000000

Carnegie Mellon
Values for Different Word Sizes

W
8 16 32 64
UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615
TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807
TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808
 Observations  C Programming
▪ |TMin | = TMax + 1 ▪ #include <limits.h>
▪ Asymmetric range ▪ Declares constants, e.g.,
▪ UMax = 2 * TMax + 1 ▪ ULONG_MAX
▪ Question: abs(TMin)? ▪ LONG_MAX
▪ LONG_MIN
▪ Values platform specific

Carnegie Mellon
Unsigned & Signed Numeric Values

X B2U(X) B2T(X)  Equivalence
0000 0 0 ▪ Same encodings for nonnegative
0001 1 1 values
0010 2 2
 Uniqueness
0011 3 3
0100 4 4 ▪ Every bit pattern represents
0101 5 5 unique integer value
0110 6 6 ▪ Each representable integer has
0111 7 7 unique bit encoding
1000 8 –8   Can Invert Mappings
1001 9 –7
1010 10 –6
▪ U2B(x) = B2U-1(x)
1011 11 –5 ▪Bit pattern for unsigned
1100 12 –4 integer
1101 13 –3 ▪ T2B(x) = B2T-1(x)
1110 14 –2 ▪ Bit pattern for two’s comp
1111 15 –1 integer
Carnegie Mellon
Quiz Time!
Check out:
https://canvas.cmu.edu/courses/23122/quizzes/61
559

Carnegie Mellon

 Integers
▪ Summary

Carnegie Mellon
Mapping Between Signed & Unsigned
Two’s Complement Unsigned

T2U
x T2B B2U ux
X
Maintain Same Bit Pattern
Unsigned U2T Two’s Complement

ux U2B B2T x
X
 Mappings between unsigned and two’s complement numbers:

Keep bit representations and reinterpret
Carnegie Mellon
Mapping Signed  Unsigned

Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
T2U
0110 6 6
0111 7 U2T 7
1000 -8 8
1001 -7 9
1010 -6 10
1011 -5 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Carnegie Mellon
Mapping Signed  Unsigned

Bits Signed Unsigned
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4
= 4
0101 5 5
0110 6 6
0111 7 7
1000 -8 8
1001 -7 9
1010 -6 10
1011 -5
+/- 16 11
1100 -4 12
1101 -3 13
1110 -2 14
1111 -1 15
Carnegie Mellon
Relation between Signed & Unsigned
Two’s Complement Unsigned

T2U
x T2B B2U ux
X
w–1 0
ux + + + ••• +++
x - ++ ••• +++
Large negative weight

becomes
Large positive weight
Carnegie Mellon
Conversion Visualized
 2’s Comp. → Unsigned
▪ Ordering Inversion UMax
▪ Negative → Big Positive
UMax – 1
TMax + 1 Unsigned
TMax TMax Range
2’s Complement
0 0
Range
–1
–2
TMin
Carnegie Mellon
Signed vs. Unsigned in C

 Constants
▪ By default are considered to be signed integers
▪ Unsigned if have “U” as suffix
0U, 4294967259U
 Casting
▪ Explicit casting between signed & unsigned same as U2T and T2U
int tx, ty;
unsigned ux, uy;
tx = (int) ux;
uy = (unsigned) ty;
▪ Implicit casting also occurs via assignments and procedure calls

tx = ux; int fun(unsigned u);
uy = ty; uy = fun(tx);

Carnegie Mellon
Casting Surprises
 Expression Evaluation
▪ If there is a mix of unsigned and signed in single expression,
signed values implicitly cast to unsigned
▪ Including comparison operations <, >, ==, <=, >=
▪ Examples for W = 32: TMIN = -2,147,483,648 , TMAX = 2,147,483,647
 Constant1 Constant2 Relation Evaluation
0 0 0U
0U == unsigned
-1 -1 00 < signed
-1 -1 0U
0U > unsigned
2147483647
2147483647 -2147483647-1
-2147483648 > signed
2147483647U
2147483647U -2147483647-1
-2147483648 < unsigned
-1 -1 -2
-2 > signed
(unsigned)-1
(unsigned) -1 -2
-2 > unsigned
2147483647
2147483647 2147483648U
2147483648U < unsigned
2147483647
2147483647 (int)2147483648U
(int) 2147483648U > signed
Carnegie Mellon
Summary
Casting Signed ↔ Unsigned: Basic Rules
 Bit pattern is maintained
 But reinterpreted
 Can have unexpected effects: adding or subtracting 2w
 Expression containing signed and unsigned int

▪ int is cast to unsigned!!

Carnegie Mellon

 Integers
▪ Summary

Carnegie Mellon
Sign Extension
 Task:
▪ Given w-bit signed integer x
▪ Convert it to w+k-bit integer with same value
 Rule:
▪ Make k copies of sign bit:
▪ X  = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0
k copies of MSB w
X •••
•••
X ••• •••
k
w 38
Carnegie Mellon
Sign Extension: Simple Example
Positive number Negative number
-16 8 4 2 1 -16 8 4 2 1
10 = 0 1 0 1 0 -10 = 1 0 1 1 0
-32 16 8 4 2 1 -32 16 8 4 2 1
10 = 0 0 1 0 1 0 -10 = 1 1 0 1 1 0

Carnegie Mellon
Larger Sign Extension Example

short int x = 15213;
int ix = (int) x;
short int y = -15213;
int iy = (int) y;
Decimal Hex Binary

x 15213 3B 6D 00111011 01101101
ix 15213 00 00 3B 6D 00000000 00000000 00111011 01101101
y -15213 C4 93 11000100 10010011
iy -15213 FF FF C4 93 11111111 11111111 11000100 10010011
 Converting from smaller to larger integer data type

 C automatically performs sign extension

Carnegie Mellon
Truncation
 Task:
▪ Given k+w-bit signed or unsigned integer X
▪ Convert it to w-bit integer X’ with same value for “small enough” X
 Rule:
▪ Drop top k bits:
▪ X  = xw–1 , xw–2 ,…, x0
k w
X ••• •••
•••
X •••
w 41
Carnegie Mellon
Truncation: Simple Example

No sign change Sign change
-16 8 4 2 1 -16 8 4 2 1
2 = 0 0 0 1 0 10 = 0 1 0 1 0
-8 4 2 1 -8 4 2 1
2 = 0 0 1 0 -6 = 1 0 1 0
2 mod 16 = 2 10 mod 16 = 10U mod 16 = 10U = -6
-16 8 4 2 1 -16 8 4 2 1
-6 = 1 1 0 1 0 -10 = 1 0 1 1 0
-8 4 2 1 -8 4 2 1
-6 = 1 0 1 0 6 = 0 1 1 0
-6 mod 16 = 26U mod 16 = 10U = -6 -10 mod 16 = 22U mod 16 = 6U = 6

Carnegie Mellon
Summary:
Expanding, Truncating: Basic Rules
 Expanding (e.g., short int to int)
▪ Unsigned: zeros added
▪ Signed: sign extension
▪ Both yield expected result
 Truncating (e.g., unsigned to unsigned short)

▪ Unsigned/signed: bits are truncated
▪ Result reinterpreted
▪ Unsigned: mod operation
▪ Signed: similar to mod
▪ For small (in magnitude) numbers yields expected behavior

Carnegie Mellon
Summary of Today: Bits, Bytes, and Integers

 Integers
 Summary

02 Bits Ints Part1

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

02 Bits Ints Part1

Uploaded by

Copyright:

Available Formats

Carnegie Mellon

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1

Bits, Bytes and Integers – Part 1

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 2

 Lab 0 will be available via course web page and Autolab.

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 3

Today: Bits, Bytes, and Integers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 4

An Amazing & Successful Abstraction.

(which we won’t dig into in 213)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 5

For example, can count in binary

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 7

Encoding Byte Values

15213: 0011 1011 0110 1101

Example Data Representations

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 9

Example Data Representations

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 10

Today: Bits, Bytes, and Integers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 11

Not Exclusive-Or (Xor)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 12

General Boolean Algebras

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 13

Example: Representing & Manipulating Sets

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 15

▪ 0x69 | 0x55 → 0x7D

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 16

Contrast: Logic Operations in C

▪ 0x69 && 0x55 → 0x01

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 17

▪ Logical shift << 3 00010000

Today: Bits, Bytes, and Integers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 19

short int x = 15213;

Two-complement: Simple Example

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 21

Two-complement Encoding Example (Cont.)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 23

Values for Different Word Sizes

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 24

Unsigned & Signed Numeric Values

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 26

Today: Bits, Bytes, and Integers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 27

Mapping Between Signed & Unsigned

Two’s Complement Unsigned

Unsigned U2T Two’s Complement

 Mappings between unsigned and two’s complement numbers:

Mapping Signed  Unsigned

Mapping Signed  Unsigned

Relation between Signed & Unsigned

Two’s Complement Unsigned

Large negative weight

Signed vs. Unsigned in C

▪ Implicit casting also occurs via assignments and procedure calls

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 33

 Expression containing signed and unsigned int

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 36

Today: Bits, Bytes, and Integers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 37