Understanding Endianness


ENDIANNESS

What is Endianness?
Endianness refers to the order in which bytes are arranged within larger data types when stored in memory or
when transmitted over networks. It determines how data is interpreted and processed by a computer system.
Why do you need to understand Endianness?
Endianness matters whenever multi-byte data crosses a boundary between hardware architectures, file formats,
or network peers. If the reader assumes a different byte order than the writer, values are silently misread,
which leads to data corruption, hard-to-trace bugs, and system crashes. Handling data correctly across
different systems and networks therefore requires knowing the byte order on both sides.
What are the types of Endianness?
Little-Endian:
In a little-endian system, the least significant byte (LSB) is stored at the smallest memory address. This means
that for a multi-byte data type, the lowest-order byte is stored first, followed by the next more
significant byte, and so on, so the LSB sits at offset 0.
Example: For the 4-byte hexadecimal number 0x12345678, the memory representation in a little-endian system
would be:

Address:  0x00  0x01  0x02  0x03
Value:    0x78  0x56  0x34  0x12

This order places the LSB (78) at the smallest address (0x00).

Yashwanth Naidu Tikkisetty


Big-Endian:
In a big-endian system, the most significant byte (MSB) is stored at the smallest memory address. This means
that for a multi-byte data type, the highest-order byte is stored first, followed by the next less
significant byte, and so on, so the MSB sits at offset 0.
Example: For the 4-byte hexadecimal number 0x12345678, the memory representation in a big-endian system
would be:

Address:  0x00  0x01  0x02  0x03
Value:    0x12  0x34  0x56  0x78

This order places the MSB (12) at the smallest address (0x00).
The terms "little-endian" and "big-endian" originate from Jonathan Swift's "Gulliver's Travels," where they
describe factions that broke their eggs at different ends. In computing, these terms were popularized by
computer scientist Danny Cohen in his 1980 paper "On Holy Wars and a Plea for Peace," which discussed the
challenges of byte ordering in network communication.

Early Computing Systems


Early computing systems adopted different endianness based on their design philosophies and intended
applications. For example:
• IBM Mainframes: Used big-endian format to align with human-readable byte order.
• DEC VAX: Used little-endian format to simplify hardware design and arithmetic operations.
• Network Protocols: Standardized on big-endian to ensure compatibility across diverse systems.

Bit Order

Byte order does not change the bits inside a byte. Individual bits are not separately addressable in memory, so
each byte is handled as a unit, whether the system is big-endian or little-endian. By convention, the value of a
byte is written with the most significant bit (MSB) on the left and the least significant bit (LSB) on the right.

Example: For the hexadecimal value 4F:

• The binary representation is 01001111.


• This bit order is consistent whether the byte is stored first or last in memory, according to the system's
byte order.

How is Endianness Determined?

Endianness is determined by the architecture of the CPU and the conventions used in the software and protocols
that the system employs.

• CPU Architecture: Some CPUs are hardwired to use a specific endianness (e.g., x86 uses little-endian),
while others, like ARM, can operate in either mode (bi-endian).


• Software Conventions: Software, especially low-level system software like operating systems and
drivers, often adheres to the endianness dictated by the CPU architecture.

• Network Protocols: Network protocols standardize on big-endian format (network byte order) to ensure
consistent data interpretation across diverse systems.

How to programmatically determine the Endianness of a system?


#include <stdio.h>

const char *detect_endianness(void) {
    unsigned int x = 1;
    /* Look at the byte stored at the lowest address of x. */
    unsigned char *c = (unsigned char *)&x;
    return (*c) ? "Little-endian" : "Big-endian";
}

int main(void) {
    printf("System is %s\n", detect_endianness());
    return 0;
}

The function stores the value 1 in an unsigned integer (unsigned int x = 1) and checks its first byte through a
byte pointer (unsigned char *c = (unsigned char *)&x). If the first byte is 1, the system is little-endian;
otherwise, it is big-endian.
How to Handle Endianness in Network Protocols?
Network protocols typically use big-endian format (network byte order) to ensure consistent data interpretation
across different systems. When sending data over a network, you convert it from host byte order to network
byte order using functions like htonl and htons. Similarly, when receiving data, you convert it from network
byte order to host byte order using ntohl and ntohs.
What is Bi-Endian?
Bi-endian processors can operate in either little-endian or big-endian mode. This flexibility allows them to
interact with different systems and networks seamlessly. An example of a bi-endian processor is the ARM
architecture, which can be configured to use either endianness depending on the application requirements.
How can we convert a 32-bit integer from Little-Endian to Big-Endian?
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_endian(uint32_t val) {
    return ((val >> 24) & 0x000000FF) |
           ((val >> 8) & 0x0000FF00) |
           ((val << 8) & 0x00FF0000) |
           ((val << 24) & 0xFF000000);
}

int main(void) {
    uint32_t little_endian = 0x12345678;
    uint32_t big_endian = swap_endian(little_endian);
    printf("Little-endian: 0x%08" PRIx32 "\n", little_endian);
    printf("Big-endian: 0x%08" PRIx32 "\n", big_endian);
    return 0;
}

Output:

Little-endian: 0x12345678
Big-endian: 0x78563412

What issues might a Floating point number cause for Endianness?


Floating-point numbers are stored according to the IEEE 754 standard, which defines the format but not the
endianness. When transferring floating-point numbers between systems with different endianness, it's essential
to convert the byte order correctly. Misinterpreting the endianness can lead to incorrect values, such as a 1.23 in
little-endian being interpreted as a completely different value in big-endian.
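When sending data over a network, we sometimes need to convert a 64-bit integer from host byte order (which
may be little-endian) to big-endian. Since htonl handles only 32 bits, the value is converted in two halves:
#include <arpa/inet.h> /* htonl (POSIX) */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Named hton64 because some platforms already provide their own htonll. */
uint64_t hton64(uint64_t val) {
    if (htonl(1) == 1) {
        return val; /* Big-endian host: already in network byte order. */
    }
    uint32_t high_part = htonl((uint32_t)(val >> 32));
    uint32_t low_part = htonl((uint32_t)(val & 0xFFFFFFFF));
    /* Little-endian host: the two byte-swapped halves also trade places. */
    return ((uint64_t)low_part << 32) | high_part;
}

int main(void) {
    uint64_t host_val = 0x123456789ABCDEF0ULL;
    uint64_t network_val = hton64(host_val);
    printf("Host byte order: 0x%016" PRIx64 "\n", host_val);
    printf("Network byte order: 0x%016" PRIx64 "\n", network_val);
    return 0;
}

Output (on a little-endian host):

Host byte order: 0x123456789abcdef0
Network byte order: 0xf0debc9a78563412
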
Given a byte array in little-endian format, the following C function reads a 32-bit integer from the array and
returns it in host byte order. Because it uses shifts instead of pointer casts, it works on any host:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

uint32_t read_little_endian(const uint8_t *buffer) {
    /* buffer[0] is the least significant byte, buffer[3] the most significant. */
    return (uint32_t)buffer[0] |
           ((uint32_t)buffer[1] << 8) |
           ((uint32_t)buffer[2] << 16) |
           ((uint32_t)buffer[3] << 24);
}

int main(void) {
    uint8_t little_endian_data[4] = {0x78, 0x56, 0x34, 0x12};
    uint32_t host_val = read_little_endian(little_endian_data);
    printf("Read value: 0x%08" PRIx32 "\n", host_val);
    return 0;
}

Output:

Read value: 0x12345678

Processors and Endianness:


• x86 family (Intel and AMD): Little-endian
• ARM: Little-endian by default (most ARM cores are bi-endian)
• RISC-V: Little-endian (bi-endian capable)
• MIPS: Little-endian (bi-endian capable)
• IBM mainframes (System/360, System/370, z/Architecture): Big-endian
• Motorola 68k family: Big-endian
• SPARC (Scalable Processor Architecture): Big-endian (bi-endian capable)
• PowerPC (older versions): Big-endian (modern versions are bi-endian)
• Internet protocols (IP, TCP, UDP): Big-endian (network byte order)

Bonus:
How to transmit a 24-bit Integer as a 32-bit Integer?
- Packing a 24-bit Integer into a 32-bit Integer
• Little-endian Format: The 24-bit value is packed into the lower 3 bytes of the 32-bit container, with the
most significant byte (MSB) set to zero.
• Big-endian Format: The 24-bit value is packed into the upper 3 bytes of the 32-bit container, with the
least significant byte (LSB) set to zero.
- Unpacking a 24-bit Integer from a 32-bit Integer
• Little-endian Format: Extract the lower 3 bytes from the 32-bit container.
• Big-endian Format: Extract the upper 3 bytes from the 32-bit container.


Little Endian Packing and Unpacking:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Pack a 24-bit integer into a 32-bit integer (little-endian)
uint32_t pack_24_to_32_little(uint32_t val) {
    return val & 0x00FFFFFF; // Mask to ensure only the lower 24 bits are used
}

// Unpack a 24-bit integer from a 32-bit integer (little-endian)
uint32_t unpack_24_from_32_little(uint32_t val) {
    return val & 0x00FFFFFF; // Extract the lower 24 bits
}

int main(void) {
    uint32_t val_24 = 0x123456; // Example 24-bit value
    uint32_t packed_32 = pack_24_to_32_little(val_24);
    uint32_t unpacked_24 = unpack_24_from_32_little(packed_32);

    printf("Original 24-bit value: 0x%06" PRIx32 "\n", val_24);
    printf("Packed 32-bit value: 0x%08" PRIx32 "\n", packed_32);
    printf("Unpacked 24-bit value: 0x%06" PRIx32 "\n", unpacked_24);

    return 0;
}

Output:

Original 24-bit value: 0x123456
Packed 32-bit value: 0x00123456
Unpacked 24-bit value: 0x123456

Big-Endian Packing and Unpacking:


#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Pack a 24-bit integer into a 32-bit integer (big-endian)
uint32_t pack_24_to_32_big(uint32_t val) {
    return (val << 8) & 0xFFFFFF00; // Shift left and mask to ensure the upper 24 bits are used
}

// Unpack a 24-bit integer from a 32-bit integer (big-endian)
uint32_t unpack_24_from_32_big(uint32_t val) {
    return (val >> 8) & 0x00FFFFFF; // Shift right and extract the upper 24 bits
}

int main(void) {
    uint32_t val_24 = 0x123456; // Example 24-bit value
    uint32_t packed_32 = pack_24_to_32_big(val_24);
    uint32_t unpacked_24 = unpack_24_from_32_big(packed_32);

    printf("Original 24-bit value: 0x%06" PRIx32 "\n", val_24);
    printf("Packed 32-bit value: 0x%08" PRIx32 "\n", packed_32);
    printf("Unpacked 24-bit value: 0x%06" PRIx32 "\n", unpacked_24);

    return 0;
}

Output:

Original 24-bit value: 0x123456
Packed 32-bit value: 0x12345600
Unpacked 24-bit value: 0x123456

Transmitting Multiple 24-bit Integers in a 64-bit Container

Little-endian Packing and Unpacking


#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Pack two 24-bit integers into a 64-bit integer (little-endian)
uint64_t pack_two_24_to_64_little(uint32_t val1, uint32_t val2) {
    return ((uint64_t)(val1 & 0x00FFFFFF)) | (((uint64_t)(val2 & 0x00FFFFFF)) << 32);
}

// Unpack two 24-bit integers from a 64-bit integer (little-endian)
void unpack_two_24_from_64_little(uint64_t packed, uint32_t *val1, uint32_t *val2) {
    *val1 = (uint32_t)(packed & 0x00FFFFFF);
    *val2 = (uint32_t)((packed >> 32) & 0x00FFFFFF);
}

int main(void) {
    uint32_t val1_24 = 0x123456; // Example first 24-bit value
    uint32_t val2_24 = 0x789ABC; // Example second 24-bit value
    uint64_t packed_64 = pack_two_24_to_64_little(val1_24, val2_24);
    uint32_t unpacked_val1_24, unpacked_val2_24;
    unpack_two_24_from_64_little(packed_64, &unpacked_val1_24, &unpacked_val2_24);

    printf("Original 24-bit values: 0x%06" PRIx32 ", 0x%06" PRIx32 "\n", val1_24, val2_24);
    printf("Packed 64-bit value: 0x%016" PRIx64 "\n", packed_64);
    printf("Unpacked 24-bit values: 0x%06" PRIx32 ", 0x%06" PRIx32 "\n", unpacked_val1_24, unpacked_val2_24);

    return 0;
}

Output:

Original 24-bit values: 0x123456, 0x789abc
Packed 64-bit value: 0x00789abc00123456
Unpacked 24-bit values: 0x123456, 0x789abc

Big-endian Packing and Unpacking


#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Pack two 24-bit integers into a 64-bit integer (big-endian)
uint64_t pack_two_24_to_64_big(uint32_t val1, uint32_t val2) {
    return (((uint64_t)(val1 & 0x00FFFFFF)) << 40) | (((uint64_t)(val2 & 0x00FFFFFF)) << 16);
}

// Unpack two 24-bit integers from a 64-bit integer (big-endian)
void unpack_two_24_from_64_big(uint64_t packed, uint32_t *val1, uint32_t *val2) {
    *val1 = (uint32_t)((packed >> 40) & 0x00FFFFFF);
    *val2 = (uint32_t)((packed >> 16) & 0x00FFFFFF);
}

int main(void) {
    uint32_t val1_24 = 0x123456; // Example first 24-bit value
    uint32_t val2_24 = 0x789ABC; // Example second 24-bit value
    uint64_t packed_64 = pack_two_24_to_64_big(val1_24, val2_24);
    uint32_t unpacked_val1_24, unpacked_val2_24;
    unpack_two_24_from_64_big(packed_64, &unpacked_val1_24, &unpacked_val2_24);

    printf("Original 24-bit values: 0x%06" PRIx32 ", 0x%06" PRIx32 "\n", val1_24, val2_24);
    printf("Packed 64-bit value: 0x%016" PRIx64 "\n", packed_64);
    printf("Unpacked 24-bit values: 0x%06" PRIx32 ", 0x%06" PRIx32 "\n", unpacked_val1_24, unpacked_val2_24);

    return 0;
}

Output:

Original 24-bit values: 0x123456, 0x789abc
Packed 64-bit value: 0x123456789abc0000
Unpacked 24-bit values: 0x123456, 0x789abc

Summarized:
Little-endian: Little end first. Bytes are stored in the reverse of the order in which the number is written, with
the least significant byte first. LSB at the lowest address, MSB at the highest address.
Big-endian: Big end first. Bytes are stored in the same order in which the number is written, with the most
significant byte first. MSB at the lowest address, LSB at the highest address.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Article Written By: Yashwanth Naidu Tikkisetty
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
