
Matthew Allen Phillips

2 June 2016
TCSS 371 Machine Organization
UWT Institute of Technology

Information Representations in Computers

Computers pervade modern life, in applications ranging from household appliances to nuclear reactor management. They assist us with information technology: any information that can be quantified can be stored, accessed, and displayed. How do computers store that data? The fact that data in computers can be sparsely encoded is evidenced by many concepts, such as image compression, and is pursued empirically in the many attempts at efficient instruction set architectures. To save the most space in computers, and thus the most time, power, and money, all data types should be represented as efficiently as possible around (or as an influence on) the computer's architecture. Throughout the computer's memory hierarchy, hardware-centered data storage design is prominent (Harder 1959). We will discuss common information storage traits among modern technology and include a brief history of evolving data standards.

Number theory plays a crucial role in elementary computer data. Most modern computer systems do not represent numeric values using the decimal system. Instead, they typically use binary numbers, either unsigned or signed in two's complement form. Binary digits, or bits, are assigned a value of either 0 or 1. For this discussion, we will consider binary number systems in classical computers. In the computer's most basic storage elements (circuits that latch a single bit, such as the D latch or RS latch in binary systems), information is gated within temporary memory. Alternatively, a system whose components have more than two states, such as a ternary system, allows more data to be stored in less space, but requires more precision and complexity (Cahn 1953). If we consider ternary storage units momentarily, we observe that individual logical units would require much more power. Ternary storage units could use a balanced digit set (-1, 0, +1) or an unbalanced one (0, +1, +2). Nonetheless, quaternary systems (and higher-order systems) are still found. Such systems must remain compatible with binary simply because the rest of the computing world uses binary. Systems with three states are rarely implemented because conversion is expensive: base conversion requires repeated division with remainders. Instead, binary and higher powers of two dominate mainstream computing. Finally, at the lowest level, modern binary computers are electronic machines. For any volatile memory unit, there is either the absence (0.0 V to 0.5 V) or the presence (2.4 V to 2.9 V) of charge, conceptually representing a binary zero or one, respectively. We see similar binary representations of electrons in wires and transistors. For instance, the Arithmetic Logic Unit of the computer uses the flow of electrons to perform bit operations.
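The density advantage of higher-radix systems can be made concrete with a short sketch: a ternary digit carries log2(3) ≈ 1.585 bits of information, so the same value needs fewer trits than bits. (The function below is illustrative, not from any particular architecture.)

```python
import math

def digits_needed(n: int, base: int) -> int:
    """Digits required to write the nonnegative integer n in the given base."""
    return 1 if n == 0 else math.floor(math.log(n, base)) + 1

# The same value needs fewer ternary digits than binary digits.
print(digits_needed(1_000_000, 2))  # 20 bits
print(digits_needed(1_000_000, 3))  # 13 trits
```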

What kind of data can bits represent? In classical computers, all integer data types are represented with two's complement binary numbers. Text, images, sound, video, and most other forms of data can be converted into a bit string. The notion of number complements has long existed in mathematics, dating back to decimal adding machines and calculators. The computational implementation of two's complement binary storage was first suggested in John von Neumann's famous First Draft. The system was introduced to computers in 1949 with the von Neumann-inspired EDSAC, an early electronic machine, and has been used in most subsequent computers. Along with providing valid byte representations of both positive and negative integers, two's complement gives every bit pattern a decimal equivalent through ordinary base-2 place values, with the most significant bit weighted negatively.
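A small sketch makes the encoding concrete: masking a signed integer to a fixed width yields its two's complement bit pattern, because Python's `&` operates on the number's infinite-precision binary form.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as a two's complement bit string."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking keeps the low `bits` bits of the binary representation.
    return format(value & ((1 << bits) - 1), f"0{bits}b")

print(twos_complement(5))   # 00000101
print(twos_complement(-5))  # 11111011
print(twos_complement(-1))  # 11111111
```

Note that -1 maps to all ones: the sign bit contributes -128 and the remaining bits contribute +127.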


By 1951, floating-point storage and arithmetic had been implemented on a newer model of the EDSAC to represent fractional decimal values. It wasn't until 1985, however, that floating-point computation was standardized under IEEE 754. The standard defines arithmetic formats, interchange formats, rounding rules, operations, and exception handling. Under IEEE 754, single precision floating-point numbers are represented in memory with one sign bit, eight unsigned bits for the exponent, and 23 unsigned bits for the fraction. Double precision floating-point numbers are formatted similarly, but with 11 exponent bits and 52 fraction bits. Floating-point numbers are not only stored differently from integers; they are also operated on by a dedicated math coprocessor, the Floating-Point Unit (FPU), designed to perform arithmetic on floats.
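The single precision layout described above can be inspected directly by reinterpreting a float's bytes as an unsigned integer and slicing out the three fields:

```python
import struct

def float_fields(x: float) -> tuple[int, int, int]:
    """Split an IEEE 754 single precision float into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31            # 1 bit
    exponent = (bits >> 23) & 0xFF       # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF   # 23 bits
    return sign, exponent, fraction

print(float_fields(1.0))   # (0, 127, 0) — exponent field holds 0 + bias 127
print(float_fields(-2.5))  # (1, 128, 2097152) — -1.25 × 2^1
```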

In the early 1960s, a need emerged to create a standard for general symbol representation. The American Standards Association (ASA) began work on an encoding standard based on telegraph code. The system, now known as the American Standard Code for Information Interchange (ASCII), contains 128 unique character mappings, including numbers, basic punctuation symbols, control codes, whitespace, and both upper and lower case English letters. Some control characters, such as those designed for teletypewriters, are now obsolete on modern computers. In the computer's memory, ASCII characters are also represented by unsigned binary numbers. Interestingly, early ASCII encodings used 7-bit codes, allowing decimal values to range from 0 to 127. However, 8-bit encoding became ubiquitous in the following decades. Today, ASCII characters are stored in 8-bit encodings, even though the character data fits inside 7 bits.
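The mapping from characters to unsigned binary numbers is easy to observe: `ord` returns a character's code, and every ASCII code fits in 7 bits even when printed in an 8-bit field.

```python
# Each ASCII character maps to an unsigned integer in the range 0-127,
# stored in memory as a binary number (shown here padded to 8 bits).
for ch in ("A", "a", " "):
    print(ch, ord(ch), format(ord(ch), "08b"))
# A 65 01000001
# a 97 01100001
#   32 00100000

assert all(ord(ch) < 128 for ch in "Hello, ASCII!")  # fits in 7 bits
```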

How are bits stored on secondary storage media? Typically, memory is categorized into volatile and non-volatile memory (Brooks et al. 1959): does the storage persist even when the memory is not supplied with electric power? We also classify the mutability of information: can the bits be altered or reassigned? In CD-ROMs, for example, bits are stored as changes in reflectivity. These discs contain microscopic pits and lands representing binary data. When the computer reads the disc with a laser, changes in the reflected light communicate bits to the disc reader. Because the pits are shallower than the wavelength of the 780 nm laser light, the shift in reflection allows binary data to be read. (It is not difficult to see the complex implications of non-binary CD-ROM systems here.) CD-ROM discs are a great example of immutable storage: they are user-friendly, but cannot be written to or erased by the user; thus, they are read-only. Alternatively, consider another common disk storage, the mutable hard disk drive (HDD). Users may write massive amounts of data to an HDD. The HDD contains magnetic regions, each composed of many magnetic grains. Each region forms a dipole that produces a magnetic field. Using bit streams from memory, the HDD head can read bits from the desired location, or apply an electric current to flip the dipole of a magnetic region. Both HDD and CD-ROM storage units are non-volatile memory: humans may pick them up, replace them, and even keep them stowed away for long periods of time.

To modern software application developers, the carefully designed frameworks behind computer data storage would not be helpful without a link to data in high-level languages. In Java, C#, and other high-level programming languages, programmers are supplied with a set of optimized value-type variables. (Usually, these include integer types, characters, booleans, floating-point numbers, and several more, depending on which types the language is designed to optimize.) Reference types, or pointers to dynamically allocated memory, are stored on the stack. In high-level languages, complex data types (including objects) are stored in the heap of an application. References to objects are stored on the stack as memory addresses, or integer memory pointers; this representation directly maps stack references to allocated memory. Some high-level languages allow the programmer to mark reference-type variables with access restrictions and reassignment restrictions; both modifiers are stored on the stack with the reference. The elements of sequential reference-type objects, such as strings, arrays, and lists, are stored consecutively in memory. Examining language implementations shows that designers choose different ways to store these sequences; one of the most common formats is to store, in order, the address of the object in the (garbage-collected) heap, the element type, and the length of the array.
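The value/reference distinction can be sketched in Python. (This is an analogy rather than an exact match: Python stores all values as heap objects, but the aliasing behavior of its lists mirrors that of reference types in Java and C#.)

```python
import copy

# Value-like behavior: rebinding one name does not affect the other.
a = 5
b = a
b += 1
print(a, b)  # 5 6

# Reference behavior: both names point at the same heap object.
xs = [1, 2, 3]
ys = xs             # copies the reference, not the list
ys.append(4)
print(xs)           # [1, 2, 3, 4] — the shared heap object was mutated

zs = copy.copy(xs)  # an explicit shallow copy allocates a new heap object
zs.append(5)
print(xs, zs)       # [1, 2, 3, 4] [1, 2, 3, 4, 5]
```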

Additionally, high-level language virtual machines introduce variation in how objects are stored. Objects are represented in computer storage by state and metadata. Most compiled objects include a class pointer: a pointer to the instance's class information, which describes the object's type. When the instance is referenced in a program, its template information can be accessed via the class pointer. Flags are also stored in the metadata to describe the state of the object, including a hash code (if applicable). The shape of the object (e.g., whether or not the object is an array) is always recorded in the metadata.
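Python's object model offers a concrete analogue of this metadata (a sketch; header layouts differ across virtual machines): every heap object carries a pointer to its class, which `type()` follows, and sequence objects record their own length.

```python
class Point:
    """A minimal heap-allocated object with two fields of state."""
    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y

p = Point(1, 2)
print(type(p) is Point)    # True — the class pointer identifies the type
print(type(p).__name__)    # Point — template info reached via that pointer
print(hash(p) == hash(p))  # True — the object's hash is stable

# "Shape" metadata: a list knows it is a sequence and stores its length.
print(isinstance([1, 2], list), len([1, 2]))  # True 2
```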

When one imagines data inside of computers, one may imagine the program storage of the newest Microsoft Office suite, or the heavy compression of a local 3-D video. Yet a standard-issue PC is only one small fraction of existing computational devices. For maximum performance efficiency in any device, hardware must be constructed with data types in mind. Both hardware and software developers must also consider the implications of their designs regarding data representations. Data formats should reflect the user stories, use cases, and thus the applicable models of the device. For instance, consider mobile devices. Smartphones usually store contacts. A contact may consist of a first name and a last name (each as an ASCII string) and a phone number (as a long integer, or perhaps a PhoneNumber object with multiple integers). Platform developers for smartphones treat contacts as one anticipated data type stored on the device; this simplifies software built on the platform. By providing a simple view of stored object data, mobile platforms can increase the speed of both synchronizing and using that data.
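The contact record described above might be sketched as follows. (The class and field names are hypothetical, chosen for illustration; they do not correspond to any real platform's API.)

```python
from dataclasses import dataclass

@dataclass
class PhoneNumber:
    """Hypothetical decomposition of a phone number into integers."""
    country_code: int
    area_code: int
    subscriber: int

@dataclass
class Contact:
    """Hypothetical contact record: two ASCII strings and a number object."""
    first_name: str
    last_name: str
    number: PhoneNumber

c = Contact("Ada", "Lovelace", PhoneNumber(44, 20, 79460000))
print(c.first_name, c.number.country_code)  # Ada 44
```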

Sources

[1] Brooks, F. P., Blaauw, G. A., & Buchholz, W. (1959, June). Processing Data in Bits and Pieces [Scholarly project]. In IEEE. Retrieved June 1, 2017, from http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=5219512

[2] Harder, E. L. (1959, May). Computers and Automation [Scholarly project]. In IEEE. Retrieved May 30, 2017, from http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6432562

[3] Cahn, L. (1953, December). Accuracy of an Analog Computer [Scholarly project]. In IEEE. Retrieved May 29, 2017, from http://ieeexplore.ieee.org/document/5407689/
