Principle of Programming Language: Composite Data Types



Overview
1. Records and Variant Records
2. Arrays
3. Strings
4. Sets
5. Pointers And Recursive Types
6. Lists

Records
Allow related data of heterogeneous types to be
stored and manipulated together
Stored in contiguous memory locations
Possible holes for alignment reasons
Smart compilers may rearrange fields to minimize
holes (C compilers promise not to)
Different terms in different languages:
Algol 68, C, C++, and Common Lisp: struct
Java, C++, C#: class
Pascal: record
ML, Python, Ruby: lists (no keyword for the declaration)

Examples
In Pascal:

type two_chars =
    packed array [1..2]
    of char;
type element = record
    name: two_chars;
    atomic_number: integer;
    atomic_weight: real;
    metallic: Boolean
end;

In C:

struct element
{
    char name[2];
    int atomic_number;
    double atomic_weight;
    bool metallic;
};

Memory Layout of Records


Likely layout in memory for objects of type element on a 32-bit
machine.
Alignment restrictions lead to the shaded holes.
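The holes can be observed from within C itself. The following is a minimal sketch, assuming the struct element declaration from the Examples slide; the helper name element_padding is hypothetical, and the actual offsets depend on the compiler and target ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same fields, in the same order, as the C example in this deck. */
struct element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
};

/* Hypothetical helper: number of bytes in the record that hold no field
   data (the "holes"), computed as total size minus the sum of field sizes. */
size_t element_padding(void)
{
    size_t payload = sizeof(char[2]) + sizeof(int)
                   + sizeof(double) + sizeof(bool);
    assert(sizeof(struct element) >= payload);
    return sizeof(struct element) - payload;
}

/* Byte offset at which the compiler placed atomic_number. Any gap between
   this offset and 2 (the end of name) is an alignment hole. */
size_t atomic_number_offset(void)
{
    return offsetof(struct element, atomic_number);
}
```

Printing element_padding() and the offsetof value of each field on a given machine shows where that compiler put the holes.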

Packed Records
Pascal allows the programmer to specify that a
record type (or an array, set, or file type) should be
packed:
type element = packed record
    name : two_chars;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean
end;

Memory Layout of Packed Records


Likely memory layout for packed records.
The atomic_number and atomic_weight fields are nonaligned, and can
only be read or written via multi-instruction sequences.
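Standard C has no packed keyword, but the cost of a nonaligned field can be sketched by laying a record out by hand in a byte buffer. The offsets and function names below are assumptions chosen for illustration; memcpy stands in for the multi-instruction access sequence a compiler would generate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed layout: name occupies bytes 0..1 and atomic_number
   starts immediately after it, at byte 2, which is not a multiple of 4. */
enum {
    NAME_OFFSET   = 0,
    NUMBER_OFFSET = 2,
    PACKED_SIZE   = 2 + sizeof(int32_t)
};

/* Store the field byte by byte; a direct int32_t store at an odd offset
   would be an unaligned access. */
void packed_set_number(unsigned char *rec, int32_t n)
{
    assert(rec != NULL);
    memcpy(rec + NUMBER_OFFSET, &n, sizeof n);
}

/* Load the field byte by byte into a properly aligned local variable. */
int32_t packed_get_number(const unsigned char *rec)
{
    int32_t n;
    assert(rec != NULL);
    memcpy(&n, rec + NUMBER_OFFSET, sizeof n);
    return n;
}
```

The record takes only PACKED_SIZE bytes, but every access to the number goes through a copy rather than a single aligned load or store.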

Memory Layout of Rearranged Records


Rearranging record fields to minimize holes.
By sorting fields according to the size of their alignment
constraint, a compiler can minimize the space devoted to
holes, while keeping the fields aligned.
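Because C compilers keep fields in declaration order, the same effect can be had in C by sorting the fields by hand. The sketch below compares the declared order with a sorted order; the type name element_sorted and the helper bytes_saved_by_sorting are hypothetical, and the number of bytes saved depends on the target ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Declaration order from the deck's C example. */
struct element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
};

/* Same fields, ordered from the largest alignment constraint to the
   smallest, so small fields share what would otherwise be a hole. */
struct element_sorted {
    double atomic_weight;
    int    atomic_number;
    char   name[2];
    bool   metallic;
};

/* Hypothetical helper: bytes saved by the reordering on this platform. */
long bytes_saved_by_sorting(void)
{
    /* The first field of a struct always sits at offset 0. */
    assert(offsetof(struct element_sorted, atomic_weight) == 0);
    return (long)sizeof(struct element) - (long)sizeof(struct element_sorted);
}
```

On common ABIs the sorted version is no larger than the original, and every field remains aligned.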

Variant Records
A variant record provides two or more alternative fields
or collections of fields, only one of which is valid at any
given time.
type element = record
    name : two_chars;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean;
    case naturally_occurring : Boolean of
        true : (
            source : string_ptr;
            prevalence : real;
        );
        false : (
            lifetime : real;
        )
end;

Memory Layout of Variants


Likely memory layouts for element variants.
The value of the naturally_occurring field (shown here with a double
border) determines which of the interpretations of the remaining
space is valid. Type string_ptr is assumed to be represented by a
(four-byte) pointer to dynamically allocated storage.
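The closest C counterpart to a Pascal variant record is a struct containing a tag field and a union. The sketch below mirrors the Pascal declaration; the member names u, natural, and synthetic are hypothetical, and string_ptr is approximated by a const char pointer. C does not check the tag automatically, so the accessors check it with assert.

```c
#include <assert.h>
#include <stdbool.h>

struct element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
    bool   naturally_occurring;      /* tag: selects the valid variant */
    union {                          /* both variants share this space */
        struct {
            const char *source;
            double      prevalence;
        } natural;                   /* valid when the tag is true */
        struct {
            double lifetime;
        } synthetic;                 /* valid when the tag is false */
    } u;
};

/* Read lifetime only if the tag says the synthetic variant is active. */
double element_lifetime(const struct element *e)
{
    assert(!e->naturally_occurring);
    return e->u.synthetic.lifetime;
}

/* Read prevalence only if the tag says the natural variant is active. */
double element_prevalence(const struct element *e)
{
    assert(e->naturally_occurring);
    return e->u.natural.prevalence;
}
```

Because the two variants overlap in memory, writing one and then reading the other through the wrong accessor would reinterpret the same bytes, which is why the tag check matters.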

Arrays
Arrays are the most common and important
composite data types
Unlike records, which group related fields of
disparate types, arrays are usually homogeneous
Semantically, they can be thought of as a
mapping from an index type to a component or
element type
A slice or section is a rectangular portion of an
array.

Memory Layout of Arrays


Arrays in most language implementations are stored in
contiguous locations in memory
Like records, arrays may contain holes due to
alignment requirements
Some languages (e.g., Pascal) allow the programmer
to specify that an array be packed
For multidimensional arrays, there are two layouts:
row-major order and column-major order
In row-major order, consecutive locations in memory hold
elements that differ by one in the final subscript (except at
the ends of rows).
In column-major order, consecutive locations hold elements
that differ by one in the initial subscript

Row- and Column-major Layout
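The two layouts differ only in how a pair of subscripts is turned into a linear position. The sketch below, with hypothetical function names and an arbitrary 3 x 4 shape, computes both mappings and compares them with the layout C uses for its built-in two-dimensional arrays, which is row-major.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };   /* arbitrary shape for illustration */

/* Row-major: elements of one row are adjacent, so the final subscript
   varies fastest. */
size_t row_major_index(size_t i, size_t j)
{
    assert(i < ROWS && j < COLS);
    return i * COLS + j;
}

/* Column-major: elements of one column are adjacent, so the initial
   subscript varies fastest. */
size_t col_major_index(size_t i, size_t j)
{
    assert(i < ROWS && j < COLS);
    return j * ROWS + i;
}

/* Position of a[i][j], in elements, from the start of a built-in C array. */
size_t c_builtin_index(size_t i, size_t j)
{
    static int a[ROWS][COLS];
    assert(i < ROWS && j < COLS);
    return (size_t)((char *)&a[i][j] - (char *)a) / sizeof(int);
}
```

For element (1, 2) of this 3 x 4 array, the row-major position is 1 * 4 + 2 = 6 and the column-major position is 2 * 3 + 1 = 7.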

Strings
In some languages, strings have special status, with
operations that are not available for arrays of other sorts.
It is easier to provide special features for strings than for
arrays in general, because strings are one-dimensional.
Manipulation of variable-length strings is fundamental to a
huge number of computer applications.
Particularly powerful string facilities are found in various
scripting languages such as Perl, Python and Ruby.
C, Pascal, and Ada require that the length of a string-valued
variable be bound no later than elaboration time, allowing the
variable to be implemented as a contiguous array of
characters in the current stack frame.
Lisp, Icon, ML, Java, C# allow the length of a string-valued
variable to change over its lifetime, requiring that the variable
be implemented as a block or chain of blocks in the heap.
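The second approach, a string whose length changes over its lifetime, can be sketched in C by managing a heap block by hand. The type dyn_string and the function dyn_append are hypothetical names; languages with built-in dynamic strings do the equivalent work inside their runtime.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A string whose storage lives in the heap and grows on demand.
   Start with { NULL, 0 } for the empty string. */
struct dyn_string {
    char  *data;   /* heap block, NUL-terminated once non-empty */
    size_t len;    /* current length, excluding the terminator */
};

/* Append suffix to s, reallocating the heap block to fit.
   Returns 0 on success and -1 if memory could not be obtained. */
int dyn_append(struct dyn_string *s, const char *suffix)
{
    size_t add;
    char *grown;

    assert(s != NULL && suffix != NULL);
    add = strlen(suffix);
    grown = realloc(s->data, s->len + add + 1);
    if (grown == NULL)
        return -1;                              /* old block still valid */
    memcpy(grown + s->len, suffix, add + 1);    /* copies the terminator */
    s->data = grown;
    s->len += add;
    return 0;
}
```

By contrast, a declaration such as char buf[16] fixes the capacity when the variable is elaborated, which is what lets it live in the stack frame.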

Sets
A set is an unordered collection of an arbitrary
number of distinct values of a common type.
Sets were introduced by Pascal and are found in many more
recent languages as well.
There are many ways to implement sets, including arrays,
hash tables, and various forms of trees.
The most common implementation employs a bit
vector whose length (in bits) is the number of
distinct values of the base type.
Operations on bit-vector sets can make use of fast logical
instructions on most machines.
Union is bit-wise or; intersection is bit-wise and; difference
is bit-wise not, followed by bit-wise and.
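The bit-vector operations listed above can be sketched as follows, assuming a base type of the integers 0 through 31 so that a whole set fits in one 32-bit word. The type name set32 and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per possible member: bit v is set iff v is in the set. */
typedef uint32_t set32;

/* The set containing only v. */
set32 set_singleton(unsigned v)
{
    assert(v < 32);
    return (set32)1 << v;
}

/* Union is bit-wise or. */
set32 set_union(set32 a, set32 b)        { return a | b; }

/* Intersection is bit-wise and. */
set32 set_intersection(set32 a, set32 b) { return a & b; }

/* Difference a \ b is bit-wise not of b, followed by bit-wise and. */
set32 set_difference(set32 a, set32 b)   { return a & ~b; }

/* Membership test: is bit v set? */
bool set_contains(set32 s, unsigned v)
{
    assert(v < 32);
    return ((s >> v) & 1u) != 0;
}
```

Each operation compiles to one or two logical instructions regardless of how many members the sets contain.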

Pointers And Recursive Types


A recursive type is one whose objects may contain
one or more references to other objects of the type.
Pointers serve two purposes:
Efficient access to elaborated objects (as in C).
Dynamic creation of linked data structures.
In languages like C, Pascal, or Ada, which use a
value model of variables, recursive types require
the notion of a pointer.
In some languages (e.g., Pascal, Ada 83, and
Modula-3), pointers are restricted to point only to
objects in the heap.
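A binary search tree is a common example of a recursive type under a value model: a node cannot contain another node directly, so it holds pointers to nodes instead. The sketch below uses hypothetical names and, for brevity, treats allocation failure as fatal.

```c
#include <assert.h>
#include <stdlib.h>

/* Recursive type: each node refers to two more objects of the same type.
   The pointers are what make the self-reference possible. */
struct tree_node {
    int               key;
    struct tree_node *left;
    struct tree_node *right;
};

/* Insert key into the tree rooted at root, ignoring duplicates.
   Returns the (possibly new) root. Nodes are created in the heap. */
struct tree_node *tree_insert(struct tree_node *root, int key)
{
    if (root == NULL) {
        struct tree_node *n = malloc(sizeof *n);
        assert(n != NULL);   /* sketch only: no recovery from exhaustion */
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = tree_insert(root->left, key);
    else if (key > root->key)
        root->right = tree_insert(root->right, key);
    return root;
}

/* Number of nodes in the tree. */
size_t tree_size(const struct tree_node *root)
{
    return root == NULL ? 0
                        : 1 + tree_size(root->left) + tree_size(root->right);
}

/* Explicit reclamation, needed because C has no garbage collector. */
void tree_free(struct tree_node *root)
{
    if (root == NULL)
        return;
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```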

Garbage Collection
The language implementation notices when objects
are no longer useful and reclaims them automatically
More or less essential for functional languages
delete is a very imperative sort of operation
The ability to construct and return arbitrary objects from
functions requires unlimited extent and hence heap
allocation to accommodate it
Popular for imperative languages as well, including Java, C#,
and all the major scripting languages.
A typical tradeoff between convenience and safety
on the one hand and performance on the other.

Lists
A list is defined recursively as either the
empty list or a pair consisting of an
object (which may be either a list or an
atom) and another (shorter) list.
Ideally suited to programming in
functional and logic languages.
Several scripting languages, notably
Perl and Python, provide extensive list
support.
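The recursive definition translates directly into a pair type, often called a cons cell. The sketch below is a simplification: it restricts the object in each pair to an int atom, whereas the general definition also allows a nested list. The names cell, cons, list_length, list_sum, and list_free are hypothetical, and NULL stands for the empty list.

```c
#include <assert.h>
#include <stdlib.h>

/* A list is either NULL (the empty list) or a pair of an object
   (here restricted to an int atom) and another, shorter list. */
struct cell {
    int          head;
    struct cell *tail;
};

/* Build the pair (head, tail) in the heap. */
struct cell *cons(int head, struct cell *tail)
{
    struct cell *c = malloc(sizeof *c);
    assert(c != NULL);   /* sketch only: no recovery from exhaustion */
    c->head = head;
    c->tail = tail;
    return c;
}

/* Functions over lists follow the shape of the definition:
   one case for the empty list, one for a pair. */
size_t list_length(const struct cell *l)
{
    return l == NULL ? 0 : 1 + list_length(l->tail);
}

int list_sum(const struct cell *l)
{
    return l == NULL ? 0 : l->head + list_sum(l->tail);
}

/* Release every cell of the list. */
void list_free(struct cell *l)
{
    while (l != NULL) {
        struct cell *next = l->tail;
        free(l);
        l = next;
    }
}
```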
