Principle of Programming Language: Composite Data Types
Overview
1. Records
2. Arrays
3. Strings
4. Sets
5. Garbage Collection
6. Lists
Records
Allow related data of heterogeneous types to be
stored and manipulated together
Fields usually occupy contiguous memory locations
Possible holes for alignment reasons
Smart compilers may rearrange fields to minimize
holes (C compilers promise not to)
Different terms in different languages (record in Pascal, struct in C)
Examples
In Pascal:
type two_chars = packed array [1..2] of char;
type element = record
    name: two_chars;
    atomic_number: integer;
    atomic_weight: real;
    metallic: Boolean
end;
In C:
#include <stdbool.h>  /* provides bool before C23 */
struct element
{
    char name[2];
    int atomic_number;
    double atomic_weight;
    bool metallic;
};
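The holes mentioned above can be observed directly. The following is a minimal sketch, assuming a C99 or later compiler; the helper names element_field_bytes, element_padding_bytes, and print_element_layout are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same fields, in the same order, as the C example above. */
struct element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
};

/* Bytes the fields would need if stored back to back, with no holes. */
size_t element_field_bytes(void) {
    return sizeof(char[2]) + sizeof(int) + sizeof(double) + sizeof(bool);
}

/* Bytes of padding: holes between fields plus any trailing padding. */
size_t element_padding_bytes(void) {
    assert(sizeof(struct element) >= element_field_bytes());
    return sizeof(struct element) - element_field_bytes();
}

/* Print where each field starts; the numbers depend on the target ABI. */
void print_element_layout(void) {
    printf("name          starts at byte %zu\n", offsetof(struct element, name));
    printf("atomic_number starts at byte %zu\n", offsetof(struct element, atomic_number));
    printf("atomic_weight starts at byte %zu\n", offsetof(struct element, atomic_weight));
    printf("metallic      starts at byte %zu\n", offsetof(struct element, metallic));
    printf("record size %zu bytes, of which %zu are padding\n",
           sizeof(struct element), element_padding_bytes());
}
```

Calling print_element_layout() from main shows the layout chosen by the compiler. On a typical x86-64 Linux build the record occupies 24 bytes, 9 of them padding, but these figures are ABI-dependent. Because C keeps fields in declaration order, moving metallic next to name is something the programmer, not the compiler, would have to do to shrink the holes.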
Packed Records
Pascal allows the programmer to specify that a
record type (or an array, set, or file type) should be
packed:
type element = packed record
    name : two_chars;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean
end;
Variant Records
A variant record provides two or more alternative fields
or collections of fields, only one of which is valid at any
given time.
type element = record
    name : two_chars;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean;
    case naturally_occurring : Boolean of
        true : (
            source : string_ptr;
            prevalence : real;
        );
        false : (
            lifetime : real;
        )
end;
The value of the naturally_occurring field (shown here with a double
border) determines which of the interpretations of the remaining
space is valid. Type string_ptr is assumed to be represented by a
(four-byte) pointer to dynamically allocated storage.
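For comparison, the same variant record can be approximated in C with a tag field and a union. This is only a sketch under stated assumptions: string_ptr is represented as a const char *, and the names natural, synthetic, u, make_synthetic, and element_lifetime are hypothetical. Unlike Pascal's variant part, C does not tie the tag to the union, so the code checks the tag by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    bool   metallic;
    bool   naturally_occurring;   /* tag: selects which variant is valid */
    union {
        struct {                  /* valid when naturally_occurring is true */
            const char *source;   /* stands in for string_ptr */
            double      prevalence;
        } natural;
        struct {                  /* valid when naturally_occurring is false */
            double lifetime;
        } synthetic;
    } u;                          /* both variants share the same storage */
};

/* Build a non-naturally-occurring element, setting the tag and the
   matching variant together so they cannot disagree. */
struct element make_synthetic(const char *name, int number, double weight,
                              bool metallic, double lifetime) {
    struct element e;
    e.name[0] = name[0];
    e.name[1] = name[1];
    e.atomic_number = number;
    e.atomic_weight = weight;
    e.metallic = metallic;
    e.naturally_occurring = false;
    e.u.synthetic.lifetime = lifetime;
    return e;
}

/* Read the lifetime; the tag is checked first because C itself does not
   enforce which interpretation of the shared space is valid. */
double element_lifetime(const struct element *e) {
    assert(!e->naturally_occurring);
    return e->u.synthetic.lifetime;
}
```

The union is as large as its largest variant, which mirrors the idea that the remaining space of the record has several interpretations, only one of which is valid at a time.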
Arrays
Arrays are the most common and important
composite data types
Unlike records, which group related fields of
disparate types, arrays are usually homogeneous
Semantically, they can be thought of as a
mapping from an index type to a component or
element type
A slice or section is a rectangular portion of an
array.
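The mapping view and the notion of a slice can be sketched in C. The example assumes a two-dimensional array stored in row-major order inside a flat buffer; row_major_offset and copy_slice are hypothetical helper names, and C has no built-in slice operation, so the slice is copied out element by element.

```c
#include <assert.h>
#include <stddef.h>

/* Map an index pair (i, j) to a position in a flat row-major buffer
   whose rows are ncols elements wide. */
size_t row_major_offset(size_t ncols, size_t i, size_t j) {
    assert(j < ncols);
    return i * ncols + j;
}

/* Copy the rectangular slice rows [r0, r1) x columns [c0, c1) of src
   into dst, packed densely in row-major order.
   Returns the number of elements copied. */
size_t copy_slice(const int *src, size_t ncols,
                  size_t r0, size_t r1, size_t c0, size_t c1, int *dst) {
    assert(r0 <= r1 && c0 <= c1 && c1 <= ncols);
    size_t n = 0;
    for (size_t i = r0; i < r1; i++) {
        for (size_t j = c0; j < c1; j++) {
            dst[n++] = src[row_major_offset(ncols, i, j)];
        }
    }
    return n;
}
```

For a 3 x 4 array, copy_slice(a, 4, 1, 3, 1, 3, out) extracts the 2 x 2 block made of rows 1 to 2 and columns 1 to 2. The caller is responsible for making sure the row bounds stay inside the source buffer and that dst is large enough.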
Strings
In some languages, strings have special status, with
operations that are not available for arrays of other sorts.
It is easier to provide special features for strings than for
arrays in general, because strings are one-dimensional.
Manipulation of variable-length strings is fundamental to a
huge number of computer applications.
Particularly powerful string facilities are found in various
scripting languages such as Perl, Python and Ruby.
C, Pascal, and Ada require that the length of a string-valued
variable be bound no later than elaboration time, allowing the
variable to be implemented as a contiguous array of
characters in the current stack frame.
Lisp, Icon, ML, Java, and C# allow the length of a string-valued
variable to change over its lifetime, requiring that the variable
be implemented as a block or chain of blocks in the heap.
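The two implementation strategies can be contrasted in C. This is a minimal sketch, not a library API: struct dyn_string and its functions are hypothetical names, the initial capacity of 16 bytes and the doubling policy are arbitrary choices, and the heap variant uses a single block rather than a chain of blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Length bound at elaboration time: the storage is a fixed array of
   characters in the current stack frame. */
size_t fixed_string_capacity(void) {
    char name[8] = "carbon";
    (void)name;
    return sizeof name;   /* always 8, whatever the contents */
}

/* Length may change over the variable's lifetime: the characters live
   in a heap block that is reallocated as the string grows. */
struct dyn_string {
    char  *data;   /* NUL-terminated heap block */
    size_t len;    /* characters in use, excluding the NUL */
    size_t cap;    /* bytes allocated */
};

/* Start with an empty string; returns false if allocation fails. */
bool dyn_string_init(struct dyn_string *s) {
    s->cap = 16;
    s->len = 0;
    s->data = malloc(s->cap);
    if (s->data == NULL) return false;
    s->data[0] = '\0';
    return true;
}

/* Append text, growing the heap block when it is too small. */
bool dyn_string_append(struct dyn_string *s, const char *text) {
    size_t add = strlen(text);
    size_t need = s->len + add + 1;
    if (need > s->cap) {
        size_t new_cap = s->cap * 2;
        if (new_cap < need) new_cap = need;
        char *p = realloc(s->data, new_cap);
        if (p == NULL) return false;
        s->data = p;
        s->cap = new_cap;
    }
    memcpy(s->data + s->len, text, add + 1);  /* copies the NUL too */
    s->len += add;
    return true;
}

/* Release the heap block; C has no collector to do this for us. */
void dyn_string_free(struct dyn_string *s) {
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

The fixed array costs nothing to allocate but cannot hold more than its declared size, while the heap-backed string can grow at the price of allocation, copying, and explicit deallocation.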
Sets
A set is an unordered collection of an arbitrary
number of distinct values of a common type.
Sets were introduced by Pascal and are found in many more
recent languages as well.
There are many ways to implement sets, including arrays,
hash tables, and various forms of trees.
The most common implementation employs a bit
vector whose length (in bits) is the number of
distinct values of the base type.
Operations on bit-vector sets can make use of fast logical
instructions on most machines.
Union is bit-wise or; intersection is bit-wise and; difference
is bit-wise not of the second operand, followed by bit-wise and.
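The bit-vector representation can be sketched in a few lines of C. The example assumes a base type with at most 64 distinct values (0 through 63) so that a whole set fits in one uint64_t; the type name bitset and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per possible value of the base type 0..63. */
typedef uint64_t bitset;

/* Return s with value v added. */
bitset set_add(bitset s, unsigned v) {
    assert(v < 64);
    return s | (UINT64_C(1) << v);
}

/* Test whether v is a member of s. */
bool set_contains(bitset s, unsigned v) {
    assert(v < 64);
    return ((s >> v) & 1u) != 0;
}

/* Union is bit-wise or. */
bitset set_union(bitset a, bitset b) {
    return a | b;
}

/* Intersection is bit-wise and. */
bitset set_intersection(bitset a, bitset b) {
    return a & b;
}

/* Difference a - b is bit-wise not of b, followed by bit-wise and. */
bitset set_difference(bitset a, bitset b) {
    return a & ~b;
}
```

With a = {1, 3} and b = {3, 5}, the union is {1, 3, 5}, the intersection is {3}, and a - b is {1}. Each operation compiles to a single logical instruction on the whole word, which is why this representation is attractive when the base type is small.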
Garbage Collection
The language implementation notices when objects
are no longer useful and reclaims them automatically
More or less essential for functional languages
delete is a very imperative sort of operation
The ability to construct and return arbitrary objects from
functions requires unlimited extent, and hence heap
allocation
Lists
A list is defined recursively as either the
empty list or a pair consisting of an
object (which may be either a list or an
atom) and another (shorter) list
Ideally suited to programming in
functional and logic languages.
Several scripting languages, notably
Perl and Python, provide extensive list
support
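The recursive definition translates directly into a linked structure. The sketch below is in C and makes two simplifying assumptions: atoms are plain int values (so an element cannot itself be a list), and the empty list is represented by NULL. The names struct list, cons, list_length, and list_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A non-empty list is a pair: an object (head) and another, shorter
   list (tail). The empty list is represented by NULL. */
struct list {
    int          head;
    struct list *tail;
};

/* Build a new list whose first element is head and whose rest is tail.
   Aborts if memory cannot be allocated. */
struct list *cons(int head, struct list *tail) {
    struct list *cell = malloc(sizeof *cell);
    if (cell == NULL) abort();
    cell->head = head;
    cell->tail = tail;
    return cell;
}

/* Length follows the recursive definition: the empty list has length 0,
   and a pair has length one more than its tail. */
size_t list_length(const struct list *l) {
    return l == NULL ? 0 : 1 + list_length(l->tail);
}

/* Reclaim every cell by hand; a garbage-collected language would do
   this automatically. */
void list_free(struct list *l) {
    while (l != NULL) {
        struct list *next = l->tail;
        free(l);
        l = next;
    }
}
```

For example, cons(1, cons(2, cons(3, NULL))) builds the three-element list (1 2 3). The explicit list_free call illustrates why functional and logic languages, where lists are built and returned freely, rely on garbage collection.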