Topic 3: Names, Bindings & Scope: Marriette Katarahweire

Topic 3: Names, Bindings & Scope
Marriette Katarahweire
CSC 3112: Principles of Programming Languages 1/58

Outline
fundamental semantics of variables

nature of names and special words
variable attributes
binding and binding times
static and dynamic scoping
named constants and variable initialization
referencing environment of a statement

Fundamental Variable Semantics
imperative languages have abstractions of the von Neumann

computer architecture
primary components of the architecture: memory and
processor
variables: abstractions in a language for the memory cells of a
machine
variable characterized by a collection of properties: attributes
e.g. type
2 main issues considered when designing data types of a PL:
- scope
- lifetime of variables

Fundamental Variable Semantics
Functional programming languages:

allow expressions to be named
the named expressions appear like assignments to variables in
imperative languages
are different though, cannot be changed
are like named constants in imperative languages

Names
fundamental attribute of variables

associated with subprograms, formal parameters and other
program constructs
identifier used interchangebly with named
primary design issues for names
- case sensitivity: are names case sensitive
- special words of the PL: are they keywords or reserved
words?

Name Lengths
name: string of characters used to identify some entity in the

program
Fortran 95+ allows up to 31 characters in its names
C99 no length limitations on its internal names but only the
first 63 characters are significant
- external names (defined outside functions) must be handled
by the linker are restricted to 31 characters
Java, C#, Ada: no length limit on names and all characters
are significant
C++: no length limit on names but implementors sometimes
do

Name Forms
Names in most PLs have some form:

a letter followed by a string of letters, digits and underscore
names with underscore less popular now. Replaced with camel
notation in C-based languages e.g myProgram vs my program
use of underscore and mixed case in names is a programming
style not a language design issue
PHP: all variable names begin with $
Perl: special character at the beginning of the variable name
specifies its type i.e $, @ or %
Ruby: special character at beginning of variable name
- @: instance variable
- @@: class variable

Names: Case Sensitivity
many languages, especially C-based are case sensitive

uppercase and lowercase letters in names are distinct e.g in
Java, rose, Rose, ROSE are distinct
makes readability difficult
C: name case-sensitivity problems avoided by not allowing
uppercase letters in variable names

Special Words
used to make programs more readable by naming actions to

be performed
maybe reserved words or keywords
reserved words: cannot be redefined by programmers
keywords: can be redefined

Keywords
a word of a PL that is special only in certain contexts

Fortran: only widely used language whose special words are
keywords
Fortran example
- Integer Apple
- Integer = 4
Fortran compilers and program readers must distinguish
between names and special words by context

Reserved Words
special word of a PL that cannot be used as a name

better than keywords as a language design choice, ability to
redefine keywords can be confusing
example Fortran
- Integer Real
- Real Integer
language should not have a large number of reserved words,
increases user difficulty in making up names not reserved
words
Question: are names in Java packages reserved words?

Variables
program variable: an abstraction of a computer memory cell

or collection of cells
6-tuple of attributes: name, address, value, type, lifetime and
scope
names: most common names. most variables have names
address: machine memory address with which the variable is
associated
l-value of a variable: is the address of the variable
- what is required when the name of a variable appears in the
left side of an asssignment statement
possible to have multiple variables with the same address

Aliases
variables whose names can be used to access the same

memory location
hinders readability: allows a variable to have its value changed
by an assignment to a different variable
example total and sum, change in the value of total also
changes the value of sum and vice versa
reader must always remember that total and sum are different
names for the same memory cells
aliases created through unions in C and C++, with pointers
pointing to the same memory location, reference variables and
subprogram parameters
the time when a variable becomes associated with an address
is important in PLs

Variable Type
determines the range of values the variable can store

determines the set of operations that are defined for values of
the type
example: Java int
- value range: -2147483648 to 2147483647
- arithmetic operations: +, -, *, /, %

Variable Value
contents of the memory cells associated with the variables

its r-value: what is required when the name of the variable
appears on the right side of an assignment statement
to access the variable’s r-value, the l-value must be
determined first

Binding
A binding is an association between an attribute and an entity

like between a variable and its type or value or between an
operation and a symbol.
Binding time is the time at which a binding takes place.
Possible binding times:
Language design time: bind operator symbols to operations.
* is bound to the multiplication operation
Language implementation time: A data type such as int in
C is bound to a range of possible values
Compile time: bind a variable to a particular data type
Load time: bind a FORTRAN 77 variable to a memory cell
(or a C static variable)
Link time: a call to a library subprogram is bound to the
subprogram code at link time
Runtime: bind a nonstatic local variable to a memory cell

Binding Time Example
Java statement count = count + 5

The type of count is bound at compile time
The set of possible values of count is bound at compiler
design time
The meaning of the operator symbol + is bound at compile
time, when the types of its operands have been determined
The internal representation of the literal 5 is bound at
compiler design time
The value of count is bound at execution time with this
statement

Binding of Attributes to Variables
A binding is static if it first occurs before run time and

remains unchanged throughout program execution
A binding is dynamic if it first occurs during execution or can
change during execution of the program

Type Bindings
Before a variable can be referenced in a program, it must be bound

to a data type. Important aspects:
How is a type specified?
When does the binding take place?
- An explicit declaration is a program statement used for declaring
the types of variables
- An implicit declaration is a default mechanism for specifying
types of variables (the first appearance of the variable in the
program)
- Both explicit and implicit declarations create static bindings to
types

Example
FORTRAN, PL/I, BASIC, and Perl provide implicit

declarations
Advantage: writability
Disadvantage: reliability (less trouble with Perl) because they
prevent the compilation process from detecting some
typographical and programming errors
naming conventions used, the compiler or interpreter binds a
variable to a type based on the syntactic form of the variable’s
name
- In Perl, names that begin with $ is a scalar, if a name begins
with @ it is an array, if it begins with %, it is a hash
structure.the names @apple and %apple are unrelated
- In Fortran, vars that are accidentally left undeclared are
given default types and unexpected attributes, which could
cause subtle errors that, are difficult to diagnose

Example
In Fortran, an identifier that appears in a program that is not

explicitly declared is implicitly declared according to the
following convention: I, J, K, L, M, or N or their lowercase
versions is implicitly declared to be Integer type; otherwise, it
is implicitly declared as Real type
In C and C++, one must distinguish between declarations and
definitions
Declarations specify types and other attributes but do no
cause allocation of storage. Provides the type of a var defined
external to a function that is used in the function
Definitions specify attributes and cause storage allocation

Type Inference
kind of implicit type declarations that uses context

In the simpler case, the context is the type of the value
assigned to the variable in a declaration statement
Example: in C# a var declaration of a variable must include
an initial value, whose type is made the type of the variable
var sum = 0;
var total = 0.0;
var name = "Fred";
- The types of sum, total, and name are int, float, and string,
respectively. These are statically typed variables—their types
are fixed for the lifetime of the unit in which they are declared.

Type Inference: Example
Visual BASIC 9.0+, Go, and the functional languages ML,

Haskell, OCaml and F# also use type inferencing
fun circumf(r) = 3.14159 * r * r;
- function takes a real argument and produces a real result

The types are inferred from the type of the constant
fun times10(x) = 10 * x;
- The argument and functional value are inferred to be int

Dynamic Type Binding
Specified through an assignment statement

PLs JavaScript and PHP
e.g., JavaScript
list = [2, 4.33, 6, 8];
list = 17.3; // list would become a scalar variable
Advantage: flexibility (generic program units)

Disadvantages:
- High cost
- Less reliability
Type error detection by the compiler is difficult because any
var can be assigned a value of any type. Incorrect types of
right sides of assignments are not detected as errors; rather,
the type of the left side is simply changed to the incorrect
type.

High Cost
High cost (dynamic type checking and interpretation)

Every variable must have a descriptor associated with it to
maintain the current type.
the storage used for the value of a variable must be of a
varying size, because different type values require different
amounts of storage.
Dynamic type bindings must be implemented using pure
interpreter not compilers. It is not possible to create machine
code instructions whose operand types are not known at
compile time. Pure interpretation typically takes at least ten
times as long as to execute equivalent machine code.

Example:
i, x //Integer
y //floating-point array
i = x //what the user meant to type
i = y //what the user typed instead
No error is detected by the compiler or run-time system. i is

simply changed to a floating-point array type. The result is
erroneous
In a static type binding language, the compiler would detect
the error and the program would not get to execution

Storage Bindings and Lifetime
Allocation: process of taking a memory cell from a pool of

available memory to bind to a variable
Deallocation: process of placing a memory cell that has been
unbound from a variable back into the pool of available
memory
lifetime of a variable: the time during which the variable is
bound to a specific memory location.Begins when it is bound
to a specific cell and ends when it is unbound from that cell
4 categories of variables according to their lifetimes:
static variables
stack-dynamic variables
explicit heap-dynamic variables
implicit heap-dynamic variables

Static Variables
variables that are bound to memory cells before program

execution begins and remain bound to those same memory
cells until program execution terminates
several valuable applications in programming
Globally accessible variables are often used throughout the
execution of a program, thus making it necessary to have them
bound to the same storage during that execution
subprograms that are history sensitive. Such a subprogram
must have local static variables
Advantages: Efficiency
All addressing of static variables can be direct, other kinds of
variables often require indirect addressing, which is slower.
no run-time overhead is incurred for allocation and deallocation
of static variables, although this time is often negligible

Memory : Stack vs Heap
Stack: a special region of your computer’s memory that

stores temporary variables created by each function (including
the main() function).
The stack is a ”LIFO” (last in, first out) data structure,
managed and optimized by the CPU quite closely.
Every time a function declares a new variable, it is ”pushed”
onto the stack. Every time a function exits, all of the variables
pushed onto the stack by that function, are freed/deleted.
Once a stack variable is freed, that region of memory becomes
available for other stack variables.
Advantage: memory is managed for you. You don’t have to
allocate memory by hand, or free it once you don’t need it any
more.

Memory : Stack
Because the CPU organizes stack memory so efficiently,

reading from and writing to stack variables is very fast.
When a function exits, all of its variables are popped off of
the stack (and hence lost forever). Thus stack variables are
local in nature. This is related to a concept of variable/local
scope vs global variables.
A common bug in C programming is attempting to access a
variable that was created on the stack inside some function,
from a place in your program outside of that function (i.e.
after that function has exited).
there is a limit (varies with OS) on the size of variables that
can be stored on the stack. This is not the case for variables
allocated on the heap.

Heap: a region of your computer’s memory that is not

managed automatically for you, and is not as tightly managed
by the CPU.
It is a more free-floating region of memory (and is larger).
To allocate memory on the heap, you must use malloc() or
calloc(), which are built-in C functions.
Once you have allocated memory on the heap, you are
responsible for using free() to deallocate that memory once
you don’t need it any more. If you fail to do this, your
program will have what is known as a memory leak.
memory leak: memory on the heap will still be set aside (and
won’t be available to other processes).

Memory : Heap
Unlike the stack, the heap does not have size restrictions on
variable size (apart from the obvious physical limitations of
your computer).
Heap memory is slightly slower to be read from and written
to, because one has to use pointers to access memory on the
heap.
Unlike the stack, variables created on the heap are accessible
by any function, anywhere in your program. Heap variables
are essentially global in scope.

What are the pros and cons?

When should you use the heap, and when should you use the
stack?

Static Variables
Disadvantages (to storage):

reduced flexibility; a language that has only static variables
cannot support recursive subprograms
storage cannot be shared among variables. For example, if a
program has two subprograms, both of which require large
arrays. Furthermore, suppose that the two subprograms are
never active at the same time. If the arrays are static, they
cannot share the same storage for their arrays
C and C++: static specifier on a variable definition in a
function, making the variables it defines static.
static modifier in the declaration of a variable in a class
definition in C++, Java, and C#, implies that the variable is
a class variable, rather than an instance variable. Class
variables are created statically some time before the class is
first instantiated

Stack-dynamic variables
variables whose storage bindings are created when their

declaration statements are elaborated, but whose types are
statically bound
elaboration of a declaration refers to the storage allocation
and binding process indicated by the declaration, which takes
place when execution reaches the code to which the
declaration is attached. occurs during run time
Example: the variable declarations that appear at the
beginning of a Java method are elaborated when the method
is called and the variables defined by those declarations are
deallocated when the method completes its execution
are allocated from the run-time stack

Explicit Heap-Dynamic Variables
nameless (abstract) memory cells that are allocated and

deallocated by explicit run-time instructions written by the
programmer.
are allocated from and deallocated to the heap
can only be referenced through pointer or reference variables

Implicit Heap-Dynamic Variables
are bound to heap storage only when they are assigned values.
all their attributes are bound every time they are assigned

Question
What are the advantages and disadvantages of having:

implicit heap-dynamic variables
explicit heap-dynamic variables
stack dynamic variables
static variables

Scope
scope of a variable is the range of statements in which the

variable is visible.
a variable is visible in a statement if it can be referenced in
that statement
Local variables: a variable is local in a program unit or block
if it is declared there
Nonlocal variables: e.g. global variables. nonlocal variables
of a program unit or block are those that are visible within the
program unit or block but are not declared there

Static Scoping
binding names to nonlocal variables

the scope of a variable can be statically determined i.e prior to
execution
permits a human program reader (and a compiler) to
determine the type of every variable in the program simply by
examining its source code
2 categories of static-scoped languages:
those in which methods can be nested, which creates nested
static scopes
those in which methods cannot be nested
Ada, JavaScript, F#, and Python allow nested methods, but
the C-based languages do not
When the reader of a program finds a reference to a variable,
the attributes of the variable can be determined by finding the
statement in which it is declared (either explicitly or implicitly)

Nested Static Scopes
Suppose a reference is made to a variable x in method m1

The correct declaration is found by first searching the
declarations of method m1
If no declaration is found for the variable there, the search
continues in the declarations of the method that declared
method m1, which is called its static parent.
If a declaration of x is not found there, the search continues
to the next-larger enclosing unit (the unit that declared m1’s
parent), and so forth, until a declaration for x is found or the
largest unit’s declarations have been searched without success.
In that case, an undeclared variable error is reported.
The static parent of method m1, and its static parent, and so
forth up to and including the largest enclosing method, are
called the static ancestors of m1

Nested Static Scopes: Example
Given the JavaScript function big below, which x is referenced in

function sub2?
function big() {
function sub1() {
var x = 7;
sub2();
}
function sub2() {
var y = x;
}
var x = 3;
sub1();
}

Nested Static Scopes: Example
some variable declarations can be hidden from some other code

segments. e.g which x is referenced in sub1?
function big() {
function sub1() {
var x = 7;
sub2();
}
function sub2() {
var y = x;
}
var x = 3;
sub1();
}

Blocks
Many languages allow new static scopes to be defined in the

midst of executable code
allows a section of code to have its own local variables whose
scope is minimized.
the variables are stack dynamic, i.e their storage is allocated
when the section is entered and deallocated when the section
is exited. Such a section of code is called a block
C-based languages allow any compound statement (a
statement sequence surrounded by matched braces) to have
declarations and thus define a new scope. Such compound
statements are called blocks

Blocks: Example
if (list[i] < list[j]) {

int temp;
temp = list[i];
list[i] = list[j];
list[j] = temp;
}

Blocks: Example 2
void sub() {
int count;
. . .
while (. . .) {
int count;
count++;
. . .
}
. . .
}

Dynamic Scope
based on the calling sequence of subprograms not on their

spatial relationships with each other
scope can only be determined at runtime
Perl and Common LISP
what is the value of x in sub2 below?
function big() {
function sub1() {
var x = 7;
}
function sub2() {
var y = x;
var z = 3;
}
var x = 3;
}
Dynamic Scoping
One way the correct meaning of x can be determined during

execution is to begin the search with the local declarations.
This is also the way the process begins with static scoping, but
that is where the similarity between the two techniques ends.
When the search of local declarations fails, the declarations of
the dynamic parent, or calling function, are searched. If a
declaration for x is not found there, the search continues in
that function’s dynamic parent, and so forth, until a
declaration for x is found. If none is found in any dynamic
ancestor, it is a run-time error.

Dynamic Scope: Example
Given the JavaScript function big below, which x is referenced in

function sub2?
function big() {
function sub1() {
var x = 7;
sub2();
}
function sub2() {
var y = x;
}
var x = 3;
sub1();
}

Referencing Environments
referencing environment of a statement is a collection of

variables that are visible in the statement
In a static-scoped language, it is the local variables plus all of
the visible variables in all of the enclosing scopes.
The referencing environment of a statement is needed while
that statement is being compiled, so code and data structures
can be created to allow references to non-local vars in both
static and dynamic scoped languages.
a subprogram is active if its execution has begun but has not
yet terminated.
In a dynamic-scoped language, the referencing environment is
the local variables plus all visible variables in all active
subprograms

Referencing Environments Example 1
void sub1() {
int a, b;
. . . <------------ 1
} /* end of sub1 */
void sub2() {
int b, c;
. . . . <------------ 2
sub1();
} /* end of sub2 */
void main() {
int c, d;
. . . <------------ 3
sub2();
} /* end of main */

Consider the example program on the previous slide. Assume that

the only function calls are the following: main calls sub2, which
calls sub1. The referencing environments of the indicated program
points are as follows:
Point Referencing Environment

1 a and b of sub1, c of sub2, d of main,
(c of main and b of sub2 are hidden)
2 b and c of sub2, d of main, (c of main is hidden)
3 c and d of main

Consider the following Python skeletal program. What are the

referencing environments at points 1, 2, and 3?
g = 3; # A global
def sub1():
a = 5; # Creates a local
b = 7; # Creates another local
. . . <------------------------------ 1
def sub2():
global g; # Global g is now assignable here
c = 9; # Creates a new local
. . . <------------------------------ 2
def sub3():
nonlocal c: # Makes nonlocal c visible here
g = 11; # Creates a new local
. . . <------------------------------ 3

Point Referencing Environment

1 local a and b (of sub1), global g for reference,
but not for assignment
2 local c (of sub2), global g for both reference
and for assignment
3 nonlocal c (of sub2), local g (of sub3)

Named Constants
variables that are bound to values only once

its value can not be change by assignment or by an input
statement.

Variable Initialization
The binding of a variable to a value at the time it is bound to

storage is called initialization.
Initialization is often done on the declaration statement.
e.g., Java
int sum = 0;

Exercise
In your discussion groups, try out the Review Questions,

Problem Set and Programming Exercises at the end of
Chapter 5.
Write test programs in Java, Python, and C to determine the
scope of a variable declared in a for statement. Specifically,
the code must determine whether such a variable is visible
after the body of the for statement.

Topic 3: Names, Bindings & Scope: Marriette Katarahweire

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Topic 3: Names, Bindings & Scope: Marriette Katarahweire

Uploaded by

Copyright:

Available Formats

Topic 3: Names, Bindings & Scope

CSC 3112: Principles of Programming Languages 1/58

fundamental semantics of variables

CSC 3112: Principles of Programming Languages 2/58

imperative languages have abstractions of the von Neumann

CSC 3112: Principles of Programming Languages 3/58

Functional programming languages:

CSC 3112: Principles of Programming Languages 4/58

fundamental attribute of variables

CSC 3112: Principles of Programming Languages 5/58

name: string of characters used to identify some entity in the

CSC 3112: Principles of Programming Languages 6/58

Names in most PLs have some form:

CSC 3112: Principles of Programming Languages 7/58

many languages, especially C-based are case sensitive

CSC 3112: Principles of Programming Languages 8/58

used to make programs more readable by naming actions to

CSC 3112: Principles of Programming Languages 9/58

a word of a PL that is special only in certain contexts

CSC 3112: Principles of Programming Languages 10/58

special word of a PL that cannot be used as a name

CSC 3112: Principles of Programming Languages 11/58

program variable: an abstraction of a computer memory cell

CSC 3112: Principles of Programming Languages 12/58

variables whose names can be used to access the same

CSC 3112: Principles of Programming Languages 13/58

determines the range of values the variable can store

CSC 3112: Principles of Programming Languages 14/58

contents of the memory cells associated with the variables

CSC 3112: Principles of Programming Languages 15/58

A binding is an association between an attribute and an entity

CSC 3112: Principles of Programming Languages 16/58

Java statement count = count + 5

CSC 3112: Principles of Programming Languages 17/58

A binding is static if it first occurs before run time and

CSC 3112: Principles of Programming Languages 18/58

Before a variable can be referenced in a program, it must be bound

CSC 3112: Principles of Programming Languages 19/58

FORTRAN, PL/I, BASIC, and Perl provide implicit

CSC 3112: Principles of Programming Languages 20/58

In Fortran, an identifier that appears in a program that is not

CSC 3112: Principles of Programming Languages 21/58

kind of implicit type declarations that uses context

CSC 3112: Principles of Programming Languages 22/58

Visual BASIC 9.0+, Go, and the functional languages ML,

- function takes a real argument and produces a real result

- The argument and functional value are inferred to be int

CSC 3112: Principles of Programming Languages 23/58

Specified through an assignment statement

Advantage: flexibility (generic program units)

CSC 3112: Principles of Programming Languages 24/58

CSC 3112: Principles of Programming Languages 25/58

High cost (dynamic type checking and interpretation)

CSC 3112: Principles of Programming Languages 26/58

No error is detected by the compiler or run-time system. i is

CSC 3112: Principles of Programming Languages 27/58

Allocation: process of taking a memory cell from a pool of

CSC 3112: Principles of Programming Languages 28/58

variables that are bound to memory cells before program

CSC 3112: Principles of Programming Languages 29/58

Stack: a special region of your computer’s memory that

CSC 3112: Principles of Programming Languages 30/58