
Topic 3: Names, Bindings & Scope

Marriette Katarahweire

CSC 3112: Principles of Programming Languages 1/58


Outline

fundamental semantics of variables
nature of names and special words
variable attributes
binding and binding times
static and dynamic scoping
named constants and variable initialization
referencing environment of a statement



Fundamental Variable Semantics

imperative languages are abstractions of the von Neumann computer architecture
primary components of the architecture: memory and processor
variables: abstractions in a language for the memory cells of a machine
a variable is characterized by a collection of properties (attributes), e.g. type
2 main issues considered when designing the data types of a PL:
- scope
- lifetime of variables



Fundamental Variable Semantics

Functional programming languages allow expressions to be named
- the named expressions look like assignments to variables in imperative languages
- they are different, though: once an expression is named, its value cannot be changed
- they behave like named constants in imperative languages



Names

a name is a fundamental attribute of a variable
names are also associated with subprograms, formal parameters and other program constructs
the term identifier is used interchangeably with name
primary design issues for names:
- case sensitivity: are names case sensitive?
- special words of the PL: are they keywords or reserved words?



Name Lengths

name: a string of characters used to identify some entity in the program
Fortran 95+ allows up to 31 characters in its names
C99: no length limitation on internal names, but only the first 63 characters are significant
- external names (defined outside functions, so they must be handled by the linker) are restricted to 31 characters
Java, C#, Ada: no length limit on names and all characters are significant
C++: no length limit on names, but implementors sometimes impose one



Name Forms

Names in most PLs have the same form:
a letter followed by a string of letters, digits and underscores
names with underscores are less popular now; in C-based languages they have largely been replaced by camel notation, e.g. myProgram vs my_program
use of underscores and mixed case in names is a programming style issue, not a language design issue
PHP: all variable names begin with $
Perl: a special character at the beginning of a variable's name specifies its type, i.e. $, @ or %
Ruby: special characters at the beginning of a variable name indicate its kind
- @: instance variable
- @@: class variable
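
The common name form on this slide (a letter followed by letters, digits and underscores) can be written as a regular expression. The following is only a sketch in Python: the pattern NAME_FORM and the helper is_simple_name are hypothetical names made up for this illustration, and real languages differ in detail (many also allow a leading underscore or non-ASCII letters).

```python
import re

# Hypothetical pattern for the form described on this slide:
# one ASCII letter, then any number of letters, digits, or underscores.
NAME_FORM = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_simple_name(text):
    """Return True if text matches the letter-first name form."""
    return NAME_FORM.fullmatch(text) is not None

# Classify a few candidate names, including a PHP/Perl-style one.
checks = {name: is_simple_name(name)
          for name in ["myProgram", "my_program", "x1", "1x", "$total"]}
```

Under this pattern "1x" and "$total" are rejected, which is why languages such as PHP and Perl need their own rules for the leading special character.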



Names: Case Sensitivity

many languages, especially the C-based ones, are case sensitive
uppercase and lowercase letters in names are distinct, e.g. in Java, rose, Rose and ROSE are distinct names
this can hurt readability: names that look very similar denote different entities
C: case-sensitivity problems are avoided by the convention that variable names do not include uppercase letters
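
As a small illustration, Python is also case sensitive, so the three spellings from the Java example are three independent names. The values assigned here are arbitrary and exist only to make the difference visible.

```python
# Three distinct names that differ only in letter case.
rose = 1
Rose = 2
ROSE = 3

# Each name keeps its own binding.
distinct_values = (rose, Rose, ROSE)
```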



Special Words

used to make programs more readable by naming actions to be performed
may be reserved words or keywords
reserved words: cannot be redefined by programmers
keywords: can be redefined



Keywords

keyword: a word of a PL that is special only in certain contexts
Fortran: the only widely used language whose special words are keywords
Fortran example:
- Integer Apple (Integer is a special word that declares the variable Apple)
- Integer = 4 (Integer is an ordinary variable name)
Fortran compilers and program readers must distinguish between names and special words by context



Reserved Words

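reserved word: a special word of a PL that cannot be used as a name
better than keywords as a language design choice, because the ability to redefine keywords can be confusing
example: both statements below are legal in Fortran, where Integer and Real are keywords
- Integer Real (declares a variable named Real of type Integer)
- Real Integer (declares a variable named Integer of type Real)
a language should not have a large number of reserved words; too many make it difficult for users to make up names that are not reserved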
Question: are names in Java packages reserved words?
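
The difference between a reserved word and a merely predefined name can be checked in Python, used here only as an illustration. The sketch relies on the standard keyword module; the function shadow_builtin is a hypothetical helper written for this example.

```python
import keyword

# "if" is a reserved word in Python: it can never be used as a name.
if_is_reserved = keyword.iskeyword("if")

# "print" is only a predefined (built-in) name, not a reserved word.
print_is_reserved = keyword.iskeyword("print")

def shadow_builtin():
    # Legal: inside this function the name print now refers to a string.
    print = "just a string now"
    return print

shadowed = shadow_builtin()
```

Rebinding a predefined name like this is allowed but carries the same readability risk the slide describes for redefinable keywords.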



Variables

program variable: an abstraction of a computer memory cell or collection of cells
characterized by a 6-tuple of attributes: name, address, value, type, lifetime and scope
name: most variables have names
address: the machine memory address with which the variable is associated
l-value of a variable: the address of the variable
- it is what is required when the name of a variable appears on the left side of an assignment statement
it is possible to have multiple variables with the same address



Aliases

aliases: variables whose names can be used to access the same memory location
they hinder readability: they allow a variable to have its value changed by an assignment to a different variable
example: if total and sum are aliases, any change to the value of total also changes the value of sum and vice versa
the reader must always remember that total and sum are different names for the same memory cell
aliases can be created through unions in C and C++, pointers that point to the same memory location, reference variables and subprogram parameters
the time when a variable becomes associated with an address is important in PLs
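
The total/sum example can be reproduced with reference variables. This is a minimal sketch in Python, where two names bound to the same list object behave as aliases; the name sum_ (with a trailing underscore, to avoid hiding Python's built-in sum) and the list contents are assumptions made for the illustration.

```python
# Two names bound to the same list object act as aliases.
total = [10, 20, 30]
sum_ = total                    # no copy is made: both names share one object

sum_[0] = 99                    # a change made through one name ...
seen_through_total = total[0]   # ... is visible through the other

same_object = sum_ is total     # True when both names refer to one object
```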



Variable Type

determines the range of values the variable can store
determines the set of operations that are defined for values of the type
example: Java int
- value range: -2147483648 to 2147483647
- arithmetic operations: +, -, *, /, %



Variable Value

value: the contents of the memory cell or cells associated with the variable
also called the variable's r-value: it is what is required when the name of the variable appears on the right side of an assignment statement
to access the variable's r-value, the l-value must be determined first



Binding

A binding is an association between an attribute and an entity, such as between a variable and its type or value, or between an operation and a symbol.
Binding time is the time at which a binding takes place.
Possible binding times:
- Language design time: bind operator symbols to operations, e.g. * is bound to the multiplication operation
- Language implementation time: a data type such as int in C is bound to a range of possible values
- Compile time: bind a variable to a particular data type
- Link time: bind a call to a library subprogram to the subprogram code
- Load time: bind a FORTRAN 77 variable (or a C static variable) to a memory cell
- Run time: bind a nonstatic local variable to a memory cell


Binding Time Example

Consider the Java statement count = count + 5;
The type of count is bound at compile time
The set of possible values of count is bound at compiler
design time
The meaning of the operator symbol + is bound at compile
time, when the types of its operands have been determined
The internal representation of the literal 5 is bound at
compiler design time
The value of count is bound at execution time with this
statement



Binding of Attributes to Variables

A binding is static if it first occurs before run time and remains unchanged throughout program execution
A binding is dynamic if it first occurs during execution or can change during execution of the program



Type Bindings

Before a variable can be referenced in a program, it must be bound to a data type. Important aspects:
How is the type specified?
When does the binding take place?
- An explicit declaration is a program statement used for declaring the types of variables
- An implicit declaration is a means of associating variables with types through default conventions rather than declaration statements; the first appearance of a variable name in the program constitutes its implicit declaration
- Both explicit and implicit declarations create static bindings to types



Example

FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
Advantage: writability
Disadvantage: reliability (less of a problem in Perl), because implicit declarations prevent the compilation process from detecting some typographical and programming errors
where naming conventions are used, the compiler or interpreter binds a variable to a type based on the syntactic form of the variable's name
- In Perl, a name that begins with $ is a scalar, a name that begins with @ is an array, and a name that begins with % is a hash structure; the names @apple and %apple are unrelated
- In Fortran, variables that are accidentally left undeclared are given default types and possibly unexpected attributes, which can cause subtle errors that are difficult to diagnose



Example

In Fortran, an identifier that appears in a program without being explicitly declared is implicitly declared according to the following convention: if it begins with one of the letters I, J, K, L, M, or N (or their lowercase versions), it is implicitly declared to be of Integer type; otherwise, it is implicitly declared to be of Real type
In C and C++, one must distinguish between declarations and definitions
- Declarations specify types and other attributes but do not cause allocation of storage; a declaration provides the type of a variable that is defined external to a function but used in the function
- Definitions specify attributes and cause storage allocation
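
The Fortran first-letter convention described on this slide is simple enough to express as a function. The sketch below is written in Python purely for illustration; fortran_implicit_type is a hypothetical helper, and it models only the default I-N rule, not any IMPLICIT statement a real program might contain.

```python
def fortran_implicit_type(identifier):
    """Return the default Fortran type for an undeclared identifier.

    Sketch of the I-N first-letter convention only.
    """
    first = identifier[0].upper()   # the rule ignores letter case
    return "Integer" if first in "IJKLMN" else "Real"

# A few sample identifiers and the type each would receive by default.
examples = {name: fortran_implicit_type(name)
            for name in ["count", "index", "n", "total", "Mass"]}
```

Note how count becomes Real while index becomes Integer, which is exactly the kind of surprise that makes misspelled or undeclared names hard to diagnose.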



Type Inference

a kind of implicit type declaration that uses context
In the simplest case, the context is the type of the value assigned to the variable in a declaration statement
Example: in C#, a var declaration of a variable must include an initial value, whose type becomes the type of the variable
var sum = 0;
var total = 0.0;
var name = "Fred";

- The types of sum, total, and name are int, double, and string, respectively. These are statically typed variables: their types are fixed for the lifetime of the unit in which they are declared.



Type Inference: Example

Visual BASIC 9.0+, Go, and the functional languages ML, Haskell, OCaml and F# also use type inferencing
Example in ML:
fun circumf(r) = 3.14159 * r * r;

- the function takes a real argument and produces a real result; the types are inferred from the type of the constant 3.14159
fun times10(x) = 10 * x;

- the argument and functional value are inferred to be int



Dynamic Type Binding

The type is specified through an assignment statement rather than a declaration
Used in languages such as JavaScript and PHP
e.g., JavaScript
list = [2, 4.33, 6, 8];
list = 17.3; // list is now a scalar variable

Advantage: flexibility (generic program units)
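
Python also binds types dynamically, so the JavaScript example can be mirrored there and the change of type observed directly. This is only a sketch: the variable is called values here (an assumption, chosen to avoid hiding Python's built-in list), and type(...).__name__ is used just to report the type currently bound.

```python
# The same name is bound first to a list, then to a float.
values = [2, 4.33, 6, 8]
type_before = type(values).__name__   # type bound by the first assignment

values = 17.3                         # rebinding changes the type as well
type_after = type(values).__name__
```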



Dynamic Type Binding

Disadvantages:
- High cost
- Less reliability
Type error detection by the compiler is difficult because any variable can be assigned a value of any type. Incorrect types on the right sides of assignments are not detected as errors; rather, the type of the left side is simply changed to the incorrect type.



High Cost

High cost (dynamic type checking and interpretation)
Every variable must have a run-time descriptor associated with it to maintain its current type.
The storage used for the value of a variable must be of varying size, because values of different types require different amounts of storage.
Languages with dynamic type binding are usually implemented with pure interpreters rather than compilers, because a compiler cannot generate machine code instructions for operands whose types are not known at compile time. Pure interpretation typically takes at least ten times as long as executing equivalent machine code.



Dynamic Type Binding

Example:

i, x //Integer
y //floating-point array
i = x //what the user meant to type
i = y //what the user typed instead

No error is detected by the compiler or run-time system. i is simply changed to a floating-point array type, and later uses of i produce erroneous results
In a language with static type binding, the compiler would detect the error and the program would not get to execution
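
The typing mistake above can be reproduced in any dynamically typed language. The sketch below uses Python, with a list standing in for the floating-point array; the concrete values and the follow-up multiplication are assumptions added to show how the error surfaces later rather than at the assignment.

```python
i = 3
x = 5
y = [1.5, 2.5]     # stands in for the floating-point array

i = y              # the typo: no error is reported here
doubled = i * 2    # meant as integer arithmetic; list repetition happens instead
```

The program keeps running with a wrong value in doubled, which illustrates why such errors are hard to trace back to the faulty assignment.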



Storage Bindings and Lifetime

Allocation: the process of taking a memory cell from a pool of available memory and binding it to a variable
Deallocation: the process of placing a memory cell that has been unbound from a variable back into the pool of available memory
Lifetime of a variable: the time during which the variable is bound to a specific memory location. It begins when the variable is bound to a specific cell and ends when it is unbound from that cell
4 categories of variables according to their lifetimes:
- static variables
- stack-dynamic variables
- explicit heap-dynamic variables
- implicit heap-dynamic variables



Static Variables

variables that are bound to memory cells before program execution begins and remain bound to those same memory cells until program execution terminates
they have several valuable applications in programming:
- globally accessible variables are often used throughout the execution of a program, which makes it necessary to have them bound to the same storage during that execution
- history-sensitive subprograms must have local static variables in order to retain values between calls
Advantage: efficiency
- all addressing of static variables can be direct; other kinds of variables often require indirect addressing, which is slower
- no run-time overhead is incurred for allocation and deallocation of static variables, although this time is often negligible
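
A history-sensitive subprogram is one whose result depends on earlier calls. Python has no static specifier, so the sketch below only approximates the idea with a function attribute that outlives each call; in C the counter would be a static local variable. The function next_ticket and its count attribute are hypothetical names for this illustration.

```python
def next_ticket():
    """Return 1, 2, 3, ... on successive calls."""
    next_ticket.count += 1      # this state survives between calls
    return next_ticket.count

next_ticket.count = 0           # initialized once, before any call

# Three calls give three different results because the history is kept.
tickets = [next_ticket(), next_ticket(), next_ticket()]
```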



Memory : Stack vs Heap

Stack: a region of memory that stores the temporary variables created by each function (including the main() function).
The stack is a "LIFO" (last in, first out) data structure, managed automatically by compiler-generated code with direct hardware support.
Every time a function declares a new variable, it is "pushed" onto the stack. Every time a function exits, all of the variables pushed onto the stack by that function are freed.
Once a stack variable is freed, that region of memory becomes available for other stack variables.
Advantage: memory is managed for you. You don't have to allocate memory by hand, or free it once you no longer need it.



Memory : Stack

Because stack allocation and deallocation only adjust the stack pointer, creating and releasing stack variables is very fast.
When a function exits, all of its variables are popped off the stack (and hence lost). Thus stack variables are local in nature. This is related to the concept of local scope vs global variables.
A common bug in C programming is attempting to access a variable that was created on the stack inside some function from a place in the program outside of that function (i.e. after that function has exited), for example through a returned pointer.
There is a limit (which varies with the OS) on the total size of the variables that can be stored on the stack. This is not the case for variables allocated on the heap.



Memory : Stack vs Heap

Heap: a region of memory that is not managed automatically for you.
It is a more free-floating region of memory (and is usually larger than the stack).
To allocate memory on the heap in C, you must use malloc() or calloc(), which are standard library functions declared in <stdlib.h>.
Once you have allocated memory on the heap, you are responsible for using free() to deallocate that memory once you no longer need it. If you fail to do this, your program will have what is known as a memory leak.
memory leak: heap memory that is no longer needed remains set aside, so it cannot be reused by the program (or by other processes) while the program runs.



Memory : Heap

Unlike the stack, the heap does not restrict the size of a variable (apart from the physical limitations of your computer).
Heap memory is slightly slower to read from and write to, because it must be accessed through pointers.
Unlike stack variables, variables created on the heap are not tied to the function that created them: they can be accessed by any function that holds a pointer to them, and they live until they are explicitly freed.



Memory : Stack vs Heap

What are the pros and cons?
When should you use the heap, and when should you use the stack?



Static Variables

Disadvantages:
- reduced flexibility: a language that has only static variables cannot support recursive subprograms
- storage cannot be shared among variables. For example, suppose a program has two subprograms, both of which require large arrays, and that the two subprograms are never active at the same time. If the arrays are static, they cannot share the same storage
C and C++: the static specifier on a variable definition in a function makes the variable static
the static modifier in the declaration of a variable in a class definition in C++, Java, and C# implies that the variable is a class variable rather than an instance variable. Class variables are created statically some time before the class is first instantiated



Stack-dynamic variables

variables whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound
elaboration of a declaration: the storage allocation and binding process indicated by the declaration, which takes place when execution reaches the code to which the declaration is attached; it therefore occurs at run time
Example: the variable declarations that appear at the beginning of a Java method are elaborated when the method is called, and the variables defined by those declarations are deallocated when the method completes its execution
stack-dynamic variables are allocated from the run-time stack
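
Per-call storage is what makes recursion possible: every activation of a subprogram needs its own copy of the locals. The sketch below shows this in Python, whose local variables are likewise created afresh on each call, although the interpreter's internal storage layout differs from a hardware stack. The function depth_trace and its parameters are hypothetical names for this illustration.

```python
def depth_trace(n, trace):
    """Record each activation's own copy of the local variable level."""
    level = n                       # a fresh binding for this call only
    if n > 0:
        depth_trace(n - 1, trace)   # deeper calls cannot disturb our level
    trace.append(level)             # still this activation's value
    return trace

# The innermost activation finishes first, so values appear in that order.
order = depth_trace(3, [])
```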



Explicit Heap-Dynamic Variables

nameless (abstract) memory cells that are allocated and deallocated by explicit run-time instructions written by the programmer
they are allocated from and deallocated to the heap
they can only be referenced through pointer or reference variables



Implicit Heap-Dynamic Variables

are bound to heap storage only when they are assigned values.
all their attributes are bound every time they are assigned



Question

What are the advantages and disadvantages of having:
- implicit heap-dynamic variables
- explicit heap-dynamic variables
- stack-dynamic variables
- static variables



Scope

the scope of a variable is the range of statements in which the variable is visible
a variable is visible in a statement if it can be referenced in that statement
Local variables: a variable is local to a program unit or block if it is declared there
Nonlocal variables: the nonlocal variables of a program unit or block are those that are visible within it but are not declared there, e.g. global variables



Static Scoping

a method of binding names to nonlocal variables
the scope of a variable can be statically determined, i.e. prior to execution
permits a human program reader (and a compiler) to determine the type of every variable in the program simply by examining its source code
2 categories of static-scoped languages:
- those in which subprograms can be nested, which creates nested static scopes
- those in which subprograms cannot be nested
Ada, JavaScript, F#, and Python allow nested subprograms, but the C-based languages do not
When the reader of a program finds a reference to a variable, the attributes of the variable can be determined by finding the statement in which it is declared (either explicitly or implicitly)



Nested Static Scopes

Suppose a reference is made to a variable x in method m1
The correct declaration is found by first searching the declarations of method m1
If no declaration is found for the variable there, the search
continues in the declarations of the method that declared
method m1, which is called its static parent.
If a declaration of x is not found there, the search continues
to the next-larger enclosing unit (the unit that declared m1’s
parent), and so forth, until a declaration for x is found or the
largest unit’s declarations have been searched without success.
In that case, an undeclared variable error is reported.
The static parent of method m1, and its static parent, and so
forth up to and including the largest enclosing method, are
called the static ancestors of m1



Nested Static Scopes: Example

Given the JavaScript function big below, which x is referenced in function sub2?

function big() {
  function sub1() {
    var x = 7;
    sub2();
  }

  function sub2() {
    var y = x;
  }

  var x = 3;
  sub1();
}
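
The same question can be answered by running an equivalent program. Python is statically scoped and allows nested functions, so the sketch below mirrors big, sub1 and sub2; the return statements and the result name y_under_static_scoping are additions made only so the outcome can be observed.

```python
def big():
    def sub1():
        x = 7          # local to sub1; hides big's x inside sub1 only
        return sub2()

    def sub2():
        y = x          # x resolves to big's x, since big is sub2's static parent
        return y

    x = 3
    return sub1()

y_under_static_scoping = big()
```

The search from sub2 goes to its static parent big, never to its caller sub1, so the x declared in sub1 plays no part in the result.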



Nested Static Scopes: Example

some variable declarations can be hidden from other code segments, e.g. which x is referenced in sub1?

function big() {
  function sub1() {
    var x = 7;
    sub2();
  }

  function sub2() {
    var y = x;
  }

  var x = 3;
  sub1();
}



Blocks

Many languages allow new static scopes to be defined in the midst of executable code
this allows a section of code to have its own local variables whose scope is minimized
the variables are typically stack dynamic, i.e. their storage is allocated when the section is entered and deallocated when the section is exited. Such a section of code is called a block
C-based languages allow any compound statement (a statement sequence surrounded by matched braces) to have declarations and thus define a new scope. Such compound statements are called blocks



Blocks: Example

if (list[i] < list[j]) {
  int temp;
  temp = list[i];
  list[i] = list[j];
  list[j] = temp;
}



Blocks: Example 2

void sub() {
  int count;
  . . .
  while (. . .) {
    int count;
    count++;
    . . .
  }
  . . .
}



Dynamic Scope

based on the calling sequence of subprograms, not on their spatial relationships with each other
scope can only be determined at run time
Perl and Common LISP allow variables to be declared with dynamic scope
what is the value of x in sub2 below?
function big() {
  function sub1() {
    var x = 7;
  }
  function sub2() {
    var y = x;
    var z = 3;
  }
  var x = 3;
}


Dynamic Scoping

One way the correct meaning of x can be determined during execution is to begin the search with the local declarations.
This is also the way the process begins with static scoping, but
that is where the similarity between the two techniques ends.
When the search of local declarations fails, the declarations of
the dynamic parent, or calling function, are searched. If a
declaration for x is not found there, the search continues in
that function’s dynamic parent, and so forth, until a
declaration for x is found. If none is found in any dynamic
ancestor, it is a run-time error.



Dynamic Scope: Example

Given the function big below, written in JavaScript syntax, and assuming dynamic scoping, which x is referenced in function sub2?

function big() {
  function sub1() {
    var x = 7;
    sub2();
  }

  function sub2() {
    var y = x;
  }

  var x = 3;
  sub1();
}
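
Python is statically scoped, so dynamic scoping has to be simulated to see the answer. The sketch below keeps an explicit stack with one dictionary of locals per active subprogram and searches it from the most recent activation outwards, following the dynamic-parent search described for dynamic scoping. The names call_stack and lookup are hypothetical helpers for this illustration, not part of any library.

```python
call_stack = []   # one dict of local variables per active subprogram

def lookup(name):
    """Search the most recent activation first, then its callers."""
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)   # no dynamic ancestor declares the name

def sub2():
    call_stack.append({})                # sub2 has no x of its own
    call_stack[-1]["y"] = lookup("x")    # found in the caller's frame
    result = call_stack[-1]["y"]
    call_stack.pop()
    return result

def sub1():
    call_stack.append({"x": 7})          # sub1 declares x = 7
    result = sub2()
    call_stack.pop()
    return result

def big():
    call_stack.append({"x": 3})          # big declares x = 3
    result = sub1()
    call_stack.pop()
    return result

y_under_dynamic_scoping = big()
```

With the calling sequence big, sub1, sub2, the search reaches sub1's x first, so y receives 7. Had big called sub2 directly, the same lookup would have found big's x instead.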



Referencing Environments

the referencing environment of a statement is the collection of all variables that are visible in the statement
In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes.
The referencing environment of a statement is needed while that statement is being compiled, so that code and data structures can be created to allow references to nonlocal variables in both static-scoped and dynamic-scoped languages.
a subprogram is active if its execution has begun but has not yet terminated.
In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms



Referencing Environments Example 1

void sub1() {
  int a, b;
  . . . <------------ 1
} /* end of sub1 */
void sub2() {
  int b, c;
  . . . . <------------ 2
  sub1();
} /* end of sub2 */
void main() {
  int c, d;
  . . . <------------ 3
  sub2();
} /* end of main */



Referencing Environments Example 1

Consider the example program on the previous slide and assume dynamic scoping. Assume that the only function calls are the following: main calls sub2, which calls sub1. The referencing environments of the indicated program points are as follows:

Point   Referencing Environment
1       a and b of sub1, c of sub2, d of main (c of main and b of sub2 are hidden)
2       b and c of sub2, d of main (c of main is hidden)
3       c and d of main



Referencing Environments Example 2

Consider the following Python skeletal program. What are the referencing environments at points 1, 2, and 3?

g = 3; # A global
def sub1():
    a = 5; # Creates a local
    b = 7; # Creates another local
    . . . <------------------------------ 1
def sub2():
    global g; # Global g is now assignable here
    c = 9; # Creates a new local
    . . . <------------------------------ 2
    def sub3():
        nonlocal c; # Makes nonlocal c visible here
        g = 11; # Creates a new local
        . . . <------------------------------ 3



Referencing Environments Example 2

Point   Referencing Environment
1       local a and b (of sub1), global g for reference, but not for assignment
2       local c (of sub2), global g for both reference and assignment
3       nonlocal c (of sub2), local g (of sub3)
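
These referencing environments can be checked with a runnable variant of the skeletal program. In the sketch below the elided parts are replaced by small assignments and return statements; these additions (g = g + 1, c = c + 1, the return values and the three result names) are assumptions made for the illustration and are not part of the original skeleton.

```python
g = 3                      # a global

def sub1():
    a = 5                  # local to sub1
    b = 7                  # local to sub1
    return a + b + g       # point 1: g can be read here, but not assigned

def sub2():
    global g               # g is now assignable here
    g = g + 1              # illustrative assignment to the global
    c = 9                  # local to sub2

    def sub3():
        nonlocal c         # c of sub2 is now assignable here
        c = c + 1          # illustrative assignment to sub2's c
        g = 11             # creates a local g of sub3; the global is untouched
        return g

    local_g = sub3()
    return c, local_g      # c was changed by sub3 through nonlocal

at_point_1 = sub1()
at_point_2 = sub2()
global_g_afterwards = g
```

After the run, the global g reflects only the assignment made in sub2, while the g assigned in sub3 stayed local to sub3, matching rows 2 and 3 of the table.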



Named Constants

a named constant is a variable that is bound to a value only once
its value cannot be changed by assignment or by an input statement



Variable Initialization

The binding of a variable to a value at the time it is bound to storage is called initialization.
Initialization is often done in the declaration statement.
e.g., Java
int sum = 0;



Exercise

In your discussion groups, try out the Review Questions, Problem Set and Programming Exercises at the end of Chapter 5.
Write test programs in Java, Python, and C to determine the
scope of a variable declared in a for statement. Specifically,
the code must determine whether such a variable is visible
after the body of the for statement.
