
4. Language Fundamentals-I
(Character set, keywords, identifiers, constants, variables)

“First, master the fundamentals.”
–Larry Bird

“I long to accomplish great and noble task, but it is my chief duty to accomplish small tasks as
if they were great and noble.”
–Helen Keller

“Success is neither magical nor mysterious. Success is the natural consequence of consistently applying
the basic fundamentals.”
–Jim Rohn

Chapter Outline
4.1. Introduction
4.2. Character Set
4.3. Tokens
     4.3.1. Keywords
     4.3.2. Identifiers
     4.3.3. Literals
     4.3.4. Data types
     4.3.5. Variables
     4.3.6. Type qualifiers
4.4. Conclusion

4.1. Introduction

Steps in learning the English language:
Alphabets → Words → Sentences → Paragraphs → Stories

Steps in learning the C language:
Character set → Tokens → Instructions → Functions → Programs

• Character set: A character is any letter, digit, white space or special symbol that is used to
  represent information. A character set is a collection of characters.
• Token: A token is the smallest individual unit of a program.
• Instruction: An instruction is a statement given to the computer to perform a specific operation.
• Function: A function is a collection of instructions that performs a particular task.
• Program: A program is a well-organized collection of instructions that is used to communicate with
  the computer system to accomplish a desired objective.

4.2. Character Set
A program is written as a collection of text lines, and each line is made up of characters drawn from a
fixed collection. This collection is called the character set. A C program can be written using the
following character set:

Alphabets: abcdefghijklmnopqrstuvwxyz
           ABCDEFGHIJKLMNOPQRSTUVWXYZ
Digits: 0123456789
Special symbols:

Symbol  Meaning                   Symbol  Meaning                   Symbol  Meaning
{       Opening curly brace       '       Apostrophe                ^       Caret or exclusive OR
}       Closing curly brace       "       Double quotation mark     &       Ampersand
(       Opening parenthesis       ~       Negation or tilde         *       Asterisk
)       Closing parenthesis       !       Exclamation               +       Plus
[       Opening square bracket    #       Pound, number or hash     -       Minus or hyphen
]       Closing square bracket    %       Percent (mod)             /       Forward slash
.       Dot or period             ;       Semi-colon                \       Backward slash
?       Question mark             :       Colon                     >       Greater than
|       Pipe                      ,       Comma                     <       Less than
_       Underscore                =       Assigns to

White space characters: blank space, horizontal tab, new line, carriage return, vertical tab, form feed

4.3. Tokens
A token is the smallest individual unit (or element) of a program. The tokens used in a program are:
• Keywords
• Identifiers
• Literals (constants)
• Variables
• Operators

4.3.1. Keywords
Each language comes with a set of words. As these words play a key role in developing a program, they
are often termed keywords.

Keywords are the built-in words whose meanings are already known to the compiler.

Keywords are pre-defined words. The meaning of each keyword is fixed by the language designers, and a C
compiler recognizes each keyword and applies that meaning wherever it appears. Keywords are also called
reserved words. Each keyword has its own purpose and should be used only for that purpose. There are 3
types of keywords:
Keywords

Type-related keywords    Storage-related keywords    Control-flow related keywords
int                      auto                        if
float                    static                      else
char                     extern                      switch
double                   register                    default
long                                                 case
short                                                while
signed                                               do
unsigned                                             for
void                                                 break
struct                                               continue
union                                                goto
typedef                                              return
sizeof
enum
const
volatile

It is important to note that all keywords are written in lowercase. Some compilers may also include
some or all of the following keywords:
ada  asm  entry  far  fortran  huge  near  pascal

QUESTIONS
1. Which of the following are keywords in C?
   a) int  b) register  c) switch  d) boolean
2. Which of the following is a keyword in C?
   a) Int  b) int  c) Integer  d) integer
3. Which of the following is not a keyword in C?
   a) volatile  b) enum  c) constant  d) sizeof

4.3.2. Identifiers
Just as pre-defined names (i.e., keywords) are needed to develop a C program, user-defined names are
also needed. These user-defined names are called identifiers.

Identifiers are the names given to various program elements such as variables, constants,
arrays, functions and pointers.

These names are given by the user as and when needed. Meaningful identifiers make a program easy to
understand. To define an identifier, one should follow these rules:

Rule #1: An identifier should not be a keyword.
Ex: salary, name, GOTO (valid; GOTO is not the keyword goto, because case matters)
    int, goto (invalid)

Rule #2: The first character of an identifier must be a letter or an underscore (_). The subsequent
characters may be letters, digits or underscores.
Ex: pradeep K123 Ravi_varma _abc (valid)
    1Raj (invalid)

Rule #3: No special symbol other than the underscore (_) may be used. No spaces are allowed in an
identifier.
Ex: gross_sal Rect_area (valid)
    Gross salary s.i. profit&loss (invalid)

Rule #4: Upper and lower case letters in an identifier are distinct (or different).
Ex: The names amount, Amount, aMOUnt and AMOUNT are four different identifiers.

Rule #5: An identifier can be arbitrarily long. Some implementations of C recognize only the first eight
characters, though most compilers recognize more (typically, 31 characters).

QUESTIONS
1) Which of the following are valid and invalid identifiers? Give reasons if not valid.
   1) record1  2) $tax  3) name  4) name-and-address  5) 1record
   6) name and address  7) name_and_address  8) 123-45-6789
   9) return  10) file_3  11) _master  12) _123  13) Ravi&Bro.
2) Assume that your C compiler recognizes only the first 8 characters of an identifier.
   Which of the following are valid and invalid identifiers?
   1) Master_minds  2) char  3) s.i.  4) SimpleInterest  5) string  6) char1
   7) identifier_1  8) ANSWER  9) answer  10) number#1

4.3.3. Constants (or) Literals


A program needs some input to process. While instructions are being processed, the input should be
stored in memory. The input that is stored in memory is called a literal.

A constant or literal is a value that is supplied to a program, either written directly in the
program text or input by the user. The value may be a character, a string, an integer or a
floating-point number.

There are two types of constants: numeric constants and non-numeric constants. As the names imply, a
numeric constant is a collection of digits and a non-numeric constant is a collection of characters
from the character set.
Constants
    Numeric constants
        Integer constants
        Real constants
    Non-numeric constants
        Character constants
        String constants

Note: The word “constant” in C has two meanings:


1. The value that remains unchanged (or fixed) during the execution of program.
2. The value that is being input to a program.

4.3.3.1. Integer constants
An integer constant (optionally preceded by a sign) is taken to be:
1. Decimal integer constant: if it consists of the digits 0-9 and does not begin with 0.
   E.g., 98334 -3456 are valid decimal integer constants.
2. Octal integer constant: if it begins with 0 (digit 0) and contains only the digits 0-7 (8 and 9 are
   not allowed).
   E.g., 0534 035 are valid octal integer constants.
3. Hexa-decimal integer constant: if it begins with 0x (or) 0X and the remaining characters are the
   digits 0-9 or the letters A-F (a-f).
   E.g., 0xFACE 0X124c are valid hexa-decimal integer constants.

An integer constant may be suffixed by the letter u (or) U to specify that it is unsigned (it cannot be
negative). It may also be suffixed by the letter l or L to specify that it is long (big integer). In the
absence of any suffix, the data type of an integer constant is derived from its value.
Examples of integer constants:

Integer constant    Description
5000U               Unsigned decimal integer constant
123456789L          Long decimal integer constant
0235353l            Long octal integer constant
0x23FA3dU           Unsigned hexa-decimal integer constant
0XFFFFFFFUL         Unsigned long hexa-decimal integer constant
0243UL              Unsigned long octal integer constant
123245353UL         Unsigned long decimal integer constant

4.3.3.2. Floating-point constants or Real constants

A floating-point constant can be expressed in either of these two notations:

Decimal notation: In this notation, the floating-point number is represented as a whole number followed
by a decimal point and a fractional part. It is possible to omit the digits before or after the decimal
point, but not both. A floating-point constant can include one of the suffixes f, F or l, L.

Examples of floating-point constants: (Decimal notation)

Real constant    Description
2.3456           Double-precision floating-point constant (by default)
2.3456F          Single-precision floating-point constant
2.3456L          Long double precision floating-point constant

(Precision = digits after the decimal point.)
Exponential notation: Exponential notation is useful in representing numbers whose magnitudes are
very large or very small. The exponential notation consists of a mantissa, e or E and an exponent. The
mantissa is either an integer or a real number expressed in decimal notation. A mantissa can be preceded
by a sign (+ or -). The exponent is an integer preceded by an optional sign.

Examples of floating-point constants: (Exponential notation)

Number           In powers of 10     Exponential notation
53876            5.3876 × 10^4       5.3876e4
0.00000000004    4 × 10^-11          4E-11
100000           1 × 10^5            1e+5
0.007321         7.321 × 10^-3       7.321E-3
32000            3.2 × 10^4          3.2E4
0.0000005        0.5 × 10^-6         0.5E-6

Note: It should be understood that integer constants are exact quantities, whereas floating-point
constants are approximations. For example, the floating-point constant 0.1 cannot be stored exactly in
binary; it might be represented within the computer's memory as 0.1000000000000000055..., even though
it appears as 0.1 when it is displayed on the screen (because of automatic rounding). Therefore,
floating-point values should not be used for purposes such as counting or indexing, where exact values
are required.

4.3.3.3. Character constants
A character constant is a sequence of one or more characters enclosed in single quotes. The character
may be a letter, digit, special symbol or a blank space. The value of a character constant with only
one character is the numeric value of that character in the machine's character set at execution time.
The value of a multi-character constant is implementation-defined, so such constants are not portable
and should be avoided.

Ex: 'a' '9' '@' '\0' (valid)
    'abc' '123' 'a&b' (multi-character; not portable)
    ''' (invalid)

It is important to note that a character constant cannot contain the ' (single quote) character or a
new line directly. In order to represent these and certain other characters, the following escape
sequences (or backslash character constants) may be used:
Backslash character constant    Description
\n                              New line
\t                              Horizontal tab
\v                              Vertical tab
\b                              Back space
\r                              Carriage return
\f                              Form feed
\a                              Audible alert (bell)
\\                              Backslash
\?                              Question mark
\'                              Single quote
\"                              Double quote
\ooo                            Octal number
\xhh                            Hexa-decimal number

The escape sequence \ooo consists of the backslash followed by 1, 2 or 3 octal digits, which are taken
to specify the value of the desired character. A common example of this construction is \0 (not followed
by any digit), which specifies the character NUL.
The escape sequence \xhh consists of the backslash followed by x, followed by hexa-decimal digits, which
are taken to specify the value of the desired character. There is no limit on the number of digits, but
the behavior is undefined if the resulting character value exceeds that of the largest character.

4.3.3.4. String constants
A string constant is a sequence of characters surrounded by double quotes. The characters may be
letters, digits, escape sequences and spaces. In C, a string is represented as an array of characters
terminated by a null character (\0).
Ex: "234" "Civil \n Engineering" "Rama&Co." (valid)
    """ (invalid)
Note:
1) A string constant cannot directly contain the " (double quotation mark) character or a new line. To
   include these, one should use the corresponding escape sequences.
2) Adjacent string literals are concatenated into a single string. After concatenation, a null byte \0
   is appended to the string so that a program that reads the string can find its end.

Interview question #1
What is the difference among 1, '1' and "1"?
1 is a decimal integer constant that occupies 2 or 4 bytes based on the execution environment (i.e., on
the processor and compiler).
'1' is a character constant whose value is the ASCII code of the character 1. It occupies 1 byte when
stored in a char variable, although in C the constant itself has type int.
"1" is a string constant that occupies 2 bytes: one byte containing the ASCII code of the character 1
and one byte for the null character with value 0 that marks the end of the string.

QUESTIONS
1) Which of the following are valid and invalid integer constants? Give reasons if not valid.
   1) 123.34  2) 0893  3) -2345  4) 0x123  5) 3458UL  6) 2345l  7) 0124  8) 0XFAGE
2) Which of the following are valid and invalid floating-point constants? Give reasons if not valid.
   1) -934  2) 0345  3) -89.34  4) 9E+3  5) 67.84L  6) 89.342f  7) 0.3E-4  8) 89.  9) .89
3) Which of the following are valid and invalid character constants? Give reasons if not valid.
   1) 'a'  2) '{'  3) '0'  4) ' ' '  5) '\m'  6) '\023'  7) '\x3456'  8) ','  9) '134.3'  10) '435'
4) Which of the following are valid and invalid string constants? Give reasons if not valid.
   1) "Master minds"  2) "234-567-466"  3) '"King & queen"  4) "C" "is brilliant"
   5) "he told-" I miss you""  6) "Ravi's friend"

4.3.4. Data types
A data type is a classification of data that states the possible values that can be taken, how
they are stored and what operations are allowed on them.

In simple terms, a data type is a set of values and operations on those values.

4.3.4.1. Primitive data types: There are 5 basic data types in C. The size and range of each of these
data types may vary among processor types and compilers. The following table shows the primitive data
types in C (sizes as on a typical 16-bit compiler):

Data type    Size (in bytes)                                              Range
int          2 bytes or one word (varies from one compiler to another)    -32768 to +32767
float        4 bytes                                                      -3.4e38 to +3.4e38 with 6 digits of precision
double       8 bytes                                                      -1.7e308 to +1.7e308 with 15 digits of precision
char         1 byte                                                       -128 to +127
void         0 bytes                                                      Valueless


Type modifiers: Except for the data type void, the primitive data types may have various type modifiers
preceding them. Type modifiers are keywords that are used to modify the behavior of existing
primitive data types. There are two types of modifiers:
• Size modifiers: These type modifiers modify the number of bytes a primitive data type occupies.
  The size in turn determines the maximum and minimum values that the data type can represent.
  The size modifiers are long and short.
• Sign modifiers: These type modifiers modify the sign of a primitive data type. The sign modifiers
  are signed and unsigned.

Size modifiers: A compiler can decide appropriate sizes depending on the operating system and hardware
for which it is written, subject to the following rules:
a) shorts are at least 2 bytes long.
b) longs are at least 4 bytes long.
c) shorts are never bigger than ints.
d) ints are never bigger than longs.

Compiler    short    int    long
16-bit      2        2      4
32-bit      2        4      4

Sign modifiers: If the unsigned type modifier precedes a primitive data type, then variables of the
specified type accept only non-negative values (zero and positive). If the signed type modifier precedes
a primitive data type, then variables of the specified type accept both positive and negative values.
The following table specifies various data types including type modifiers: (16-bit compiler)

Data type                                              Size (in bytes)    Range
char / signed char                                     1                  -128 to +127
unsigned char                                          1                  0 to 255
int / signed int / short int / signed short int        2                  -32768 to +32767
unsigned int / unsigned short int                      2                  0 to 65535
long int / signed long int                             4                  -2,147,483,648 to +2,147,483,647
unsigned long int                                      4                  0 to 4,294,967,295
float                                                  4                  -3.4e38 to +3.4e38 with 6 digits of precision
double                                                 8                  -1.7e308 to +1.7e308 with 15 digits of precision
long double                                            10                 -1.1e4932 to +1.1e4932 with about 18 digits of precision

4.3.4.2. User-defined data types: These are the data types defined by the user according to the needs
of the program. They are built from primitive data types. The user-defined data types include:
struct, union, enum.

4.3.5. Variables
A value that is input to a program is held by an entity known as a variable. A variable is associated
with one or more locations in memory, depending on its type. E.g., if the input is a floating-point
value, a variable of type float is associated with 4 bytes, and each of these bytes has its own address.
Rather than remembering addresses, it is convenient to give these locations a name. Therefore,

A variable is a named location in memory that holds a value, and that value may vary during the
execution of a program.

Ex: f=1.8*c+32
In this formula, 1.8 and 32 are fixed values; they do not change. The values of f and c change each
time the formula is used. Hence, f and c are treated as variables.

Declaring a variable: (Declarative instruction)

All variables should be declared before we use them in the program. The variable declaration tells the
compiler two things:
1. What the name(s) of the variables are.
2. What type of data the variables will hold (and therefore how much memory to reserve for them).
Usually, the declarative instruction is written before all the executable statements.
The declarative instruction has the following syntax:

[Storage class] <data type> <variable name(s)>;

In this syntax, the content in square brackets is optional and the content in angle brackets is
mandatory. There should be spaces in between. The declarative instruction should always end with a
semi-colon.
• The storage class specifies the default value a variable holds, the storage location of the variable,
  and its scope and lifetime. The storage classes are: auto, extern, register, static.
• The data type is a keyword that specifies the type of data that is held by the variable(s).
• The variable name is any legal identifier. In other words, it should be built according to the rules
  for identifiers. If there is more than one variable of the same type, separate them with commas.

Ex: int a,b,c;        // a, b and c are integer variables
    long double m;    // m is a long double variable
    char sex;         // sex is a character variable
    char name[20];    // name is a character array that can hold 19 characters plus the null character

Initializing a variable: Initializing a variable is the process of assigning a value to the variable. The
initialization can be done as follows:

[Storage class] <data type> <variable name>=<value>;    ...(1)
(or)
<variable name>=<value>;    ...(2)

In both forms we observe the assignment operator (=), which assigns the value of the right operand to
the left operand. In the second form, the variable must have been declared earlier.
Ex: int a=20; is equivalent to
    int a;
    a=20;

While initializing a variable, one should observe this:
If the variable and the assigned value are of different types, then the assigned value is converted to
the type of the variable.
E.g., float a=20;
In this example, a is a floating-point variable and 20 is an integer. When this initialization is
carried out, the variable a holds 20.000000, a floating-point value.

QUESTIONS
1. Write appropriate declarations for each group of variables and arrays:
   a) Integer variables: p, q
      Floating-point variables: x, y, z
      Character variables: a, b, c
   b) Long integer variable: counter
      Short integer variable: flag
      Unsigned integer variable: cust_no
   c) Double-precision variables: gross, tax, net
      80-element character array: message
2. Write the declarations of the various variables required for calculating simple interest.
3. Write appropriate initialization statements for these: (initial value)
   a) Integer variable: a (120)
      Floating-point variable: x (32.34f)
   b) Character variable: c ('p')
      15-element character array: name ("pradeep")
4. Write appropriate initialization statements for your details (rollno, name, age, sex, fees).

4.3.6. Type qualifiers
Type qualifiers are keywords that add new meanings to existing data types. There are two type
qualifiers: const, volatile.
• Making a variable read-only: In order to keep the value of a variable unchanged during the
  execution of a program, initialize the variable with the type qualifier const as follows:

  const [Storage class] <data type> <variable name>=<value>;

  Ex: const double PI=3.1416;

  This initialization tells the compiler that the value of PI must not be modified by the program.
  However, it can be used on the right-hand side of an assignment statement like any other variable.
  Ex: double x;
      x=PI;
• Making a variable modifiable externally: In order to allow a variable's value to be modified at any
  time by some external source (from outside the program), we use the type qualifier volatile. For
  example,
      volatile int x;
  The value of x may be altered by some external factor even if it does not appear on the
  left-hand side of an assignment statement. When we declare a variable as volatile, the compiler
  re-reads the value of the variable from memory each time it is encountered, instead of reusing a
  previously read copy, so that any external alteration is seen.

4.4. Conclusion
Every C program is typically a collection of functions. A function is a collection of instructions that
performs a specific task. The instructions in a function are made up of words and characters, such as
keywords, identifiers, constants and operators. These are collectively known as tokens. Hence, tokens
are the smallest individual units of a program.
