Oracle 1
• Overview
• Logging In to Oracle
• Changing Your Password
• Creating a Table
• Creating a Table With a Primary Key
• Inserting Tuples
• Getting the Value of a Relation
• Getting Rid of Your Tables
• Getting Information About Your Database
• Quitting sqlplus
• Executing SQL From a File
• Editing Commands in the Buffer
• Recording Your Session
• Help Facilities
Overview
You will be using the Oracle database system to implement your PDA (Personal Database
Application) this quarter. Important: As soon as your Oracle account is set up, you should log
in to change the initial password.
Logging In to Oracle
You should be logged onto one of the Leland Systems Sun Solaris machines. These machines
include elaine, saga, myth, fable, and tree.
Before using Oracle, execute the following line in your shell to set up the correct environment
variables:
source /afs/ir/class/cs145/all.env
You may wish to put this line in your shell initialization file instead (for example, .cshrc).
Now, you can log in to Oracle by typing:
sqlplus <yourName>
Here, sqlplus is Oracle's generic SQL interface. <yourName> refers to your leland login.
You will be prompted for your password. This password is initially changemesoon and must be
changed as soon as possible. For security reasons, we suggest that you not use your regular
leland password, because as we shall see there are opportunities for this password to become
visible under certain circumstances. After you enter the correct password, you should receive the
prompt
SQL>
Creating a Table
In sqlplus we can execute any SQL command. One simple type of command creates a table
(relation). The form is
CREATE TABLE <tableName> (
<list of attributes and their types>
);
You may enter text on one line or on several lines. If your command runs over several lines, you
will be prompted with line numbers until you type the semicolon that ends any command.
(Warning: An empty line terminates the command but does not execute it; see Editing
Commands in the Buffer.) An example table-creation command is:
CREATE TABLE test (
i int,
s char(10)
);
This command creates a table named test with two attributes. The first, named i, is an integer,
and the second, named s, is a character string of length (up to) 10.
Inserting Tuples
Having created a table, we can insert tuples into it. The simplest way to insert is with the INSERT
command:
INSERT INTO <tableName>
VALUES( <list of values for attributes, in order> );
For instance, we can insert the tuple (10, 'foobar') into relation test by
INSERT INTO test VALUES(10, 'foobar');
Quitting sqlplus
To leave sqlplus, type
quit;
in response to the SQL> prompt.
Help Facilities
SQL*Plus provides internal help facilities for SQL*Plus commands. No help is provided for
standard SQL keywords. To see a list of commands for which help is available, type help
topics or help index in response to the SQL> prompt. To then look up help for a particular
keyword (listed in the index), type help followed by the keyword. For example, typing help
accept will print out the syntax for the accept command.
The output from help, and in general, the results of many SQL commands, can be too long to
display on a screen. You can use
set pause on;
to activate the paging feature. When this feature is activated, output will pause at the end of each
screen until you hit the "return" key. To turn this feature off, use
set pause off;
This document was written originally for Prof. Jeff Ullman's CS145 class in Autumn, 1997; revised by Jun Yang for Prof. Jennifer Widom's CS145 class in
Spring, 1998; further revisions by Jeff Ullman, Autumn, 1998; further revisions by Jennifer Widom, Spring 2000; further revisions by Nathan Folkert,
Spring 2001; further revisions by Jim Zhuang, Summer 2005.
Overview
To use the Oracle bulk loader, you need a control file, which specifies how data
should be loaded into the database; and a data file, which specifies what data
should be loaded. You will learn how to create these files in turn.
LOAD DATA
INFILE <dataFile>
APPEND INTO TABLE <tableName>
FIELDS TERMINATED BY '<separator>'
(<list of all attribute names to load>)
• <dataFile> is the name of the data file. If you did not give a file name
extension for <dataFile>, Oracle will assume the default extension ".dat".
Therefore, it is a good idea to name every data file with an extension, and
specify the complete file name with the extension.
• <tableName> is the name of the table to which data will be loaded. Of course,
it should have been created already before the bulk load operation.
• The optional keyword APPEND says that data will be appended to <tableName>.
If APPEND is omitted, the table must be empty before the bulk load operation
or else an error will occur.
• <separator> specifies the field separator for your data file. This can be any
string. It is a good idea to use a string that you know will never appear in the
data, so the separator will not be confused with data fields.
• Finally, list the names of attributes of <tableName> that are set by your data
file, separated by commas and enclosed in parentheses. This list need not be
the complete list of attributes in the actual schema of the table, nor must it
be arranged in the same order as the attributes when the table was created
-- sqlldr will match attributes by their names in the table schema. Any
attributes unspecified in the list of attributes will be set to NULL.
As a concrete example, here are the contents of a control file test.ctl:
LOAD DATA
INFILE test.dat
INTO TABLE test
FIELDS TERMINATED BY '|'
(i, s)
An example data file test.dat for this control file contains:
1|foo
2|bar
3| baz
Recall that the attribute list of test specified in test.ctl is (i, s), where i has the
type int, and s has the type char(10). As the result of loading test.dat, the
following tuples are inserted into test:
(1, 'foo')
(2, 'bar')
(3, ' baz')
As a concrete example, if sally wishes to run the control file test.ctl and have the log
output stored in test.log, then she should type
sqlldr sally control=test.ctl log=test.log
Reminder: Before you run any Oracle commands such as sqlldr and
sqlplus, make sure you have already set up the correct environment by
sourcing /afs/ir/class/cs145/all.env (see Getting Started With Oracle).
LOAD DATA
INFILE *
INTO TABLE test
FIELDS TERMINATED BY '|'
(i, s)
BEGINDATA
1|foo
2|bar
3| baz
The trick is to specify "*" as the name of the data file, and use BEGINDATA to
start the data section in the control file.
3||5
|2|4
1||6
||7
would result in inserting the following tuples in the relation:
(3, NULL, 5)
(NULL, 2, 4)
(1, NULL, 6)
(NULL, NULL, 7)
Keep in mind that any primary keys or other constraints requiring that values
be non-NULL will reject tuples for which those attributes are unspecified.
Note: If the final field in a given row of your data file will be unspecified (NULL), you
have to include the line TRAILING NULLCOLS after the FIELDS TERMINATED BY line in
your control file, otherwise sqlldr will reject that tuple. sqlldr will also reject a tuple
whose columns are all set to NULL in the data file.
If you do not wish to enter values for any row of a given column, you can, as mentioned
above, leave that column out of the attribute list altogether.
This document was written originally for Prof. Jeff Ullman's CS145 class in Autumn, 1997; revised by Jun Yang for
Prof. Jennifer Widom's CS145 class in Spring, 1998; further revisions by Jeff Ullman, Autumn, 1998; further
revisions by Srinivas Vemuri for Prof. Jeff Ullman's CS145 class in Autumn, 1999; further revisions by Nathan
Folkert for Prof. Jennifer Widom's CS145 class in Spring, 2001. Further revisions by Wang Lam for Prof. Widom's
CS145 class in Spring, 2003.
Resources
• Database Systems: The Complete Book by Hector Garcia-Molina, Jeff Ullman, and
Jennifer Widom.
• A First Course in Database Systems by Jeff Ullman and Jennifer Widom.
• Gradiance SQL Tutorial.
Oracle does not support AS in FROM clauses, but you can still specify tuple variables
without AS:
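As a minimal sketch, assuming a hypothetical relation R(A, B) that is not part of the original handout, a tuple variable is declared by writing the alias directly after the table name:

```sql
-- Hypothetical relation R(A, B); r1 and r2 are tuple variables declared without AS.
SELECT r1.A, r2.B
FROM R r1, R r2
WHERE r1.B = r2.A;
```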
The set-difference operator in Oracle is called MINUS rather than EXCEPT. There is no
bag-difference operator corresponding to EXCEPT ALL. The bag-intersection operator
INTERSECT ALL is not implemented either. However, the bag-union operator UNION ALL
is supported.
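The following sketch shows the two operators side by side. The relations R(A, B) and S(A, B) are hypothetical and are assumed to have compatible schemas:

```sql
-- Set difference: tuples of R that do not appear in S (duplicates eliminated).
SELECT A, B FROM R
MINUS
SELECT A, B FROM S;

-- Bag union: all tuples of R and S, duplicates retained.
SELECT A, B FROM R
UNION ALL
SELECT A, B FROM S;
```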
In Oracle, you must always prefix an attribute reference with the table name
whenever this attribute name appears in more than one table in the FROM clause. For
example, suppose that we have tables R(A,B) and S(B,C). The following query does
not work in Oracle, even though B is unambiguous because R.B is equated to S.B in
the WHERE clause:
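A query of roughly the following shape illustrates the rule. It is a sketch over the tables R(A,B) and S(B,C) named in the text, not necessarily the exact query from the original handout:

```sql
-- Rejected by Oracle with ORA-00918 (column ambiguously defined):
--   SELECT B FROM R, S WHERE R.B = S.B;

-- Accepted once the shared attribute is prefixed with a table name:
SELECT R.B
FROM R, S
WHERE R.B = S.B;
```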
In Oracle, the negation logical operator (NOT) should go in front of the boolean
expression, not in front of the comparison operator. For example, "NOT A = ANY
(<subquery>)" is a valid WHERE condition, but "A NOT = ANY (<subquery>)" is not.
(Note that "A <> ANY (<subquery>)" is also a valid condition, but means something
different.) There is one exception to this rule: You may use either "NOT A IN
(<subquery>)" or "A NOT IN (<subquery>)".
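The placement rules can be sketched as follows, assuming two hypothetical single-attribute relations R(A) and S(A):

```sql
-- Valid: NOT precedes the whole boolean expression.
SELECT * FROM R WHERE NOT A = ANY (SELECT A FROM S);

-- Valid: both placements are accepted for IN.
SELECT * FROM R WHERE NOT A IN (SELECT A FROM S);
SELECT * FROM R WHERE A NOT IN (SELECT A FROM S);

-- Invalid in Oracle, shown only as a comment:
--   SELECT * FROM R WHERE A NOT = ANY (SELECT A FROM S);
```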
Comments
In Oracle, comments may be introduced in two ways:
1. With /*...*/, as in C.
2. With a line that begins with two dashes --.
Thus:
-- This is a comment
SELECT * /* and so is this */
FROM R;
Data Types
BIT type is not supported. There is a BOOLEAN type in PL/SQL (see Using Oracle
PL/SQL for details), but it cannot be used for a database column.
Domains (i.e., type aliases) are not supported.
Dates and times are supported differently in Oracle. For details, please refer to
Oracle Dates and Times, available from the class web page.
Indexes
Oracle automatically creates an index for each UNIQUE or PRIMARY KEY declaration.
For example, if you create a table foo as follows:
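A minimal sketch of such a table follows. The column names are illustrative assumptions, and the query against the data dictionary view user_indexes is one way to inspect the indexes Oracle created:

```sql
-- Hypothetical table with one PRIMARY KEY and one UNIQUE declaration.
CREATE TABLE foo (
    a INT PRIMARY KEY,
    b VARCHAR(20) UNIQUE
);

-- List the indexes on foo; their names are system-generated.
-- Table names are stored in upper case in the data dictionary.
SELECT index_name, uniqueness
FROM user_indexes
WHERE table_name = 'FOO';
```

With this declaration, no separate CREATE INDEX statement is needed on a or b.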
Views
Oracle supports views as specified in SQL. To find out what views you have created,
use:
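One way to do this, sketched here under the assumption that the data dictionary view user_views is accessible from your account:

```sql
-- Each row names one view owned by the current user.
SELECT view_name
FROM user_views;
```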
Constraints
Oracle supports key constraints as specified in SQL. For each table, there can be
only one PRIMARY KEY declaration, but many UNIQUE declarations. Each PRIMARY KEY
(or UNIQUE) declaration can have multiple attributes, which means that these
attributes together form a primary key (or a key, respectively) of the table.
Oracle supports referential integrity (foreign key) constraints, and allows an optional
ON DELETE CASCADE or ON DELETE SET NULL after a REFERENCES clause in a table
declaration. However, it does not allow ON UPDATE options.
Note that when declaring a foreign key constraint at the end of a table declaration it is always
necessary to put the list of referencing attributes in parentheses:
create table foo (...
foreign key (<attr_list>) references <tableName> (<attr_list>));
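A concrete instance might look like the sketch below. The tables dept and emp and their columns are hypothetical and serve only to show the parenthesized attribute list together with an ON DELETE option:

```sql
-- Referenced table: dno is its primary key.
CREATE TABLE dept (
    dno INT PRIMARY KEY
);

-- Referencing table: the foreign key is declared at the end,
-- with the referencing attribute in parentheses.
CREATE TABLE emp (
    eno INT PRIMARY KEY,
    dno INT,
    FOREIGN KEY (dno) REFERENCES dept (dno) ON DELETE CASCADE
);
```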
Oracle supports attribute- and tuple-based constraints, but does not allow CHECK
conditions to use subqueries. Thus, there is no way for an attribute- or tuple-based
constraint to reference anything else besides the attribute or tuple that is being
inserted or updated.
Domain constraints are not supported since domains are not supported.
In the ALTER TABLE statement, Oracle supports ADDing columns and table constraints,
MODIFYing column properties and column constraints, and DROPping constraints.
However, you cannot MODIFY an attribute-based CHECK constraint. Here are some
examples:
create table bar (x int, y int, constraint XYcheck check (x > y));
alter table bar add (z int, w int);
alter table bar add primary key (x);
alter table bar add constraint YZunique unique (y, z);
alter table bar modify (w varchar(2) default 'AM'
constraint Wnotnull not null);
alter table bar add check (w in ('AM', 'PM'));
alter table bar drop constraint YZunique;
alter table bar drop constraint XYcheck;
alter table bar drop constraint Wnotnull;
alter table bar drop primary key cascade;
Dropping constraints generally requires knowing their names (only in the special
case of primary or unique key constraints can you drop them without specifying
their names). Thus, it is always a good idea to name all your constraints.
Triggers
Triggers in Oracle differ in several ways from the SQL standard. Details are in a
separate section Constraints and Triggers.
Transactions
COMMIT makes permanent any database changes you made during the current transaction. Until
you commit your changes, other users cannot see them. ROLLBACK ends the current transaction
and undoes any changes made since the transaction began.
After the current transaction has ended with a COMMIT or ROLLBACK, the first executable SQL
statement that you subsequently issue will automatically begin another transaction.
For example, the following SQL commands have the final effect of inserting into table R the
tuple (3, 4), but not (1, 2):
insert into R values (1, 2);
rollback;
insert into R values (3, 4);
commit;
During interactive usage with sqlplus, Oracle also supports an AUTOCOMMIT option.
With this option set to ON, each individual SQL statement is treated as a transaction
and will be automatically committed right after it is executed. A user can change the
AUTOCOMMIT option by typing
SET AUTOCOMMIT ON
or
SET AUTOCOMMIT OFF
whereas by typing
SHOW ALL
a user can see the current setting for the option (along with all other options).
The same rules for designating the end of a transaction (COMMIT/ROLLBACK) and its
beginning (which is implicit and starts just after the last COMMIT/ROLLBACK) apply to
programmers interacting with Oracle using Pro*C or JDBC. Note, though, that Pro*C does not
support the AUTOCOMMIT option, whereas JDBC does and sets AUTOCOMMIT to ON by
default. Thus a programmer must execute COMMIT/ROLLBACK statements explicitly in
Pro*C, whereas in JDBC a programmer can rely on AUTOCOMMIT and never specify
explicitly where a transaction starts or ends. For more details, see the respective sections: Pro*C,
JDBC.
Oracle also supports the SAVEPOINT command. The command SAVEPOINT <sp_name>
establishes a savepoint named <sp_name> which marks the current point in the
processing of a transaction. This savepoint can be used in conjunction with the
command ROLLBACK TO <sp_name> to undo parts of a transaction.
For example, the following commands have the final effect of inserting into table R tuples (5, 6)
and (11, 12), but not (7, 8) or (9, 10):
insert into R values (5, 6);
savepoint my_sp_1;
insert into R values (7, 8);
savepoint my_sp_2;
insert into R values (9, 10);
rollback to my_sp_1;
insert into R values (11, 12);
commit;
Oracle automatically issues an implicit COMMIT before and after any SQL DDL (Data
Definition Language) statement (even if this DDL statement fails).
Oracle provides a TIMING command for measuring the running time of SQL
commands. To activate this feature, type
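In SQL*Plus, timing is controlled by the TIMING setting. A minimal sketch, where the SELECT is only a placeholder against the test table created earlier:

```sql
set timing on;
-- Each statement run from here on is followed by its elapsed time.
select count(*) from test;
set timing off;
```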
In nested if-statements, PL/SQL uses ELSIF, while PSM calls for ELSEIF. Both are
used where we would find ELSE IF in C, for example.
To leave a loop, PL/SQL uses EXIT, or EXIT WHEN(...) to exit conditionally. PSM uses
LEAVE, and puts the leave-statement in an if-statement to exit conditionally.
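A minimal PL/SQL sketch of both constructs follows. The variable names are illustrative, and the block only manipulates local variables, so it has no effect on the database:

```sql
DECLARE
    i NUMBER := 0;
    sizeName VARCHAR(10);
BEGIN
    LOOP
        i := i + 1;
        /* Multiway branch: PL/SQL spells the keyword ELSIF. */
        IF i < 3 THEN
            sizeName := 'small';
        ELSIF i < 6 THEN
            sizeName := 'medium';
        ELSE
            sizeName := 'large';
        END IF;
        /* Conditional exit from the loop. */
        EXIT WHEN i >= 8;
    END LOOP;
END;
.
run;
```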
Object-Relational Features
There is a great deal of difference between the Oracle and SQL-standard
approaches to user-defined types. You should look at the on-line guide Object
Relational Features of Oracle for details and examples of the Oracle approach.
However, here are a few small places where the approaches almost coincide but
differ in small ways:
When defining a user-defined type, Oracle uses CREATE TYPE ... AS OBJECT, while
the word "OBJECT" is not used in the standard.
To define (not declare) a method, Oracle has you write the code for the method in a
CREATE TYPE BODY statement for the type to which the method belongs. The
standard uses a CREATE METHOD statement similar to the way functions are defined in
PL/SQL or SQL/PSM.
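The Oracle side of these two differences can be sketched as follows. The type PointType, its attributes, and its method are hypothetical names chosen for illustration:

```sql
-- Declaration: note the words AS OBJECT; the method is only declared here.
CREATE TYPE PointType AS OBJECT (
    x NUMBER,
    y NUMBER,
    MEMBER FUNCTION distFromOrigin RETURN NUMBER
);
.
run;

-- Definition: the method's code goes in CREATE TYPE BODY.
CREATE TYPE BODY PointType AS
    MEMBER FUNCTION distFromOrigin RETURN NUMBER IS
    BEGIN
        RETURN SQRT(x*x + y*y);
    END;
END;
.
run;
```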
This document was written originally for Prof. Jeff Ullman's CS145 class in Autumn, 1997; revised by Jun Yang for Prof.
Jennifer Widom's CS145 class in Spring, 1998; further revisions by Jun Yang, Spring 1999; further revisions by Jennifer
Widom, Spring 2000; minor revisions by Nathan Folkert, Spring 2001; Henry Hsieh, Autumn 2001; and Antonios Hondroulis,
Spring 2002; further revisions by Wang Lam for Prof. Jennifer Widom's CS145 class in Spring 2003.
• HAVING Clauses
• Views
• Intersection and Set-Difference
• ANY and ALL
HAVING Clauses
mySQL has a very limited form of HAVING clause. Instead of evaluating the HAVING
condition within each group, mySQL treats HAVING as a selection on the output
tuples. Thus, you can only refer in the HAVING clause to attributes that appear in
the SELECT clause. Recent versions of mySQL allow you to refer to aggregates in
the SELECT clause by their formula [e.g., AVG(salary)] rather than by an alias
established in the SELECT clause by (e.g.) AVG(salary) AS avgSalary.
Views
mySQL does not support views. However, unlike some other SQL implementations,
mySQL does support fully nested subqueries in the FROM clause. These subqueries
can serve as views in many situations, although they do not provide the ability of a
view to serve as a macro, with its definition reused in many queries.
Intersection and Set-Difference
To get the intersection of two relations, say R(a,b) and S(a,b), you can use a subquery:
SELECT DISTINCT *
FROM R
WHERE EXISTS (SELECT * FROM S WHERE R.a = S.a AND R.b = S.b);
To get the set difference, here is a similar approach using a subquery:
SELECT DISTINCT *
FROM R
WHERE NOT EXISTS (SELECT * FROM S WHERE R.a = S.a AND R.b = S.b);
Note that both these expressions eliminate duplicates, but that is in accordance with the SQL
standard.
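Operator   Result with ANY and ALL
>=         Correct for one of ANY and ALL; (1) for the other
<=         Correct for one of ANY and ALL; (1) for the other
=          Correct for both ANY and ALL
<>         Correct for one of ANY and ALL; (1) for the other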
(1) mySQL gives an incorrect result, which in each of these cases is the same as what the other
of ANY and ALL gives.
(2) mySQL gives an incorrect result for both ANY and ALL. For each operator, the result is the
same independent of whether ANY or ALL is used. For <, the result is several tuples with low,
but different prices, and for > it is the other tuples in the relation Sells, i.e., some of the tuples
with high, but different prices.
This document was written originally by Jeff Ullman in the Winter of 2004.
Overview
Oracle supports both date and time, albeit differently from the SQL2 standard. Rather than using
two separate entities, date and time, Oracle only uses one, DATE. The DATE type is stored in a
special internal format that includes not just the month, day, and year, but also
the hour, minute, and second.
The DATE type is used in the same way as other built-in types such as INT. For example, the
following SQL statement creates a relation with an attribute of type DATE:
create table x(a int, b date);
DATE Format
When a DATE value is displayed, Oracle must first convert that value from the special internal
format to a printable string. The conversion is done by a function TO_CHAR, according to a DATE
format. Oracle's default format for DATE is "DD-MON-YY". Therefore, when you issue the query
select b from x;
you will see something like:
B
---------
01-APR-98
Whenever a DATE value is displayed, Oracle will call TO_CHAR automatically with the default
DATE format. However, you may override the default behavior by calling TO_CHAR explicitly with
your own DATE format. For example,
SELECT TO_CHAR(b, 'YYYY/MM/DD') AS b
FROM x;
returns the result:
B
---------------------------------------------------------------------------
1998/04/01
The general usage of TO_CHAR is:
TO_CHAR(<date>, '<format>')
where the <format> string can be formed from over 40 options. Some of the more popular ones
include:
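Commonly used format elements include MM (month number), MON (abbreviated month name), DD (day of month), DY (abbreviated day name), YYYY and YY (four-digit and two-digit year), HH24 (hour, 0 to 23), MI (minute), and SS (second). A short sketch against the table x created above, with illustrative column aliases:

```sql
-- Assumes the table x(a int, b date) from the earlier example.
SELECT TO_CHAR(b, 'DD-MON-YYYY HH24:MI:SS') AS full_stamp,
       TO_CHAR(b, 'MM/DD/YY')               AS short_date,
       TO_CHAR(b, 'DY')                     AS weekday
FROM x;
```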
Operations on DATE
You can compare DATE values using the standard comparison operators such as =, !=, >, etc.
You can subtract two DATE values, and the result is a FLOAT which is the number of days between
the two DATE values. In general, the result may contain a fraction because DATE also has a time
component. For obvious reasons, adding, multiplying, and dividing two DATE values are not
allowed.
You can add and subtract constants to and from a DATE value, and these numbers will be
interpreted as numbers of days. For example, SYSDATE+1 will be tomorrow. You cannot multiply
or divide DATE values.
With the help of TO_CHAR, string operations can be used on DATE values as well. For example,
to_char(<date>, 'DD-MON-YY') like '%JUN%' evaluates to true if <date> is in June.
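These operations can be sketched as follows, again assuming the table x(a int, b date) from above. The table dual is Oracle's standard one-row table and is used here only to evaluate an expression:

```sql
-- Adding a constant: the same time tomorrow.
SELECT SYSDATE + 1 AS tomorrow FROM dual;

-- Subtracting two DATE values: number of days, possibly fractional.
SELECT a, SYSDATE - b AS days_elapsed FROM x;

-- String operation via TO_CHAR: rows whose date falls in June.
SELECT a, b FROM x WHERE TO_CHAR(b, 'DD-MON-YY') LIKE '%JUN%';
```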
This document was written originally by Kristian Widjaja for Prof. Jeff Ullman's CS145 class in Autumn, 1997; revised by Jun Yang for Prof. Jennifer
Widom's CS145 class in Spring, 1998; further revisions by Prof. Ullman in Autumn, 1998.
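DECLARE    /* Declarative section: variables, types, and local subprograms. */
BEGIN      /* Executable section: procedural and SQL statements. */
EXCEPTION  /* Exception handling section: error handling statements. */
END;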
Only the executable section is required. The other sections are optional. The only SQL
statements allowed in a PL/SQL program are SELECT, INSERT, UPDATE, DELETE and several other
data manipulation statements plus some transaction control. However, the SELECT statement has
a special form in which a single tuple is placed in variables; more on this later. Data definition
statements like CREATE, DROP, or ALTER are not allowed. The executable section also contains
constructs such as assignments, branches, loops, procedure calls, and triggers, which are all
described below (except triggers). PL/SQL is not case sensitive. C style comments (/* ... */)
may be used.
To execute a PL/SQL program, we must follow the program text itself by
• A line with a single dot ("."), and then
• A line with run;
As with Oracle SQL programs, we can invoke a PL/SQL program either by typing it in sqlplus
or by putting the code in a file and invoking the file in the various ways we learned in Getting
Started With Oracle.
DECLARE
    price NUMBER;
    myBeer VARCHAR(20);
Note that PL/SQL allows BOOLEAN variables, even though Oracle does not support BOOLEAN as a
type for database columns.
Types in PL/SQL can be tricky. In many cases, a PL/SQL variable will be used to manipulate
data stored in an existing relation. In this case, it is essential that the variable have the same type
as the relation column. If there is any type mismatch, variable assignments and comparisons may
not work the way you expect. To be safe, instead of hard coding the type of a variable, you
should use the %TYPE operator. For example:
DECLARE
myBeer Beers.name%TYPE;
gives PL/SQL variable myBeer whatever type was declared for the name column in relation
Beers.
A variable may also have a type that is a record with several fields. The simplest way to declare
such a variable is to use %ROWTYPE on a relation name. The result is a record type in which the
fields have the same names and types as the attributes of the relation. For instance:
DECLARE
beerTuple Beers%ROWTYPE;
makes variable beerTuple be a record with fields name and manufacture, assuming that the
relation has the schema Beers(name, manufacture).
The initial value of any variable, regardless of its type, is NULL. We can assign values to
variables, using the ":=" operator. The assignment can occur either immediately after the type of
the variable is declared, or anywhere in the executable portion of the program. An example:
DECLARE
    a NUMBER := 3;
BEGIN
    a := a + 1;
END;
.
run;
This program has no effect when run, because there are no changes to the database.
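CREATE TABLE T1 (
    e INTEGER,
    f INTEGER
);
INSERT INTO T1 VALUES(2, 4);
DECLARE
    a NUMBER;
    b NUMBER;
BEGIN
    SELECT e, f INTO a, b FROM T1 WHERE e > 1;
    INSERT INTO T1 VALUES(b, a);
END;
.
run;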
Fortuitously, there is only one tuple of T1 that has first component greater than 1, namely (2,4).
The INSERT statement thus inserts (4,2) into T1.
Control Flow in PL/SQL
PL/SQL allows you to branch and create loops in a fairly familiar way.
An IF statement looks like:
IF <condition> THEN <statement_list> ELSE <statement_list> END IF;
The ELSE part is optional. If you want a multiway branch, use:
IF <condition_1> THEN ...
ELSIF <condition_2> THEN ...
... ...
ELSIF <condition_n> THEN ...
ELSE ...
END IF;
The following is an example, slightly modified from the previous one, where now we only do the
insertion if the second component is 1. If not, we first add 10 to each component and then insert:
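DECLARE
    a NUMBER;
    b NUMBER;
BEGIN
    SELECT e, f INTO a, b FROM T1 WHERE e > 1;
    IF b=1 THEN
        INSERT INTO T1 VALUES(b, a);
    ELSE
        INSERT INTO T1 VALUES(b+10, a+10);
    END IF;
END;
.
run;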
Loops are created with the following:
LOOP
    <loop_body> /* A list of statements. */
END LOOP;
At least one of the statements in <loop_body> should be an EXIT statement of the form
EXIT WHEN <condition>;
The loop breaks if <condition> is true. For example, here is a way to insert each of the pairs (1,
1) through (100, 100) into T1 of the above two examples:
DECLARE
    i NUMBER := 1;
BEGIN
    LOOP
        INSERT INTO T1 VALUES(i, i);
        i := i+1;
        EXIT WHEN i > 100;
    END LOOP;
END;
.
run;
Some other useful loop-forming statements are:
• EXIT by itself is an unconditional loop break. Use it inside a conditional if you like.
• A WHILE loop can be formed with
    WHILE <condition> LOOP
        <loop_body>
    END LOOP;
• A simple FOR loop can be formed with:
    FOR <var> IN <start>..<finish> LOOP
        <loop_body>
    END LOOP;
Here, <var> can be any variable; it is local to the for-loop and need not be declared. Also,
<start> and <finish> are constants.
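As a small sketch of both loop forms, the block below inserts the pairs (1, 1) through (6, 6) into a relation T1(e, f) of two integer columns. The bounds are arbitrary illustrative values:

```sql
DECLARE
    i NUMBER := 1;
BEGIN
    /* WHILE loop: the condition is tested before each iteration. */
    WHILE i <= 3 LOOP
        INSERT INTO T1 VALUES(i, i);
        i := i + 1;
    END LOOP;
    /* FOR loop: j is local to the loop and is not declared above. */
    FOR j IN 4..6 LOOP
        INSERT INTO T1 VALUES(j, j);
    END LOOP;
END;
.
run;
```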
Cursors
A cursor is a variable that runs through the tuples of some relation. This relation can be a stored
table, or it can be the answer to some query. By fetching into the cursor each tuple of the
relation, we can write a program to read and process the value of each such tuple. If the relation
is stored, we can also update or delete the tuple at the current cursor position.
The example below illustrates a cursor loop. It uses our example relation T1(e,f) whose tuples
are pairs of integers. The program will delete every tuple whose first component is less than the
second, and insert the reverse tuple into T1.
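 1) DECLARE
        /* Output variables to hold the result of the query: */
 2)     a T1.e%TYPE;
 3)     b T1.f%TYPE;
        /* Cursor declaration: */
 4)     CURSOR T1Cursor IS
 5)         SELECT e, f
 6)         FROM T1
 7)         WHERE e < f
 8)         FOR UPDATE;
 9) BEGIN
10)     OPEN T1Cursor;
11)     LOOP
12)         FETCH T1Cursor INTO a, b;
13)         EXIT WHEN T1Cursor%NOTFOUND;
14)         DELETE FROM T1 WHERE CURRENT OF T1Cursor;
15)         INSERT INTO T1 VALUES(b, a);
16)     END LOOP;
17)     CLOSE T1Cursor;
18) END;
19) .
20) run;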
Here are explanations for the various lines of this program:
• Line (1) introduces the declaration section.
• Lines (2) and (3) declare variables a and b to have types equal to the types of attributes e
and f of the relation T1. Although we know these types are INTEGER, we wisely make
sure that whatever types they may have are copied to the PL/SQL variables (compare
with the previous example, where we were less careful and declared the corresponding
variables to be of type NUMBER).
• Lines (4) through (8) define the cursor T1Cursor. It ranges over a relation defined by the
SELECT-FROM-WHERE query. That query selects those tuples of T1 whose first component
is less than the second component. Line (8) declares the cursor FOR UPDATE since we will
modify T1 using this cursor later on Line (14). In general, FOR UPDATE is unnecessary if
the cursor will not be used for modification.
• Line (9) begins the executable section of the program.
• Line (10) opens the cursor, an essential step.
• Lines (11) through (16) are a PL/SQL loop. Notice that such a loop is bracketed by LOOP
and END LOOP. Within the loop we find:
○ On Line (12), a fetch through the cursor into the local variables. In general, the
FETCH statement must provide variables for each component of the tuple retrieved.
Since the query of Lines (5) through (7) produces pairs, we have correctly
provided two variables, and we know they are of the correct type.
○ On Line (13), a test for the loop-breaking condition. Its meaning should be clear:
%NOTFOUND after the name of a cursor is true exactly when a fetch through that
cursor has failed to find any more tuples.
○ On Line (14), a SQL DELETE statement that deletes the current tuple using the
special WHERE condition CURRENT OF T1Cursor.
○ On Line (15), a SQL INSERT statement that inserts the reverse tuple into T1.
• Line (17) closes the cursor.
• Line (18) ends the PL/SQL program.
• Lines (19) and (20) cause the program to execute.
Procedures
PL/SQL procedures behave very much like procedures in other programming languages. Here is
an example of a PL/SQL procedure addtuple1 that, given an integer i, inserts the tuple (i,
'xxx') into the following example relation:
CREATE TABLE T2 (
    a INTEGER,
    b CHAR(10)
);
CREATE PROCEDURE addtuple1(i IN NUMBER) AS
BEGIN
    INSERT INTO T2 VALUES(i, 'xxx');
END addtuple1;
.
run;
A procedure is introduced by the keywords CREATE PROCEDURE followed by the procedure name
and its parameters. An option is to follow CREATE by OR REPLACE. The advantage of doing so is
that should you have already made the definition, you will not get an error. On the other hand,
should the previous definition be a different procedure of the same name, you will not be
warned, and the old procedure will be lost.
There can be any number of parameters, each followed by a mode and a type. The possible
modes are IN (read-only), OUT (write-only), and IN OUT (read and write). Note: Unlike the type
specifier in a PL/SQL variable declaration, the type specifier in a parameter declaration must be
unconstrained. For example, CHAR(10) and VARCHAR(20) are illegal; CHAR or VARCHAR should be
used instead. The actual length of a parameter depends on the corresponding argument that is
passed in when the procedure is invoked.
Following the arguments is the keyword AS (IS is a synonym). Then comes the body, which is
essentially a PL/SQL block. We have repeated the name of the procedure after the END, but this
is optional. However, the DECLARE section should not start with the keyword DECLARE. Rather,
following AS we have:
... AS
<local_var_declarations>
BEGIN
    <procedure_body>
END;
.
run;
The run at the end runs the statement that creates the procedure; it does not execute the
procedure. To execute the procedure, use another PL/SQL statement, in which the procedure is
invoked as an executable statement. For example:
BEGIN addtuple1(99); END;
.
run;
The following procedure also inserts a tuple into T2, but it takes both components as arguments:
CREATE PROCEDURE addtuple2(
    x T2.a%TYPE,
    y T2.b%TYPE)
AS
BEGIN
    INSERT INTO T2(a, b)
    VALUES(x, y);
END addtuple2;
.
run;
Now, to add a tuple (10, 'abc') to T2:
BEGIN
    addtuple2(10, 'abc');
END;
.
run;
The following illustrates the use of an OUT parameter:
CREATE TABLE T3 (
    a INTEGER,
    b INTEGER
);
CREATE PROCEDURE addtuple3(a NUMBER, b OUT NUMBER)
AS
BEGIN
    b := 4;
    INSERT INTO T3 VALUES(a, b);
END;
.
run;
DECLARE
    v NUMBER;
BEGIN
    addtuple3(10, v);
END;
.
run;
Note that assigning values to parameters declared as OUT or IN OUT causes the corresponding
input arguments to be written. Because of this, the input argument for an OUT or IN OUT parameter
should be something with an "lvalue", such as a variable like v in the example above. A constant
or a literal argument should not be passed in for an OUT or IN OUT parameter.
We can also write functions instead of procedures. In a function declaration, we follow the
parameter list by RETURN and the type of the return value:
CREATE FUNCTION <func_name>(<param_list>) RETURN <return_type> AS ...
In the body of the function definition, "RETURN <expression>;" exits from the function and
returns the value of <expression>.
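A minimal sketch of a function and one way to call it follows. The name addten is hypothetical, and the call from a SELECT uses Oracle's standard one-row table dual:

```sql
CREATE FUNCTION addten(n NUMBER) RETURN NUMBER AS
BEGIN
    /* RETURN exits the function and yields the value of the expression. */
    RETURN n + 10;
END addten;
.
run;

-- A stored function can be used inside a SQL expression.
SELECT addten(5) FROM dual;
```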
To find out what procedures and functions you have created, use the following SQL query:
select object_type, object_name
from user_objects
where object_type = 'PROCEDURE'
or object_type = 'FUNCTION';
To drop a stored procedure or function:
drop procedure <procedure_name>;
drop function <function_name>;
Discovering Errors
PL/SQL does not always tell you about compilation errors. Instead, it gives you a cryptic
message such as "procedure created with compilation errors". If you don't see what is wrong
immediately, try issuing the command
show errors procedure <procedure_name>;
Alternatively, you can type SHO ERR (short for SHOW ERRORS) to see the most recent compilation
error.
Note that the location of the error given as part of the error message is not always accurate!
Printing Variables
Sometimes we might want to print the value of a PL/SQL local variable. A "quick-and-dirty"
way is to store it as the sole tuple of some relation and after the PL/SQL statement print the
relation with a SELECT statement. A more couth way is to define a bind variable, which is the
only kind that may be printed with a print command. Bind variables are the kind that must be
prefixed with a colon in PL/SQL statements, such as :new discussed in the section on triggers.
The steps are as follows:
1. We declare a bind variable as follows:
VARIABLE <name> <type>
where the type can be only one of three things: NUMBER, CHAR, or CHAR(n).
2. We may then assign to the variable in a following PL/SQL statement, but we must prefix
it with a colon.
3. Finally, we can execute a statement
PRINT :<name>;
outside the PL/SQL statement.
Here is a trivial example, which prints the value 1.
VARIABLE x NUMBER
BEGIN
:x := 1;
END;
.
run;
PRINT :x;
This document was written originally by Yu-May Chang and Jeff Ullman for CS145, Autumn 1997; revised by Jun Yang for Prof. Jennifer Widom's
CS145 class in Spring, 1998; additional material by Jeff Ullman, Autumn 1998; further revisions by Jun Yang, Spring 1999; minor revisions by Jennifer
Widom, Spring 2000.
Resources
• Database Systems: The Complete Book by Hector Garcia, Jeff Ullman, and
Jennifer Widom.
• A First Course in Database Systems by Jeff Ullman and Jennifer Widom.
• Gradiance SQL Tutorial.
Triggers are a special PL/SQL construct similar to procedures. However, a procedure is executed
explicitly from another block via a procedure call, while a trigger is executed implicitly
whenever the triggering event happens. The triggering event is either an INSERT, DELETE, or
UPDATE command. The timing can be either BEFORE or AFTER. The trigger can be either
row-level or statement-level, where the former fires once for each row affected by the triggering
statement and the latter fires once for the whole statement.
• Constraints:
○ Deferring Constraint Checking
○ Constraint Violations
• Triggers:
○ Basic Trigger Syntax
○ Trigger Example
○ Displaying Trigger Definition Errors
○ Viewing Defined Triggers
○ Dropping Triggers
○ Disabling Triggers
○ Aborting Triggers with Error
○ Mutating Table Errors
To work around this problem, we need SQL schema modification commands. First, create
chicken and egg without foreign key declarations:
CREATE TABLE chicken(cID INT PRIMARY KEY,
eID INT);
CREATE TABLE egg(eID INT PRIMARY KEY,
cID INT);
Then, we add foreign key constraints:
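The statements for this step do not appear in the text here. A minimal sketch follows: the constraint names are taken from the DROP CONSTRAINT statements further down, while the column pairing and the INITIALLY DEFERRED DEFERRABLE option are assumptions, chosen so that a chicken and its egg can be inserted in one transaction.

```sql
ALTER TABLE chicken ADD CONSTRAINT chickenREFegg
    FOREIGN KEY (eID) REFERENCES egg(eID)
    INITIALLY DEFERRED DEFERRABLE;
ALTER TABLE egg ADD CONSTRAINT eggREFchicken
    FOREIGN KEY (cID) REFERENCES chicken(cID)
    INITIALLY DEFERRED DEFERRABLE;

-- With checking deferred, both rows can be inserted before the
-- constraints are verified at COMMIT.
INSERT INTO chicken VALUES(1, 2);
INSERT INTO egg VALUES(2, 1);
COMMIT;
```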
Finally, to get rid of the tables, we have to drop the constraints first, because Oracle won't allow
us to drop a table that's referenced by another table.
ALTER TABLE egg DROP CONSTRAINT eggREFchicken;
ALTER TABLE chicken DROP CONSTRAINT chickenREFegg;
DROP TABLE egg;
DROP TABLE chicken;
Constraint Violations
In general, Oracle returns an error message when a constraint is violated.
Specifically for users of JDBC, this means an SQLException gets thrown, whereas for
Pro*C users the SQLCA struct gets updated to reflect the error. Programmers must
use the WHENEVER statement and/or check the SQLCA contents (Pro*C users) or
catch the exception SQLException (JDBC users) in order to get the error code
returned by Oracle.
Some vendor-specific error code numbers are 1 for primary key constraint violations, 2291 for
foreign key violations, and 2290 for attribute and tuple CHECK constraint violations. Oracle also
provides simple error message strings that have a format similar to the following:
ORA-02290: check constraint (YFUNG.GR_GR) violated
or
ORA-02291: integrity constraint (HONDROUL.SYS_C0067174) violated - parent
key not found
For more details on how to do error handling, please take a look at Pro*C Error handling or at
the Retrieving Exceptions section of JDBC Error handling.
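The same codes can also be observed from inside a PL/SQL block, which may be a convenient way to experiment in sqlplus before writing Pro*C or JDBC code. The sketch below is only an illustration: the tables demoParent and demoChild are hypothetical, and note that PL/SQL's SQLCODE reports the code as a negative number.

```sql
-- Hypothetical tables used only for this demonstration.
CREATE TABLE demoParent (id INT PRIMARY KEY);
CREATE TABLE demoChild (
    id  INT PRIMARY KEY,
    pid INT REFERENCES demoParent(id)
);

-- sqlplus setting that makes DBMS_OUTPUT text visible.
SET SERVEROUTPUT ON

BEGIN
    -- No demoParent row has id 99, so the foreign key is violated.
    INSERT INTO demoChild VALUES (1, 99);
EXCEPTION
    WHEN OTHERS THEN
        -- SQLCODE is -2291 here; SQLERRM holds the ORA-02291 message text.
        DBMS_OUTPUT.PUT_LINE('code ' || SQLCODE || ': ' || SQLERRM);
END;
.
run;
```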
<trigger_body>
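The line above is the tail of the general trigger-creation form, the rest of which is not reproduced here. The outline below is a reconstruction from the points that follow, not a runnable statement: angle brackets mark placeholders, square brackets mark optional parts, and the placeholder names other than <table_name> and <trigger_body> are assumptions.

```sql
-- Syntax outline only; substitute real names before running.
CREATE [OR REPLACE] TRIGGER <trigger_name>
    {BEFORE | AFTER} {INSERT | DELETE | UPDATE [OF <attribute_list>]} ON <table_name>
    [REFERENCING [NEW AS <new_row_name>] [OLD AS <old_row_name>]]
    [FOR EACH ROW [WHEN (<trigger_condition>)]]
    <trigger_body>
```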
Some important points to note:
• You can create only BEFORE and AFTER triggers for tables. (INSTEAD OF triggers
are only available for views; typically they are used to implement view
updates.)
• You may specify up to three triggering events using the keyword OR.
Furthermore, UPDATE can be optionally followed by the keyword OF and a list
of attribute(s) in <table_name>. If present, the OF clause defines the event to
be only an update of the attribute(s) listed after OF. Here are some examples:
... INSERT ON R ...
... INSERT OR DELETE OR UPDATE ON R ...
... UPDATE OF A, B OR INSERT ON R ...
• If FOR EACH ROW option is specified, the trigger is row-level; otherwise, the
trigger is statement-level.
• Only for row-level triggers:
○ The special variables NEW and OLD are available to refer to new and old
tuples respectively. Note: In the trigger body, NEW and OLD must be
preceded by a colon (":"), but in the WHEN clause, they do not have a
preceding colon! See example below.
○ The REFERENCING clause can be used to assign aliases to the variables
NEW and OLD.
○ A trigger restriction can be specified in the WHEN clause, enclosed by
parentheses. The trigger restriction is a SQL condition that must be
satisfied in order for Oracle to fire the trigger. This condition cannot
contain subqueries. Without the WHEN clause, the trigger is fired for
each row.
• <trigger_body> is a PL/SQL block, rather than a sequence of SQL statements.
Oracle has placed certain restrictions on what you can do in <trigger_body>,
in order to avoid situations where one trigger performs an action that triggers
a second trigger, which then triggers a third, and so on, which could
potentially create an infinite loop. The restrictions on <trigger_body> include:
○ You cannot modify the same relation whose modification is the event
triggering the trigger.
○ You cannot modify a relation connected to the triggering relation by
another constraint such as a foreign-key constraint.
Trigger Example
We illustrate Oracle's syntax for creating a trigger through an example based on the
following two tables:
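The table definitions and the trigger for this example are not reproduced here. A minimal sketch follows, using hypothetical tables T4 and T5 and a hypothetical trigger trig1: whenever a tuple with a-value of at most 10 is inserted into T4, the trigger inserts the reversed tuple into T5.

```sql
CREATE TABLE T4 (a INTEGER, b CHAR(10));
CREATE TABLE T5 (c CHAR(10), d INTEGER);

CREATE TRIGGER trig1
    AFTER INSERT ON T4
    REFERENCING NEW AS newRow
    FOR EACH ROW
    WHEN (newRow.a <= 10)
BEGIN
    -- In the body the alias takes a colon; in the WHEN clause above it does not.
    INSERT INTO T5 VALUES(:newRow.b, :newRow.a);
END trig1;
.
run;

INSERT INTO T4 VALUES(7, 'seven');   -- satisfies the WHEN condition, so trig1 fires
INSERT INTO T4 VALUES(42, 'big');    -- WHEN condition is false, T5 is unchanged
```

Note that the trigger modifies T5 rather than T4, in keeping with the restriction against modifying the relation whose change fired the trigger.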
Dropping Triggers
To drop a trigger:
drop trigger <trigger_name>;
Disabling Triggers
To disable or enable a trigger:
alter trigger <trigger_name> {disable|enable};
• A row-level trigger must not query or modify a mutating table. (Of course,
NEW and OLD still can be accessed by the trigger.)
• A statement-level trigger must not query or modify a mutating table if the
trigger is fired as the result of a CASCADE delete.
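Here a mutating table is one that is currently being modified by the triggering statement. The sketch below shows the first rule being broken; the table Acct and the trigger acctCheck are hypothetical. The trigger compiles, but the UPDATE that fires it is expected to fail with ORA-04091.

```sql
CREATE TABLE Acct (id INT PRIMARY KEY, balance NUMBER);

CREATE TRIGGER acctCheck
    AFTER UPDATE ON Acct
    FOR EACH ROW
DECLARE
    total NUMBER;
BEGIN
    -- Acct is mutating while the UPDATE is in progress,
    -- so a row-level trigger may not query it.
    SELECT SUM(balance) INTO total FROM Acct;
END acctCheck;
.
run;

INSERT INTO Acct VALUES(1, 100);
-- Fires acctCheck; expected to fail with ORA-04091 (table is mutating).
UPDATE Acct SET balance = 50 WHERE id = 1;
```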
This document was written originally by Yu-May Chang and Jeff Ullman for CS145 in Autumn, 1997; revised by Jun Yang for
Prof. Jennifer Widom's CS145 class in Spring, 1998; further revisions by Jun Yang, Spring 1999; further revisions by Jennifer
Widom, Spring 2000; minor revisions by Nathan Folkert, Spring 2001, Henry Hsieh, Autumn 2001, Antonios Hondroulis,
Spring 2002, and Glen Jeh, Spring 2002.
Introduction to Pro*C
Embedded SQL
• Overview
• Pro*C Syntax
○ SQL
○ Preprocessor Directives
○ Statement Labels
• Host Variables
○ Basics
○ Pointers
○ Structures
○ Arrays
○ Indicator Variables
○ Datatype Equivalencing
• Dynamic SQL
• Transactions
• Error Handling
○ SQLCA
○ WHENEVER Statement
• Demo Programs
• C++ Users
• List of Embedded SQL Statements Supported by Pro*C
Overview
Embedded SQL is a method of combining the computing power of a high-level
language like C/C++ and the database manipulation capabilities of SQL. It allows
you to execute any SQL statement from an application program. Oracle's embedded
SQL environment is called Pro*C.
A Pro*C program is compiled in two steps. First, the Pro*C precompiler recognizes the SQL
statements embedded in the program, and replaces them with appropriate calls to the functions in
the SQL runtime library. The output is pure C/C++ code with all the pure C/C++ portions intact.
Then, a regular C/C++ compiler is used to compile the code and produces the executable. For
details, see the section on Demo Programs.
Pro*C Syntax
SQL
All SQL statements need to start with EXEC SQL and end with a semicolon ";". You
can place the SQL statements anywhere within a C/C++ block, with the restriction
that the declarative statements do not come after the executable statements. As an
example:
{
int a;
/* ... */
EXEC SQL SELECT salary INTO :a
FROM Employee
WHERE SSN=876543210;
/* ... */
printf("The salary is %d\n", a);
/* ... */
}
Preprocessor Directives
The C/C++ preprocessor directives that work with Pro*C are #include and #if.
Pro*C does not recognize #define, so a macro name that appears inside an EXEC SQL
statement is not expanded and the statement is invalid.
Statement Labels
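You can use C/C++ statement labels in embedded SQL, for example as the target of a
WHENEVER ... GOTO <label> statement (see the section on the WHENEVER Statement).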
Host Variables
Basics
Host variables are the key to the communication between the host program and the
database. A host variable expression must resolve to an lvalue (i.e., it can be
assigned). You can declare host variables according to C syntax, as you declare
regular C variables. The host variable declarations can be placed wherever C
variable declarations can be placed. (C++ users need to use a declare section; see
the section on C++ Users.) The C datatypes that can be used with Oracle include:
• char
• char[n]
• int
• short
• long
• float
• double
• VARCHAR[n] - This is a pseudo-type recognized by the Pro*C precompiler. It is
used to represent blank-padded, variable-length strings. The Pro*C precompiler
will convert it into a structure with a 2-byte length field and an n-byte
character array.
You cannot use the register storage-class specifier for host variables.
A host variable reference must be prefixed with a colon ":" in SQL statements, but should not be
prefixed with a colon in C statements. When specifying a string literal via a host variable, the
single quotes must be omitted; Pro*C understands that you are specifying a string based on the
declared type of the host variable. C function calls and most of the pointer arithmetic expressions
cannot be used as host variable references even though they may indeed resolve to lvalues. The
following code illustrates both legal and illegal host variable references:
int deptnos[3] = { 000, 111, 222 };
int get_deptno() { return deptnos[2]; }
int *get_deptnoptr() { return &(deptnos[2]); }
int main() {
int x; char *y; int z;
/* ... */
EXEC SQL INSERT INTO emp(empno, ename, deptno)
VALUES(:x, :y, :z); /* LEGAL */
EXEC SQL INSERT INTO emp(empno, ename, deptno)
VALUES(:x + 1, /* LEGAL: the reference is to x */
'Big Shot', /* LEGAL: but not really a host var */
:deptnos[2]); /* LEGAL: array element is fine */
EXEC SQL INSERT INTO emp(empno, ename, deptno)
VALUES(:x, :y,
:(*(deptnos+2))); /* ILLEGAL: although it has an
lvalue */
EXEC SQL INSERT INTO emp(empno, ename, deptno)
VALUES(:x, :y,
:get_deptno()); /* ILLEGAL: no function calls */
EXEC SQL INSERT INTO emp(empno, ename, deptno)
VALUES(:x, :y,
:(*get_deptnoptr())); /* ILLEGAL: although it has an lvalue */
/* ... */
}
Pointers
You can define pointers using the regular C syntax, and use them in embedded SQL
statements. As usual, prefix them with a colon:
int *x;
/* ... */
EXEC SQL SELECT xyz INTO :x FROM ...;
The result of this SELECT statement will be written into *x, not x.
Structures
Structures can be used as host variables, as illustrated in the following example:
typedef struct {
char name[21]; /* one greater than column length; for '\0' */
int SSN;
} Emp;
/* ... */
Emp bigshot;
/* ... */
EXEC SQL INSERT INTO emp (ename, eSSN)
VALUES (:bigshot);
Arrays
Host arrays can be used in the following way:
int emp_number[50];
char name[50][11];
/* ... */
EXEC SQL INSERT INTO emp(emp_number, name)
VALUES (:emp_number, :name);
which will insert all the 50 tuples in one go.
Arrays can only be single dimensional. The example char name[50][11] would seem to
contradict that rule. However, Pro*C actually considers name a one-dimensional array of strings
rather than a two-dimensional array of characters. You can also have arrays of structures.
When using arrays to store the results of a query, if the size of the host array (say n) is smaller
than the actual number of tuples returned by the query, then only the first n result tuples will be
entered into the host array.
Indicator Variables
Indicator variables are essentially "NULL flags" attached to host variables. You can
associate every host variable with an optional indicator variable. An indicator
variable must be defined as a 2-byte integer (using the type short) and, in SQL
statements, must be prefixed by a colon and immediately follow its host variable.
Or, you may use the keyword INDICATOR in between the host variable and indicator
variable. Here is an example:
short indicator_var;
EXEC SQL SELECT xyz INTO :host_var:indicator_var
FROM ...;
/* ... */
EXEC SQL INSERT INTO R
VALUES(:host_var INDICATOR :indicator_var, ...);
You can use indicator variables in the INTO clause of a SELECT statement to detect
NULL's or truncated values in the output host variables. The values Oracle can assign
to an indicator variable have the following meanings:
-1 The column value is NULL, so the value of the host variable is indeterminate.
0 Oracle assigned an intact column value to the host variable.
>0 Oracle assigned a truncated column value to the host variable. The integer
returned by the indicator variable is the original length of the column value.
-2 Oracle assigned a truncated column value to the host variable, but the
original column value could not be determined.
You can also use indicator variables in the VALUES and SET clause of an INSERT or UPDATE
statement to assign NULL's to input host variables. The values your program can assign to an
indicator variable have the following meanings:
-1 Oracle will assign a NULL to the column, ignoring the value of the host variable.
>=0 Oracle will assign the value of the host variable to the column.
Datatype Equivalencing
Oracle recognizes two kinds of datatypes: internal and external. Internal datatypes
specify how Oracle stores column values in database tables. External datatypes
specify the formats used to store values in input and output host variables. At
precompile time, a default Oracle external datatype is assigned to each host
variable. Datatype equivalencing allows you to override this default equivalencing
and lets you control the way Oracle interprets the input data and formats the output
data.
The equivalencing can be done on a variable-by-variable basis using the VAR statement. The
syntax is:
EXEC SQL VAR <host_var> IS <type_name> [ (<length>) ];
For example, suppose you want to select employee names from the emp table, and
then pass them to a routine that expects C-style '\0'-terminated strings. You need
not explicitly '\0'-terminate the names yourself. Simply equivalence a host variable
to the STRING external datatype, as follows:
char emp_name[21];
EXEC SQL VAR emp_name IS STRING(21);
The length of the ename column in the emp table is 20 characters, so you allot
emp_name 21 characters to accommodate the '\0'-terminator. STRING is an Oracle
external datatype specifically designed to interface with C-style strings. When you
select a value from the ename column into emp_name, Oracle will automatically '\0'-
terminate the value for you.
You can also equivalence user-defined datatypes to Oracle external datatypes using the TYPE
statement. The syntax is:
EXEC SQL TYPE <user_type> IS <type_name> [ (<length>) ] [REFERENCE];
You can declare a user-defined type to be a pointer, either explicitly, as a pointer to
a scalar or structure, or implicitly as an array, and then use this type in a TYPE
statement. In these cases, you need to use the REFERENCE clause at the end of the
statement.
Dynamic SQL
While embedded SQL is fine for fixed applications, sometimes it is important for a
program to dynamically create entire SQL statements. With dynamic SQL, a
statement stored in a string variable can be issued. PREPARE turns a character string
into a SQL statement, and EXECUTE executes that statement.
If your program exits without calling EXEC SQL COMMIT, all database changes will be discarded.
Error Handling
After each executable SQL statement, your program can find the
status of execution either by explicit checking of SQLCA, or by implicit checking
using the WHENEVER statement. These two ways are covered in detail below.
SQLCA
SQLCA (SQL Communications Area) is used to detect errors and status changes in
your program. This structure contains components that are filled in by Oracle at
runtime after every executable SQL statement.
To use SQLCA you need to include the header file sqlca.h using the #include directive. In
case you need to include sqlca.h at many places, you need to first undefine the macro SQLCA
with #undef SQLCA. The relevant chunk of sqlca.h follows:
#ifndef SQLCA
#define SQLCA 1
struct sqlca {
/* ub1 */ char sqlcaid[8];
/* b4 */ long sqlabc;
/* b4 */ long sqlcode;
struct {
/* ub2 */ unsigned short sqlerrml;
/* ub1 */ char sqlerrmc[70];
} sqlerrm;
/* ub1 */ char sqlerrp[8];
/* b4 */ long sqlerrd[6];
/* ub1 */ char sqlwarn[8];
/* ub1 */ char sqlext[8];
};
/* ... */
The fields in sqlca have the following meaning:
sqlabc This integer component holds the length, in bytes, of the SQLCA structure.
sqlcode This integer component holds the status code of the most recently
executed SQL statement:
0 No error.
>0 The statement executed, but an exception was detected (for example, no data found).
<0 Oracle did not execute the statement because of an error.
sqlwarn This array of single characters has eight elements used as warning flags.
Oracle sets a flag by assigning to it the character 'W'.
SQLCA can only accommodate error messages up to 70 characters long in its sqlerrm
component. To get the full text of longer (or nested) error messages, you need the sqlglm()
function:
void sqlglm(char *msg_buf, size_t *buf_size, size_t *msg_length);
where msg_buf is the character buffer in which you want Oracle to store the error
message; buf_size specifies the size of msg_buf in bytes; Oracle stores the actual
length of the error message in *msg_length. The maximum length of an Oracle error
message is 512 bytes.
WHENEVER Statement
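This statement allows you to do automatic error checking and handling. The syntax
is:
EXEC SQL WHENEVER <condition> <action>;
where <condition> is SQLWARNING, SQLERROR, or NOT FOUND, and <action> is one of the following: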
• CONTINUE - Program will try to continue to run with the next statement if
possible
• DO - Program transfers control to an error handling function
• GOTO <label> - Program branches to a labeled statement
• STOP - Program exits with an exit() call, and uncommitted work is rolled back
Some examples of the WHENEVER statement:
EXEC SQL WHENEVER SQLWARNING DO print_warning_msg();
EXEC SQL WHENEVER NOT FOUND GOTO handle_empty;
Demo Programs
Note: The demo programs will create and use four tables named DEPT, EMP, PAY1,
and PAY2. Be careful if any table in your database happens to have the same name!
You should take a look at the sample source code before running it. The comments at the top
describe what the program does. For example, sample1 takes an employee's EMPNO and retrieves
the name, salary, and commission for that employee from the table EMP.
You are supposed to study the sample source code and learn the following:
• How to connect to Oracle from the host program
• How to embed SQL in C/C++
• How to use cursors
• How to use host variables to communicate with the database
• How to use WHENEVER to take different actions on error messages.
• How to use indicator variables to detect NULL's in the output
Now, you can use these techniques to code your own database application program.
And have fun!
C++ Users
To get the precompiler to generate appropriate C++ code, you need to be aware of
the following issues:
• Code emission by precompiler. To get C++ code, you need to set the option
CODE=CPP while executing proc. C users need not worry about this option; the
default caters to their needs.
• Parsing capability. The PARSE option of proc may take the following values:
○ PARSE=NONE. C preprocessor directives are understood only inside a
declare section, and all host variables need to be declared inside a
declare section.
○ PARSE=PARTIAL. C preprocessor directives are understood; however, all
host variables need to be declared inside a declare section.
○ PARSE=FULL. C preprocessor directives are understood and host
variables can be declared anywhere. This is the default when CODE is
anything other than CPP; it is an error to specify PARSE=FULL with
CODE=CPP.
So, C++ users must specify PARSE=NONE or PARSE=PARTIAL. They therefore lose
the freedom to declare host variables anywhere in the code. Rather, the host
variables must be encapsulated in declare sections as follows:
EXEC SQL BEGIN DECLARE SECTION;
// declarations...
EXEC SQL END DECLARE SECTION;
You need to follow this routine for declaring the host and indicator variables
at all the places you do so.
This document was written originally by Ankur Jain and Jeff Ullman for CS145, Autumn 1997; revised by Jun Yang for Prof.
Jennifer Widom's CS145 class in Spring, 1998; further revisions by Roy Goldman for Prof. Jeff Ullman's CS145 class in
Autumn, 1999; further revisions by Calvin Yang for Prof. Jennifer Widom's CS145 class in Spring, 2002.
Introduction to JDBC
This document illustrates the basics of the JDBC (Java Database Connectivity) API
(Application Program Interface). Here, you will learn to use the basic JDBC API to
create tables, insert values, query tables, retrieve results, update tables, create
prepared statements, perform transactions and catch exceptions and errors.
This document draws from the official Sun tutorial on JDBC Basics.
• Overview
• Establishing a Connection
• Creating a JDBC Statement
• Creating a JDBC PreparedStatement
• Executing CREATE/INSERT/UPDATE Statements
• Executing SELECT Statements
• Notes on Accessing ResultSet
• Transactions
• Handling Errors with Exceptions
• Sample Code and Compilation Instructions
Overview
Call-level interfaces such as JDBC are programming interfaces allowing external
access to SQL database manipulation and update commands. They allow the
integration of SQL calls into a general programming environment by providing
library routines which interface with the database. In particular, Java based JDBC
has a rich collection of routines which make such an interface extremely simple and
intuitive.
Here is an easy way of visualizing what happens in a call level interface: You are writing a
normal Java program. Somewhere in the program, you need to interact with a database. Using
standard library routines, you open a connection to the database. You then use JDBC to send
your SQL code to the database, and process the results that are returned. When you are done, you
close the connection.
Such an approach has to be contrasted with the precompilation route taken with Embedded SQL.
The latter has a precompilation step, where the embedded SQL code is converted to the host
language code (C/C++). Call-level interfaces do not require precompilation and thus avoid some
of the problems of Embedded SQL. The result is increased portability and a cleaner client-server
relationship.
Establishing A Connection
The first thing to do, of course, is to install Java, JDBC and the DBMS on your
working machines. Since we want to interface with an Oracle database, we would
need a driver for this specific database as well. Fortunately, we have a responsible
administrator who has already done all this for us on the Leland machines.
As we said earlier, before a database can be accessed, a connection must be opened between our
program (client) and the database (server). This involves two steps:
• Load the vendor specific driver
Why would we need this step? To ensure portability and code reuse, the API was
designed to be as independent of the version or the vendor of a database as possible.
Since different DBMS's have different behavior, we need to tell the driver manager
which DBMS we wish to use, so that it can invoke the correct driver.
An Oracle driver is loaded using the following code snippet:
Class.forName("oracle.jdbc.driver.OracleDriver");
• Make the connection
Once the driver is loaded and ready for a connection to be made, you may create an
instance of a Connection object using:
Connection con = DriverManager.getConnection(
"jdbc:oracle:thin:@dbaprod1:1544:SHR1_PRD", username, passwd);
Okay, let's see what this jargon is. The first string is the URL for the database including
the protocol (jdbc), the vendor (oracle), the driver (thin), the server (dbaprod1), the port
number (1544), and a server instance (SHR1_PRD). The username and passwd are your
username and password, the same as you would enter into SQLPLUS to access your
account.
That's it! The connection returned in the last step is an open connection which we will use to
pass SQL statements to the database. In this code snippet, con is an open connection, and we will
use it below. Note: The values mentioned above are valid for our (Leland) environment. They
would have different values in other environments.
Transactions
JDBC allows SQL statements to be grouped together into a single transaction. Thus, we can
ensure the ACID (Atomicity, Consistency, Isolation, Durability) properties using JDBC
transactional features.
Transaction control is performed by the Connection object. When a connection is created, by
default it is in the auto-commit mode. This means that each individual SQL statement is treated
as a transaction by itself, and will be committed as soon as its execution has finished. (This is not
exactly precise, but we can gloss over this subtlety for most purposes).
We can turn off auto-commit mode for an active connection with :
con.setAutoCommit(false) ;
and turn it on again with :
con.setAutoCommit(true) ;
Once auto-commit is off, no SQL statements will be committed (that is, the database will not be
permanently updated) until you have explicitly told it to commit by invoking the commit()
method:
con.commit() ;
At any point before commit, we may invoke rollback() to roll back the transaction,
and restore values to the last commit point (before the attempted updates).
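Handling Errors with Exceptions
Errors raised by the database reach the program as an SQLException, which can be caught
like any other Java exception:
try {
con.setAutoCommit(false) ;
// stmt is a Statement created from con; the table name Sells is illustrative
stmt.executeUpdate("CREATE TABLE Sells (bar VARCHAR2(40), beer VARHAR2(40), price REAL)") ;
con.commit() ;
}catch(SQLException ex) {
System.err.println("SQLException: " + ex.getMessage()) ;
con.rollback() ;
con.setAutoCommit(true) ;
}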
In this case, an exception is thrown because beer is declared with type VARHAR2, a misspelling
of VARCHAR2. Since there is no such data type in our DBMS, an SQLException is thrown. The
output in this case would be:
SQLException: ORA-00902: invalid datatype
Alternatively, if your datatypes were correct, an exception might be thrown in case
your database size goes over space quota and is unable to construct a new table.
SQLWarnings can be retrieved from Connection objects, Statement objects, and
ResultSet objects with the getWarnings() method. Each only stores the most recent
SQLWarning. So if you execute another statement through your Statement object, for
instance, any earlier warnings will be discarded.
We have a few more pieces of sample code written by Craig Jurney at ITSS for educational
purposes. Feel free to use sample code as a guideline or even a skeleton for code that you write
in the future, but make a note that you were basing your solution on provided code.
SQLBuilder.java - Creation of a Relation
SQLLoader.java - Insertion of Tuples
SQLRunner.java - Processes Queries
SQLUpdater.java - Updating Tuples
SQLBatchUpdater.java - Batch Updating
SQLUtil.java - JDBC Utility Functions
Don't forget to use source /usr/class/cs145/all.env, which will correctly set your
classpath. By adding this to your global classpath you simplify commands. For example, you can
say:
elaine19:~$ javac SQLBuilder.java
elaine19:~$ java SQLBuilder
instead of:
elaine19:~$ javac SQLBuilder.java
elaine19:~$ java -classpath
/usr/pubsw/apps/oracle/8.1.5/jdbc/lib/classes111.zip:. SQLBuilder
There are static final values in each of the .java files for USERNAME and PASSWORD.
These must be changed to your own username and your own password so that you
can access the database.
This document was written originally by Nathan Folkert for Prof. Jennifer Widom's CS145 class, Spring 2000.
Subsequently, it was hacked by Mayank Bawa for Prof. Jeff Ullman's CS145 class, Fall 2000. Jim Zhuang made
a minor update for Summer 2005. Thanks to Matt Laue for typo correction.
Defining Types
Oracle allows us to define types similar to the types of SQL. The syntax is
CREATE TYPE t AS OBJECT (
list of attributes and methods
);
/
• Note the slash at the end, needed to get Oracle to process the type definition.
For example here is a definition of a point type consisting of two numbers:
CREATE TYPE PointType AS OBJECT (
x NUMBER,
y NUMBER
);
/
An object type can be used like any other type in further declarations of object-types or table-
types. For instance, we might define a line type by:
CREATE TYPE LineType AS OBJECT (
end1 PointType,
end2 PointType
);
/
Then, we could create a relation that is a set of lines with "line ID's" as:
CREATE TABLE Lines (
lineID INT,
line LineType
);
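To put a tuple into Lines, the object-valued column has to be built with type constructors, which carry the same names as the types. The following is a small sketch under that assumption; the line ID 27 and the coordinates are made-up values.

```sql
-- Build a LineType value from two PointType values.
INSERT INTO Lines
VALUES(27, LineType(PointType(0.0, 0.0), PointType(3.0, 4.0)));

-- Components are reached with dot notation; Oracle requires a table alias (ll) here.
SELECT ll.lineID, ll.line.end1.x, ll.line.end2.y
FROM Lines ll;
```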
Dropping Types
To get rid of a type such as LineType, we say:
DROP TYPE LineType;
However, before dropping a type, we must first drop all tables and other types that use this type.
Thus, the above would fail because table Lines still exists and uses LineType.
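A drop order that respects these dependencies can be sketched as follows, assuming that no tables or types other than the ones defined so far (Lines, LineType, PointType) depend on the types being dropped.

```sql
DROP TABLE Lines;      -- the table uses LineType, so it goes first
DROP TYPE LineType;    -- LineType uses PointType
DROP TYPE PointType;   -- nothing depends on PointType any more
```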
References as a Type
For every type t, REF t is the type of references (object ID's if you will) to values of type t. This
type can be used in places where a type is called for. For instance, we could create a relation
Lines2 whose tuples were pairs of references to points:
CREATE TABLE Lines2 (
end1 REF PointType,
end2 REF PointType
);
We can use REF to create references from actual values. For example, suppose we have a relation
Points whose tuples are objects of type PointType. That is, Points is declared by:
CREATE TABLE Points OF PointType;
We could make Lines2 be the set of all lines between pairs of these points that go from left to
right (i.e., the x-value of the first is less than the x-value of the second) by:
INSERT INTO Lines2
SELECT REF(pp), REF(qq)
FROM Points pp, Points qq
WHERE pp.x < qq.x;
There are several important prohibitions, where you might imagine you could arrange for a
reference to an object, but you cannot.
• The points referred to must be tuples of a relation of type PointType, such as Points
above. They cannot be objects appearing in some column of another relation.
• It is not permissible to invent an object outside of any relation and try to make a reference
to it. For instance, we could not insert into Lines2 a tuple with contrived references such
as VALUES(REF(PointType(1,2)), REF(PointType(3,4))), even though the types of
things are right. The problem is that the points such as PointType(1,2) don't "live" in
any relation.
To follow a reference, we use the dot notation, as if the attribute of reference type were really the
same as the value referred to. For instance, this query gets the x-coordinates of the ends of all the
lines in Lines2.
SELECT ll.end1.x, ll.end2.x
FROM Lines2 ll;
Nested Tables
A more powerful use of object types in Oracle is the fact that the type of a column can be a table-
type. That is, the value of an attribute in one tuple can be an entire relation, as suggested by the
picture below, where a relation with schema (a,b) has b-values that are relations with schema
(x,y,z).
[Figure: a relation with columns a and b. Each tuple's b-value is itself a small relation with columns x, y, and z.]
In order to have a relation as a type of some attribute, we first have to define a type using the AS
TABLE OF clause. For instance:
CREATE TYPE PolygonType AS TABLE OF PointType;
/
says that the type PolygonType is a relation whose tuples are of type PointType; i.e., they have
two components, x and y, which are real numbers.
Now, we can declare a relation one of whose columns has values that represent polygons; i.e.,
they are sets of points. A possible declaration, in which polygons are represented by a name and
a set of points is:
CREATE TABLE Polygons (
name VARCHAR2(20),
points PolygonType)
NESTED TABLE points STORE AS PointsTable;
The "tiny" relations that represent individual polygons are not stored directly as values of the
points attribute. Rather, they are stored in a single table, whose name must be declared
(although we cannot refer to it in any way). We see this declaration following the parenthesized
list of attributes for the table; the name PointsTable was chosen to store the relations of type
PolygonType.
• Be careful to get the punctuation right. There is one semicolon ending the CREATE TABLE
statement, and it goes after both the parenthesized list of attributes and the NESTED TABLE
clause.
When we insert into a relation like Polygons that has one or more columns that are of nested-
relation type, we use the type constructor for the nested-relation type (PolygonType in our
example) to surround the value of one of these nested relations. The value of the nested relation
is represented by a list of values of the appropriate type; in our example that type is PointType
and is represented by the type constructor of the same name.
Here is a statement inserting a polygon named "square" that consists of four points, the corners
of the unit square.
INSERT INTO Polygons VALUES(
'square', PolygonType(PointType(0.0, 0.0), PointType(0.0, 1.0),
PointType(1.0, 0.0), PointType(1.0, 1.0)
)
);
We can obtain the points of this square by a query such as:
SELECT points
FROM Polygons
WHERE name = 'square';
It is also possible to get a particular nested relation into the FROM clause by use of the keyword
THE, applied to a subquery whose result is a relation; the above query is an example, since it
returns a whole nested relation. For instance, the following query finds those points of the
polygon named square that are on the main diagonal (i.e., x=y).
SELECT ss.x
FROM THE(SELECT points
FROM Polygons
WHERE name = 'square'
) ss
WHERE ss.x = ss.y;
In this query, the nested relation is given an alias ss, which is used in the SELECT and WHERE
clauses as if it were any ordinary relation.
This document was written originally by Jeff Ullman for CS145 in the Autumn of 1998. Special thanks to Ian Mizrahi for the detective work on the
COLUMN_VALUE feature.
Web-Database Programming: CGI and Java Servlets
NOTE: This document assumes a basic knowledge of HTML. We will not be providing
documentation for HTML coding apart from the creation of forms. There are dozens
of tutorials available online. You might check out the NCSA Beginner's Guide to
HTML.
• Overview
• Retrieving Input from the User
○ Forms
○ Server-Side Input Handling - CGI
○ Server-Side Input Handling - Java
• Returning Output to the User
○ CGI Output
○ Java Output
• Sample Code and Coding Tips
○ CGI Sample Code
○ CGI Setup
○ CGI Debugging
○ Java Sample Code
○ Java Compilation in Unix
○ Servlet Setup
○ Handling Special Characters
Overview
CGI or Common Gateway Interface is a means for providing server-side services over the web
by dynamically producing HTML documents, other kinds of documents, or performing other
computations in response to communication from the user. In this assignment, students who want
to interface with the Oracle database using Oracle's Pro*C precompiled language will be using
CGI.
Java Servlets are the Java solution for providing web-based services. They provide a very similar
interface for interacting with client queries and providing server responses. As such, discussion
of much of the input and output in terms of HTML will overlap. Students who plan to interface
with Oracle using JDBC will be working with Java Servlets.
Both CGI and Java Servlets interact with the user through HTML forms. CGI programs reside in
a special directory, or in our case, a special computer on the network (cgi-courses.stanford.edu),
and provide service through a regular web server. Java Servlets are a separate kind of network
object altogether, and you'll have to run a special Servlet program on a specific port on a Unix machine.
Forms
Forms are designated within an HTML document by the fill-out form tag:
<FORM METHOD = "POST" ACTION = "http://form.url.com/cgi-bin/cgiprogram">
... Contents of the form ...
</FORM>
The URL given after ACTION is the URL of the CGI program (your program). The METHOD is the
means of transferring data from the form to the CGI program. In this example, we have used the
"POST" method, which is the recommended method. There is another method called "GET", but
there are common problems associated with this method. Both will be discussed in the next
section.
Within the form you may have anything except another form. The tags used to create user
interface objects are INPUT, SELECT, and TEXTAREA.
The INPUT tag specifies a simple input interface:
<INPUT TYPE="text" NAME="thisinput" VALUE="default" SIZE=10 MAXLENGTH=20>
The different attributes are mostly self-explanatory. The TYPE is the variety of input object that
you are presenting. Valid types include "text", "password", "checkbox", "radio", "submit",
"reset", and "hidden". Every input but "submit" and "reset" has a NAME which will be associated
with the value returned in the input to the CGI program. This will not be visible to the user
(unless they read the HTML source). The other fields will be explained with the types:
• "text" - refers to a simple text entry field. The VALUE refers to the default text
within the text field, the SIZE represents the visual length of the field, and the
MAXLENGTH indicates the maximum number of characters the textfield will
allow. There are defaults to all of these (nothing, 20, unlimited).
• "password" - the same as a normal text entry field, but characters entered
are obscured.
• "checkbox" - refers to a toggle button that is independently either on or off.
The VALUE refers to the string sent to the CGI server when the button is
checked (unchecked boxes are disregarded). The default value is "on".
• "radio" - refers to a toggle button that may be grouped with other toggle
buttons such that only one in the group can be on. It's essentially the same
as the checkbox, but any radio button with the same NAME attribute will be
grouped with this one.
• "submit" and "reset" - these are the pushbuttons on the bottom of most
forms you'll see that submit the form or clear it. These are not required to
have a NAME, and the VALUE refers to the label on the button. The default
labels are "Submit Query" and "Reset" respectively.
• "hidden" - this input is invisible as far as the user interface is concerned
(though don't be fooled into thinking this is some kind of security feature --
it's easy to find "hidden" fields by perusing a document source or examining
the URL for a GET method). It simply creates an attribute/value binding
without need for user action that gets passed transparently along when the
form is submitted.
The second type of interface is the SELECT interface, which includes popup menus and scrolled
lists. Here is an example of a popup menu:
<SELECT NAME="menu">
<OPTION>option 1
<OPTION>option 2
<OPTION>option 3
<OPTION SELECTED>option 4
<OPTION>option 5
<OPTION>option 6
<OPTION>option 7
</SELECT>
The SIZE attribute determines whether it is a menu or a scrolled list. If it is 1 or it is absent, the
default is a popup menu. If it is greater than 1, then you will see a scrolled list with SIZE
elements. The MULTIPLE option, which forces the select to be a scrolled list, signifies that
more than one value may be selected (by default only one value can be selected in a scrolled list).
OPTION is more or less self-explanatory -- it gives the names and values of each field in the menu
or scrolled list, and you can specify which are SELECTED by default.
The final type of interface is the TEXTAREA interface:
<TEXTAREA NAME="area" ROWS=5 COLS=30>
Mary had a little lamb.
A little lamb?
A little lamb!
Mary had a little lamb.
Its fleece was white as snow.
</TEXTAREA>
As usual, the NAME is the symbolic reference to which the input will be bound when submitted to
the CGI program. The ROWS and COLS values are the visible size of the field. Any number of
characters can be entered into a text area.
The default text of the text area is entered between the tags. Whitespace is supposedly respected
(as between <PRE> HTML tags), including the newline after the first tag and before the last tag.
Server-Side Input Handling -- CGI
The form contents will be assembled into an encoded query string. Using the GET
method, this string is available in the environment variable QUERY_STRING. It is
actually passed to the program through the URL -- examine the URL for the first of
the forms above:
http://asdf.asdf.asdf/asdf?thisinput=default&thisbox=on&radio1=2
Everything after the '?' is the query string. You'll see that a number of expressions
appear concatenated by & symbols -- each expression assigns a string value to
each form object. In this case, the text field named "thisinput" has the value
"default", which is what was typed into the field, the checkbox "thisbox" has the
value "on", and the radio button group "radio1" has the value "2" (the second
button is checked -- note that this is the value I gave it, not a default value. The
default is "on").
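With the POST method, the query string is instead sent to the program on standard input, and the
environment variable CONTENT_LENGTH gives its length in bytes:
value = getenv("CONTENT_LENGTH");
sscanf(value, "%d", &length);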
Decoding the data is thus just a question of walking through the input and picking out the values.
These values can then be used to determine what the user wants to see.
We have written a very simple, linear-search-based mechanism for parsing the input string. It
is located, as mentioned above, in cgiparse.c. You might want to cut and paste the functions into
your own code or use the .h file provided. You can use them in your CGI programs by calling
Initialize() at the beginning of your code, and then calling GetFirstValue(key) and
GetNextValue(key) to retrieve the bindings for each of the FORM parameters. See the
comments in the file for more details.
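To make "walking through the input and picking out the values" concrete, here is a minimal sketch of a linear-search lookup. It is not the code in cgiparse.c; the function name query_lookup and its signature are hypothetical, chosen only for illustration. It finds the first binding of a key in a query string and decodes the two URL escapes used by forms ('+' for a space, %XX for any other byte).

```c
#include <ctype.h>
#include <string.h>

/* Convert one hexadecimal digit to its value; the caller has already
   checked the character with isxdigit(). */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Hypothetical helper (not part of cgiparse.c): find the first binding of
   `key` in a query string such as "a=1&b=x%26y" and copy its decoded value
   into `out`, truncating to fit. Returns 1 if the key was found, else 0. */
int query_lookup(const char *query, const char *key, char *out, size_t outsize) {
    size_t keylen = strlen(key);
    const char *p = query;
    if (outsize == 0) return 0;
    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, '&');      /* end of this name=value pair */
        size_t pairlen = end ? (size_t)(end - p) : strlen(p);
        if (pairlen > keylen && strncmp(p, key, keylen) == 0 && p[keylen] == '=') {
            const char *v = p + keylen + 1;    /* first character of the value */
            const char *vend = p + pairlen;
            size_t n = 0;
            while (v < vend && n + 1 < outsize) {
                if (*v == '+') {               /* '+' encodes a space */
                    out[n++] = ' ';
                    v++;
                } else if (*v == '%' && vend - v >= 3 &&
                           isxdigit((unsigned char)v[1]) &&
                           isxdigit((unsigned char)v[2])) {
                    out[n++] = (char)(hex_value(v[1]) * 16 + hex_value(v[2]));
                    v += 3;                    /* skip the whole %XX escape */
                } else {
                    out[n++] = *v++;
                }
            }
            out[n] = '\0';
            return 1;
        }
        p = end ? end + 1 : NULL;              /* move on to the next pair */
    }
    return 0;
}
```

With the GET method you would pass getenv("QUERY_STRING") as the first argument; with POST you would first read CONTENT_LENGTH bytes from standard input into a buffer and pass that instead.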
Server-Side Input Handling -- Java
Java handles GET and POST slightly differently. The parsing of the input is done for you by Java,
so you are separated from the actual format of the input data completely. Your program will be
an object subclassed off of HttpServlet, the generalized Java Servlet class for handling web
services.
Servlet programs must override the doGet() or doPost() messages, which are methods that are
executed in response to the client. There are two arguments to these methods,
HttpServletRequest request and HttpServletResponse response. Let's take a look at a
very simple servlet program, the traditional HelloWorld (this time with a doGet method):
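import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;
public class HelloWorld extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<body>");
        out.println("<h1>Hello World!</h1>");
        // "param" is the NAME of a form input; null means it was not sent.
        String param = request.getParameter("param");
        if (param != null)
            out.println("Thanks for the lovely param='" + param + "' binding.");
        out.println("</body>");
        out.println("</html>");
    }
}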
We'll discuss points in this code again in the section on Java Output, but for now, we will focus
on the input side. The argument HttpServletRequest request represents the client request,
and the values of the parameters passed from the HTML FORM can be retrieved by calling the
HttpServletRequest getParameter method. This method takes as its argument the name of
the parameter (the name of the HTML INPUT object), and returns as a Java String the value
assigned to the parameter. In cases where the parameter may have multiple bindings, the method
getParameterValues can be used to retrieve the values in an array of Java Strings -- note that
getParameter will return the first value of this array. It is through these mechanisms that you
can retrieve any of the values entered or implicit in the form.
As might be inferred from the example above, Java returns null if the parameter whose name
you request does not have a value. Recall that unchecked buttons' bindings are not passed in a
POST message -- you can check for null to determine when buttons are off.
CGI Output
The only work you have to do apart from constructing an HTML document on the fly with the
output from the query is to add a short header at the top of the file. Your header will represent
the MIME type for HTML, and consists of a single line of text followed by a blank line:
content-type: text/html
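As a minimal sketch of the header rule above, the following C function writes the content-type line, the blank line that ends the header, and then a small HTML page. The function name write_html_response and the page layout are assumptions made for illustration, and the title and body strings are assumed to be HTML-safe already.

```c
#include <stdio.h>

/* Hypothetical helper: write a complete CGI response to `out`.
   The header is a single line followed by a blank line; everything
   after that is the HTML document itself.
   Returns 0 on success, -1 if any write fails. */
int write_html_response(FILE *out, const char *title, const char *body) {
    /* "\n\n" ends the header line and supplies the required blank line. */
    if (fprintf(out, "content-type: text/html\n\n") < 0) return -1;
    if (fprintf(out, "<html><head><title>%s</title></head>\n", title) < 0) return -1;
    if (fprintf(out, "<body>%s</body></html>\n", body) < 0) return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

A CGI program would call this with stdout as the first argument, since the web server sends the program's standard output back to the browser.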
Java Output
Let's look back at our Java code example. You'll see a number of differences between the Servlet
code and the CGI approach. Output is all handled by the HttpServletResponse object, which
allows you to set the content type through the setContentType method. Instead of printing the
HTTP header yourself, you tell the HttpServletResponse object that you want the content type
to be "text/html" explicitly.
All HTML is returned to the user through a PrintWriter object, that is retrieved from the
response object using the getWriter method. HTML code is then returned line by line using
the println method.
We assume that you all have a basic background in Java, so we won't provide a detailed treatment
of exceptions here, but do note that IOException and ServletException both must either be
handled or thrown.
CGI Setup
Your CGI script will be run from cgi-courses.stanford.edu. The URL for your CGI executable
will be: http://cgi-courses.stanford.edu/~username/cgi-bin/scriptname
You will need to perform the following actions before a CGI program will run:
• Get an account on cgi-courses.stanford.edu.
• Create a directory in your home folder to hold your cgi binary executables:
mkdir cgi-bin
• Set access levels on your cgi-bin directory for your cgi-courses.stanford.edu
account ("username" should be replaced with your username):
fs setacl cgi-bin username.cgi write
• Make sure that your executables correctly set all environment variables
(normally this is done by /usr/class/cs145/all.env, but this is not available on
the cgi machine, so you have to do it explicitly). Here is an example function
that you should run before attempting to connect to the database (this
function is in C, but you can pretty much just lift the settings and paste them
into Perl or PHP, which also need them to connect to the database):
#include <stdlib.h>
void SetEnvs(void) {
    putenv("ORACLE_SID=SHR1_PRD");
    putenv("ORACLE_HOME=/usr/pubsw/apps/oracle/8.1.7");
    putenv("ORACLE_TERM=xsun5");
    putenv("TNS_ADMIN=/usr/class/cs145/sqlnet");
    putenv("TWO_TASK=SHR1_PRD");
}
• Move the executable into your cgi-bin folder.
• Change the permissions on your cgi executable:
chmod 701 scriptname
• Use HTML forms to access your new program at
http://cgi-courses.stanford.edu/~username/cgi-bin/scriptname.
Here is the homepage of the leland CGI service, which has a FAQ and gives some information
about the capabilities of the system. Please check here first if your CGI programs are giving you
errors.
CGI Debugging
Due to popular demand, a new cgi debugging feature was just added to the cgi service. It's not in
the leland CGI docs yet. If you access your script like so:
http://cgi-courses/cgi-bin/sboxd/~username/scriptname
The script will execute with extra debug info:
• All STDERR goes to the browser
• A header is included, so lack of any output or lack of Content Type will not
cause Internal Server Error.
If you are still receiving an Internal Server Error, consult the cgi FAQ or look in the server log:
http://cgi-courses/logs/error_log.
Note that the log shows only the several most recent entries, due to system issues.
An alternative method is to run your cgi program from command-line, without using the web
browser. Put your CGI input into the environment variable QUERY_STRING and run your
program. For example (assuming your program is called cgiprog and expects two parameters
name1 and name2):
cd ~/cgi-bin
setenv QUERY_STRING 'name1=abc&name2=def'
cgiprog
Note: If you want to use debugging tools such as dbx or gdb, you need to modify Makefile to
add the flag -g after cc or g++.
You also have to set up a specific directory structure to provide Servlets. The directory structure
required by Servlets is essentially:
[anydir]
[servletdir]
webpages
WEB-INF
servlets
A shell script to build this hierarchy is provided at
/afs/ir/class/cs145/code/bin/buildServletDirectory. After you run source
/afs/ir/class/cs145/all.env (which you should probably just add to your .cshrc file), you
can run the script by typing buildServletDirectory.
You can store .html documents in your webpages directory, and they will be accessible at your
Servlet address (see below), while all Servlets you write have to be located in the servlets
directory to be recognized.
Further information on the Java Servlet API can be found at the Servlet Package Documentation
page.
Servlet Setup
The directory structure for your servlets and HTML documents was outlined in the previous
section. Static HTML documents may be placed in the webpages directory and are accessible
from the web at the address http://machinexx:portnum/page.html, where machinexx refers
to the machine from which you're running the webserver (e.g. elaine12, saga22, myth7, etc.),
portnum is a specific port (see below), and page.html is the name of the HTML page that you
are serving. You may find it useful to create a static HTML document or a hierarchy of static
documents to serve as the jumping off point for your Servlets, where your HTML FORMs that
start the interaction with the database are found.
Servlets will be found in the directory servletdir/webpages/WEB-INF/servlets, and will just be
the .class files that you compile from your .java files using javac. These may be reached on the
web using the URL http://machinexx:portnum/servlet/servletname. Note that the servlet
directory is singular in the URL but plural in Unix, while the Servlet itself loses its .class in the
URL. HTML and other documents contained in the servlets directory cannot be accessed over
the web.
Once you have your directory set up and your Servlets compiled, you have to run the Java JSDK
2.1 webserver manually on a specific leland machine in order to provide these documents over
the web. The steps involved in starting the server are as follows:
• Choose a port number in the range 5000-65000. This will bind your server
application to that port for the machine on which you're running your server.
Try to choose a random number and remember it -- you will be the only
person on that machine who can use that port, and you will need it to have
access over the web.
• From the root of your servlet directory (if you run our buildServletDirectory
script, then it will be called servletdir), start the server by calling startserver
-port portnum from the Unix command line, where portnum is the port
number you chose above. The server will begin in the background, and you
can see it using the ps command. If you do not enter a port number, the
default port number, 8080, will be chosen for you (you can actually set the
default yourself -- after you've run the server once, it will create a
configuration file called "default.cfg" for you -- it finds the default port
number here).
• From your browser, enter the URL of a webpage or servlet contained in your
servletdir hierarchy using the address structure mentioned above. Now you
can play with your interface.
• If you would like to stop the server, issue the command stopserver.
• If you want to recompile your servlets, you have to stop the server and
restart it again. Static HTML pages that you are hosting from the webpages
directory, however, can be changed at will.
http://cgi-courses.stanford.edu/~username/cgi-bin/cgiprog?p1=3&p2=M%26M
Be careful not to confuse the escape strings for HTML text with those for URL's.
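The URL above carries the value M&M as M%26M, because a bare & would be read as the separator between parameters. As a hedged sketch of that escaping (the function name url_escape is hypothetical, and the set of characters left unescaped is a conservative choice rather than anything this course prescribes), a C helper could look like this. Note that the HTML escape for the same character is &amp;, which is not valid inside a URL parameter.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, replacing every character that
   is not a letter, digit, '-', '_', '.', or '~' with %XX, where XX is the
   character's code as two uppercase hexadecimal digits.
   Returns 0 on success, or -1 if `out` (of size `outsize`) is too small. */
int url_escape(const char *in, char *out, size_t outsize) {
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            if (n + 1 >= outsize) return -1;   /* keep room for the final '\0' */
            out[n++] = (char)c;
        } else {
            if (n + 3 >= outsize) return -1;   /* "%XX" plus the final '\0' */
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    if (n >= outsize) return -1;
    out[n] = '\0';
    return 0;
}
```

You would apply this to each parameter value separately, and only then join the name=value pairs with & to build the part of the URL after the '?'.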
This document was written by Nathan Folkert (with help from Vincent Chu) for Prof. Jennifer Widom's CS145 class in Spring
2000; revised by Calvin Yang for Prof. Widom's CS145 class in Spring 2002.
DBMS_OUTPUT is very useful for debugging PL/SQL programs. However, if you print too much,
the output buffer will overflow (the default buffer size is 2KB). In that case, you can set the
buffer size to a larger value, e.g.:
BEGIN
DBMS_OUTPUT.ENABLE(10000);
NULL;
END;
.
RUN;
What is the correct syntax for ordering query results by row-type objects?
As a concrete example, suppose we have defined an object type PersonType with an
ORDER MEMBER FUNCTION, and we have created a table Person of PersonType objects.
Suppose you want to list all PersonType objects in Person in order. You'd probably
expect the following to work:
As a general precautionary measure, please be sure to test your queries under sqlplus prompt
before running them through CGI or JDBC. It is much easier to kill a query in sqlplus than in
CGI or JDBC. If your test query takes a long time to run under sqlplus, you can simply hit Ctrl-
C to terminate it.
Never close an ssh or telnet or xterm window without properly logging out. Always quit your
programs (including sqlplus), stop Java servlets, and type "exit" or "logout" to quit. If you force-
close your ssh/telnet/xterm window, there may still be processes running in the background, and
you may be taking up system resources without knowing it.
If, for some reason, you cannot log out normally (for example, the system is not responding), you
should open another window, log in to the same machine where you have the problem, and kill
the processes that are causing trouble:
Type "ps -aef | grep [username]" to find the Process IDs of your processes (replace
[username] with your leland user name), and kill the processes you want to
terminate using "kill [processID]". Always use the "kill" command without the -9
flag first. Use -9 flag only if you cannot kill it otherwise.
If you closed the window by mistake and do not remember which Sweet Hall
machine you were logged into, open another window immediately and log into any
Sweet Hall machine, then type "sweetfinger [username]" (replace [username] with
your actual leland user name). It will give you the names of the machines you were
on a few minutes ago. Then, log in to the appropriate machine and kill your processes
there.
If you issued a query through JDBC that is taking a long time to execute and you want to kill it,
you should stop your Java servlet. In most cases this will kill the query. You can also use the
setQueryTimeout([time in seconds]) method on a statement object to stop queries that run too
long.
If you issued a query through CGI that is taking a long time to execute, normally the CGI service
will kill it for you within 10 seconds. However, the above occasionally fails to work, and we do
not know of any better way of killing runaway queries issued by JDBC or CGI (other than asking
the administrator to kill them for you). That's why we ask you to always test your queries under
sqlplus first. It is much easier to kill queries there.
This document was written originally by Jun Yang for CS145 in Spring, 1999. Additions by Antonios Hondroulis and Calvin
Yang in Spring, 2002
Rarely is the full behavior of the NULL value in SQL taught or described in detail, and with
good reason: Some of the SQL rules surrounding NULL can be surprising or unintuitive.
Unfortunately, if you have to deal with NULL in real databases, the results can be downright
frustrating. The SQLite project, for example, uses trial and error to determine how a database
behaves in the presence of NULL values.
Fortunately, Date and Darwen's A Guide to the SQL Standard (fourth edition) [1] describes
SQL's rules concerning NULL in good detail.
NULL Basics
Intuitively, NULL approximately represents an unknown value.
• An arithmetic operation involving a NULL returns NULL. For example, NULL
minus NULL yields NULL, not zero. [2]
• A boolean comparison between two values involving a NULL returns neither
true nor false, but unknown in SQL's three-valued logic. [3] For example,
neither NULL equals NULL nor NULL not-equals NULL is true. Testing whether
a value is NULL requires an expression such as IS NULL or IS NOT NULL.
• An SQL query selects only values whose WHERE expression evaluates to true,
and groups whose HAVING clause evaluates to true.
• The aggregate COUNT(*) counts all NULL and non-NULL tuples;
COUNT(attribute) counts all tuples whose attribute value is not NULL. Other
SQL aggregate functions ignore NULL values in their computation. [4]
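The rules above can be modeled in a few lines of C. This is only a sketch for building intuition, not how any database implements them; the type and function names (TruthValue, NullableInt, tv_equals, and so on) are invented for this illustration.

```c
#include <stddef.h>

/* SQL's three truth values, ordered so that FALSE < UNKNOWN < TRUE. */
typedef enum { TV_FALSE, TV_UNKNOWN, TV_TRUE } TruthValue;

/* A nullable integer: when is_null is nonzero, value is ignored. */
typedef struct { int is_null; int value; } NullableInt;

/* x = y: unknown as soon as either side is NULL. */
TruthValue tv_equals(NullableInt x, NullableInt y) {
    if (x.is_null || y.is_null) return TV_UNKNOWN;
    return x.value == y.value ? TV_TRUE : TV_FALSE;
}

/* NOT: unknown stays unknown, so "NULL <> NULL" is not true either. */
TruthValue tv_not(TruthValue t) {
    if (t == TV_UNKNOWN) return TV_UNKNOWN;
    return t == TV_TRUE ? TV_FALSE : TV_TRUE;
}

/* With the ordering above, AND is the minimum and OR is the maximum. */
TruthValue tv_and(TruthValue a, TruthValue b) { return a < b ? a : b; }
TruthValue tv_or(TruthValue a, TruthValue b)  { return a > b ? a : b; }

/* WHERE keeps a row only when its condition is true, not merely "not false". */
int where_keeps(TruthValue t) { return t == TV_TRUE; }

/* COUNT(attribute): count only the non-NULL values; COUNT(*) would be n. */
size_t count_attribute(const NullableInt *rows, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!rows[i].is_null) count++;
    return count;
}
```

In this model a row whose attribute is NULL is dropped by both WHERE a = 1 and WHERE NOT (a = 1), which is why a test such as IS NULL is needed to find it.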
A Simple Case
Here is what the SQL standard mandates for some operations involving sets and multisets.
For a simple relation R
CREATE TABLE R (a INTEGER);
the following queries attempt to reliably determine the maximum known value of
the attribute a in the table R.
a >= ALL()
This expression of the maximum seems consistent with mathematical logic, but fails completely
in SQL:
SELECT DISTINCT a
FROM R
WHERE a >= ALL (SELECT * FROM R)
• If R is empty, the query returns empty. The >= ALL test is vacuously true with
an empty subquery [6], but there is no value of a to exploit the test.
• If R holds a NULL value, the query returns empty, because the test a >=
ALL(...) returns unknown (not false!) for any NULL or maximum non-NULL
integer value of a if the subquery includes a NULL value. [7]
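The two cases above can be checked with a small C model of the >= ALL test. This is a sketch under the assumption that R is held in an array of nullable integers; the names ge_all and max_query_rows are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

typedef enum { TV_FALSE, TV_UNKNOWN, TV_TRUE } TruthValue;
typedef struct { int is_null; int value; } NullableInt;

/* Evaluate  a >= ALL (values[0..n-1])  under three-valued logic:
   false if some comparison is false, otherwise unknown if some comparison
   involves a NULL, otherwise true (including the empty case). */
TruthValue ge_all(NullableInt a, const NullableInt *values, size_t n) {
    TruthValue result = TV_TRUE;
    for (size_t i = 0; i < n; i++) {
        if (a.is_null || values[i].is_null)
            result = TV_UNKNOWN;        /* a comparison with NULL is unknown */
        else if (a.value < values[i].value)
            return TV_FALSE;            /* one false comparison decides it */
    }
    return result;
}

/* Count the rows of r that
     SELECT a FROM R WHERE a >= ALL (SELECT * FROM R)
   would keep before DISTINCT: only rows whose test is true. */
size_t max_query_rows(const NullableInt *r, size_t n) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (ge_all(r[i], r, n) == TV_TRUE) kept++;
    return kept;
}
```

For a table holding 1, 3, and NULL, the row 1 fails outright, while the rows 3 and NULL both evaluate to unknown, so no row is kept.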
EXCEPT
This expression is one derivation of maximum as computed in relational algebra: subtract all the
non-maximum values from the table R, leaving the maximal ones:
(SELECT DISTINCT * FROM R)
EXCEPT
(SELECT R.a
FROM R, R AS S
WHERE R.a < S.a)
• If R is empty, the query returns empty.
• If R holds a NULL value, the query returns NULL, in addition to whatever
maximal integer is present (if any). The lower subquery never includes NULL,
so NULL is never subtracted from R.
NOT IN
This expression is another writing of maximum as computed in relational algebra: find values not
in the non-maximum values of R:
SELECT DISTINCT *
FROM R
WHERE a NOT IN (SELECT R.a
FROM R, R AS S
WHERE R.a < S.a)
This writing turns out to be subtly different from the last one.
• If R is empty, the query returns empty.
• If R holds one integer, and at least one NULL value, the query returns NULL, in
addition to whatever maximal integer is present (if any). In this case, the
subquery is always empty; the one available integer is compared only to
NULL values, so it does not participate in the subquery's result. NULL NOT IN
(empty) is vacuously true as it is in mathematics [8], so NULL is selected as
part of the result.
• If R holds more than one integer, the query returns the maximal integer. In
this case, the subquery includes at least one value. Now that the subquery is
not empty, NULL NOT IN (nonempty result) evaluates to unknown (not false!),
and is no longer selected as part of the result. As an aside, NULL NOT IN
(nonempty result) returns unknown even if the nonempty result includes
NULL. [9]
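The difference between the empty and nonempty cases can be seen in a short C model of NOT IN, treating it as the three-valued AND of "a is not equal to v" over every v in the subquery result. The function name not_in is hypothetical, and this is a sketch for intuition rather than a description of any database's implementation.

```c
#include <stddef.h>

typedef enum { TV_FALSE, TV_UNKNOWN, TV_TRUE } TruthValue;
typedef struct { int is_null; int value; } NullableInt;

/* Evaluate  a NOT IN (values[0..n-1])  under three-valued logic:
   false if a equals some value, otherwise unknown if some comparison
   involves a NULL, otherwise true (including the empty case). */
TruthValue not_in(NullableInt a, const NullableInt *values, size_t n) {
    TruthValue result = TV_TRUE;
    for (size_t i = 0; i < n; i++) {
        if (a.is_null || values[i].is_null)
            result = TV_UNKNOWN;        /* cannot tell whether they differ */
        else if (a.value == values[i].value)
            return TV_FALSE;            /* a definitely appears in the list */
    }
    return result;
}
```

With an empty list the loop never runs, so even a NULL on the left yields true; with any nonempty list a NULL on the left yields unknown, and the row is no longer selected.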
EXCEPT NULL
Because it is somewhat awkward to have an expression for MAX return two rows whose values do
not equal, the following expression adjusts the EXCEPT expression to exclude NULL from the
answer:
(SELECT DISTINCT * FROM R)
EXCEPT
(SELECT R.a
FROM R, R AS S
WHERE R.a < S.a OR R.a IS NULL)
• If R is empty, the query returns empty.
• If R holds NULL, the query returns the maximal integer, or empty if R has no
integers. EXCEPT will remove NULL from the result if NULL appears in the
bottom subquery, even though NULL is not equal to NULL. [10] Similarly,
DISTINCT, UNION, and INTERSECT always return at most one NULL.
Conclusion
No two of the queries above are equivalent when faced with NULLs in relational
data, despite their conceptual similarity.
The two implementations tested, PostgreSQL and Oracle, seem to comply with NULL behavior
for the set and multiset operations tested here, even when such behavior is sometimes subtle or
unintuitive.
Consider the above a good reason to define away NULLs from relational schemas whenever
possible.
[1] C. J. Date and Hugh Darwen. A Guide to the SQL Standard. Fourth edition,
Addison-Wesley, Reading, Massachusetts, 1997. (ISBN 0-201-96426-0)
If you've been working with databases for a while, chances are you've heard the term normalization. Perhaps someone's asked you
"Is that database normalized?" or "Is that in BCNF?" All too often, the reply is "Uh, yeah." Normalization is often brushed aside as a
luxury that only academics have time for. However, knowing the principles of normalization and applying them to your daily
database design tasks really isn't all that complicated and it could drastically improve the performance of your DBMS.
In this article, we'll introduce the concept of normalization and take a brief look at the most common normal forms. Future articles
will provide in-depth explorations of the normalization process.
What is Normalization?
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process:
eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make
sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database
consumes and ensure that data is logically stored.
The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as
normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five
(fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal
form is very rarely seen and won't be discussed in this article.
Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and guidelines only.
Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take
place, it's extremely important to evaluate any possible ramifications they could have on your system and account for possible
inconsistencies. That said, let's explore the normal forms.
First normal form (1NF) sets the very basic rules for an organized database:
• Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary
key).
Second normal form (2NF) further addresses the concept of removing duplicative data:
• Create relationships between these new tables and their predecessors through the use of foreign keys.
If you'd like to ensure your database is normalized, explore our other articles in this series
COLUMN: Definition: Database tables are composed of individual columns corresponding to the attributes of the object.
ROW: Definition: In a relational database, a row consists of one set of attributes (or one tuple) corresponding to one instance of
the entity that a table schema describes.
PRIMARY KEY: Definition: The primary key of a relational table uniquely identifies each record in the table. It can either be a
normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no more than one record per
person) or it can be generated by the DBMS (such as a globally unique identifier, or GUID, in Microsoft SQL Server). Primary keys
may consist of a single attribute or multiple attributes in combination.
Examples:
Imagine we have a STUDENTS table that contains a record for each student at a university. The student's unique student ID
number would be a good choice for a primary key in the STUDENTS table. The student's first and last name would not be a good
choice, as there is always the chance that more than one student might have the same name.
For more information on keys, read the article Database Keys. For more on selecting appropriate primary keys for a table, read
Choosing a Primary Key.
DATA BASE:
As you may already know, databases use tables to organize information. (If you don’t have a basic familiarity with database
concepts, read What is a Database?) Each table consists of a number of rows, each of which corresponds to a single database
record. So, how do databases keep all of these records straight? It’s through the use of keys.
Primary Keys
The first type of key we’ll discuss is the primary key. Every database table should have one or more columns designated as the
primary key. The value this key holds should be unique for each record in the table. For example, assume we have a table
called Employees that contains personnel information for every employee in our firm. We’d need to select an appropriate primary
key that would uniquely identify each employee. Your first thought might be to use the employee’s name.
This wouldn’t work out very well because it’s conceivable that you’d hire two employees with the same name. A better choice might
be to use a unique employee ID number that you assign to each employee when they’re hired. Some organizations choose to use
Social Security Numbers (or similar government identifiers) for this task because each employee already has one and they’re
guaranteed to be unique. However, the use of Social Security Numbers for this purpose is highly controversial due to privacy
concerns. (If you work for a government organization, the use of a Social Security Number may even be illegal under the Privacy
Act of 1974.) For this reason, most organizations have shifted to the use of unique identifiers (employee ID, student ID, etc.) that
don’t share these privacy concerns.
Once you decide upon a primary key and set it up in the database, the database management system will enforce the uniqueness
of the key. If you try to insert a record into a table with a primary key that duplicates an existing record, the insert will fail.
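A minimal sketch of this enforcement, assuming an Employees table keyed on a hypothetical employee_id column. It uses SQLite through Python's sqlite3 module rather than Oracle, so the exact error text will differ from what sqlplus reports, but the behavior is the same: the second insert is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO employees VALUES (1, 'Pat Smith')")

# A second record that reuses employee_id 1 violates the primary key.
try:
    conn.execute("INSERT INTO employees VALUES (1, 'Lee Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError as err:
    duplicate_rejected = True
    print("Insert failed:", err)

# Only the original record is stored.
row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```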
Most databases are also capable of generating their own primary keys. Microsoft Access, for example, may be configured to use the
AutoNumber data type to assign a unique ID to each record in the table. The generated value has no meaning outside the database,
which some designers regard as wasted space, but it does give you a reliable way to reference an individual record.
Foreign Keys
The other type of key that we’ll discuss in this course is the foreign key. These keys are used to create relationships between
tables. Natural relationships exist between tables in most database structures. Returning to our employees database, let’s imagine
that we wanted to add a table containing departmental information to the database. This new table might be called Departments
and would contain a large amount of information about the department as a whole. We’d also want to include information about the
employees in the department, but it would be redundant to have the same information in two tables (Employees and Departments).
Instead, we can create a relationship between the two tables.
Let’s assume that the Departments table uses the Department Name column as the primary key. To create a relationship between
the two tables, we add a new column to the Employees table called Department. We then fill in the name of the department to
which each employee belongs. We also inform the database management system that the Department column in the Employees
table is a foreign key that references the Departments table. The database will then enforce referential integrity by ensuring that all
of the values in the Department column of the Employees table have corresponding entries in the Departments table.
Note that there is no uniqueness constraint for a foreign key. We may (and most likely do!) have more than one employee
belonging to a single department. Similarly, there’s no requirement that an entry in the Departments table have any corresponding
entry in the Employees table. It is possible that we’d have a department with no employees.
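The Employees and Departments relationship described above can be sketched like this. The table and column names follow the text, the sample rows are made up, and SQLite (via Python's sqlite3 module) stands in for Oracle. One SQLite-specific detail: foreign keys are only checked after the PRAGMA shown below is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE departments (department_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT REFERENCES departments (department_name)
    )
""")

conn.executemany("INSERT INTO departments VALUES (?)",
                 [("Sales",), ("Research",)])

# No uniqueness constraint on the foreign key: two employees share a department.
conn.execute("INSERT INTO employees VALUES (1, 'Pat Smith', 'Sales')")
conn.execute("INSERT INTO employees VALUES (2, 'Lee Jones', 'Sales')")

# Referential integrity: 'Marketing' has no entry in departments, so this fails.
try:
    conn.execute("INSERT INTO employees VALUES (3, 'Kim Park', 'Marketing')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# A department with no employees is perfectly legal.
empty_departments = conn.execute("""
    SELECT d.department_name
    FROM departments d
    LEFT JOIN employees e ON e.department = d.department_name
    WHERE e.employee_id IS NULL
""").fetchall()
```

Here the insert for the nonexistent Marketing department is rejected, both Sales employees are accepted, and Research remains in the Departments table without any employees.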
Candidate Key
Definition: A candidate key is a combination of attributes that can be uniquely used to identify a database record without any
extraneous data. ...
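One way to make "without any extraneous data" concrete is to check that a set of attributes identifies every record and that no smaller subset of it already does. The helper names and sample rows below are hypothetical, and the check only inspects the sample data it is given. A real candidate key must hold for every record the table could ever contain, so treat this as an illustration of the idea rather than a design tool.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """Return True if attrs identifies every row and no proper subset does."""
    if not is_unique(rows, attrs):
        return False
    # Reject the combination if a smaller part of it is already unique.
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True

# Made-up sample data: two employees who share a name.
rows = [
    {"employee_id": 1, "email": "pat@example.com", "name": "Pat Smith"},
    {"employee_id": 2, "email": "lee@example.com", "name": "Pat Smith"},
]

id_only = is_candidate_key(rows, ("employee_id",))
id_and_name = is_candidate_key(rows, ("employee_id", "name"))
name_only = is_candidate_key(rows, ("name",))
```

For this sample, employee_id alone qualifies, employee_id plus name does not (the name is extraneous), and name alone does not (it is not unique).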
CHOOSING A PRIMARY KEY: Databases depend upon keys to store, sort and compare records. If you’ve been around databases
for a while, you’ve probably heard about many different types of keys – primary keys, candidate keys, and foreign keys. When you
create a new database table, you’re asked to select one primary key that will uniquely identify records stored in that table.
The selection of a primary key is one of the most critical decisions you’ll make in the design of a new database. The most important
constraint is that you must ensure that the selected key is unique. If it’s possible that two records (past, present, or future) may
share the same value for an attribute, it’s a poor choice for a primary key. When evaluating this constraint, you should think
creatively. Let’s consider a few examples that caused issues for real-world databases:
• ZIP Codes do not make good primary keys for a table of towns. If you’re making a simple lookup table of towns, ZIP code
seems to be a logical primary key. However, upon further investigation, you may realize that more than one town may share a
ZIP code. For example, four cities in New Jersey (Neptune, Neptune City, Tinton Falls and Wall Township) all share the ZIP code
07753.
• Social Security Numbers do not make good primary keys for a table of people for several reasons. First, most people consider their
SSN private and don’t want it used in databases in the first place. Second, some people don’t have SSNs, especially those who
have never set foot in the United States! Third, an individual may have more than one SSN over a lifetime: the Social Security
Administration will issue a new number in cases of fraud or identity theft.
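The ZIP code problem is easy to reproduce. The sketch below declares a hypothetical towns table with zip_code as its primary key and tries to load the four New Jersey towns mentioned above. It uses SQLite through Python's sqlite3 module only so that it runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE towns (zip_code TEXT PRIMARY KEY, town TEXT NOT NULL)"
)

# The four towns from the text, all sharing ZIP code 07753.
towns = ["Neptune", "Neptune City", "Tinton Falls", "Wall Township"]

rejected = []
for town in towns:
    try:
        conn.execute("INSERT INTO towns VALUES (?, ?)", ("07753", town))
    except sqlite3.IntegrityError:
        # The key value 07753 is already taken by an earlier town.
        rejected.append(town)

stored = [row[0] for row in conn.execute("SELECT town FROM towns")]
print("stored:", stored)
print("rejected:", rejected)
```

Only the first town can be stored; the other three are rejected because their key value already exists.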
So, what makes a good primary key? If you’re unable to find an obvious answer, turn to your database system for support. A best
practice in database design is to use an internally generated primary key. The database management system can normally
generate a unique identifier that has no meaning outside of the database system. For example, you might use the Microsoft Access
AutoNumber datatype to create a field called RecordID. The AutoNumber datatype automatically increments the field each time you
create a new record. While the number itself is meaningless, it provides a great way to reference an individual record in queries.
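A rough equivalent of the AutoNumber idea, sketched with SQLite's AUTOINCREMENT through Python's sqlite3 module. The records table and its description column are assumptions for illustration. Other systems offer their own mechanisms for the same purpose (Oracle, for instance, provides sequences), so the syntax here should not be read as portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# record_id is filled in by the database; the application never supplies it.
conn.execute("""
    CREATE TABLE records (
        record_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        description TEXT NOT NULL
    )
""")

first_id = conn.execute(
    "INSERT INTO records (description) VALUES ('first entry')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO records (description) VALUES ('second entry')"
).lastrowid

# The generated value is meaningless, but it pins down exactly one record.
row = conn.execute(
    "SELECT description FROM records WHERE record_id = ?", (second_id,)
).fetchone()
```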
Those are the basics on primary keys. Remember to choose carefully, as it’s difficult to change the primary key in a production
table. For a more in-depth look at all the types of database keys, read Database Keys.
DATABASE RELATIONSHIP: Definition: A relationship exists between two database tables when one table has a foreign key that
references the primary key of another table.