Introduction To Transact SQL
How to use some of the specific features that make this language so unique and
powerful
What you can do with it
At about this time, Microsoft realized that they needed to break from IBM and build their own 32-bit
operating system, which they called Windows NT. It would include a symmetric multiprocessing
environment with asynchronous I/O. The new operating system would be multi-threaded and would
provide standard facilities for logging and management of the processes. Microsoft decided to bet the
farm on this new platform and moved to make SQL Server one of the first applications for it when
Windows NT was released to the public in July of 1993. Within 3 months Microsoft SQL Server 4.2 for NT
was shipping to the general public.
By the end of 1993, the relationship between Microsoft and Sybase had changed as each company's core
focus had evolved from startup to maturity in the high tech industry. The new Windows NT platform was
proving to be an alternative to the mini-computer UNIX based environment not only on performance, but
also on ease of use. Version 4.2 was the last release for which Microsoft and Sybase would share code.
Since 1991 Microsoft had been slowly growing their stable of development expertise to be able to stand
on their own supporting and evolving the SQL Server product. In April of 1994, the companies announced
the end of their relationship and each went on to develop the product line exclusively for their respective
platforms.
By the end of 1994, Microsoft began putting together plans for the next version of SQL Server. Originally
called SQL95 but later released as Version 6.0, it would include support for scrollable cursors, database
replication, better performance, a greatly expanded library of system stored procedures, and the inclusion
of a better database management tool. For the first time, this new release would be developed entirely by
Microsoft. The SQL Server team at Microsoft delivered it on time and it was a great success in the
marketplace.
Version 6.5 came out 10 months later and included support for the Internet, data warehousing and
distributed transactions. Another key event was that it gained certification as ANSI SQL compliant. SQL
Server 7.0 brought us full row level locking, a new query processing component that became Microsoft
Data Engine (MSDE) that supports distributed queries and a more efficient scheme for processing ad-hoc
queries. SQL Server 2000 added user defined functions, new data types, and a host of other new
features we will cover later in this chapter.
The list of enhancements in SQL Server 2000 includes
With each release, SQL Server has expanded its functionality and improved its performance as an
enterprise level database engine. While there may be some who still try to deny it, Microsoft has
established themselves as a serious player in the DBMS industry and has built a core competency at
taking a good product and making it better.
various clauses to determine exactly which rows are affected. Taken together, these form what is referred
to as the Data Manipulation Language or DML.
14 aggregate, 23 string, 9 date and time, and 23 mathematical functions out of the box, in addition to
several other system and state functions that we can use in developing our code. In addition to the many
built-in functions, Transact SQL also provides a number of operators that are worth mentioning.
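As a quick taste, a few of these built-in functions can be combined in a single statement. This is just a sketch; the column aliases are our own:

SELECT GetDate() as Today, -- date and time
DATEPART(year, GetDate()) as ThisYear,
UPPER('northwind') as Shouting, -- string
LEN('Transact SQL') as NameLength,
ROUND(3.14159, 2) as Rounded -- mathematical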
Distinct
Eliminating duplicate records from our rowset can be done within our query by specifying the DISTINCT
keyword in our SELECT clause. It will cause the query engine to return only unique rows from our query.
SELECT DISTINCT Country
FROM Suppliers
Top
The TOP operator is a Microsoft extension in Transact SQL that allows us to limit the number of rows we
want in our result set. After the query has been processed, the result set is limited to the number of rows
specified. We can either use a hard number like 10, or include the PERCENT keyword to indicate that we
want to see only a percentage of the full result set.
SELECT TOP 10 ProductID, Sum(UnitPrice * Quantity) as Sales
FROM [Order Details]
GROUP BY ProductID
ORDER BY Sales DESC
Trying to accomplish the same results on other database platforms requires much more work and
creativity on the part of the developer!
Subqueries
A subquery is a complete select statement that returns a single value or rowset and is nested within
another SQL statement in place of an expression. We can use the subquery anywhere that we can use an
expression, whether it is part of the SELECT clause or the WHERE clause. We can use the subquery to
perform lookups on related tables, or to evaluate a condition. We might, for example, need to select the
orphaned rows: those records from the child table which have no parent record. In our Northwind example
we need to reduce our supplier count and eliminate suppliers who have no product sales. One way to
return the list of suppliers with no orders using subqueries is:
SELECT SupplierID, CompanyName
FROM Suppliers S
WHERE NOT EXISTS
(SELECT 'x' FROM [Order Details] D, Products P
WHERE D.ProductID = P.ProductID
AND S.SupplierID = P.SupplierID)
String Concatenation
One of the basic operations performed in the database is to generate human-readable output. Because
the process of normalization will break apart things like addresses and names into multiple parts that
need to be reassembled, string concatenation is an important function to understand. SQL Server's
approach is to use the + operator between character based values to concatenate. For example, if we
wanted to do a report in which we wanted to return a single column with our Employee's name we could
use the following:
SELECT FirstName + ' ' + LastName FROM Employees
We could, for example, write a SQL Script where the output is another SQL Script. This generates a script
which counts the user tables in the current database.
SELECT 'SELECT ''' + name + ''', count(*) from [' + name + ']'
FROM sysobjects
WHERE type = 'U'
ORDER BY name
One note to developers: this syntax is specific to SQL Server and doesn't work on other databases such
as Oracle (which uses two pipe characters, ||, to concatenate strings). You may also notice the square
brackets ('[' and ']') in the code. These are necessary when the database schema contains non-standard
identifiers. In the Northwind database, the Order Details table's name has a space between Order and
Details. To reference such non-standard names we simply surround them with brackets.
Comments
A quick note on comments. Even though all good code strives to be self-documenting, we may find at
times that it is helpful to readers of our code to include comments from the author explaining what is
going on.
Transact SQL recognizes two forms of comments: Single-line and multi-line.
The -- operator indicates that from that point on the line is a comment text and should be ignored by the
query engine. In Query Analyzer, the commented text is colored as a comment (green by default).
SELECT GetDate() -- This is a comment from here on, I can type whatever I want
Multi-line comments follow the C++ syntax of /* comment text */, where the comment text can cover
multiple lines of text.
/* This is a multi-line comment, none of the statements are ever processed
DECLARE @myDate DATETIME
SET @myDate = GetDate()
PRINT 'The current date is ' + convert(varchar(18), @MyDate, 110)
*/
2004-03-16. If you are exporting data to a comma separated value file to be loaded into Oracle, you
need to use the convert function as:
Select convert(varchar(12), mydate, 113)
The first argument is the target datatype, the second is the source value, and the third is an optional style
value. The convert statement is also useful if you are transforming data from one schema to another and
need to concatenate columns but must restrict the resulting size.

Select convert(varchar(35), RTRIM(FNAME) + ' ' + RTRIM(LNAME))
Another use is if you are working with text or binary data that is larger than the 8000 character limit of the
varchar datatype. SQL Server doesn't allow these types of columns to be used for comparison or sorting
unless you use the LIKE operator. If you had a table with a column called CODE_TAG that was defined
as TEXT, and you wanted to select all records where it equals 'BO', you would have to use the following
syntax:

select * from codes where convert(varchar(2), CODE_TAG) = 'BO'
Case
The CASE expression allows you to evaluate a variable or a condition inline and return one of multiple
values as the SQL statement is being run. The basic version takes a source variable or column and
depending on the value returns one of a discrete set of values.
Select case datepart(month, GetDate())
When 1 then 'January is nice'
When 2 then 'February is cold'
When 3 then 'Spring is coming'
Else 'It''s not winter anymore'
End
You can also evaluate Boolean conditions. These are evaluated in order until a condition is satisfied
and the resulting value is returned.
Select case
when datepart(month, GetDate()) > 9
then 'Winter is here'
when datepart(hour, GetDate()) > 22
then 'It''s getting late'
when convert(varchar(12), getdate(), 113) = '28 NOV 2003'
then 'Everything''s on SALE!'
when datepart(weekday, GetDate()) = 6
then 'TGIF!'
Else 'Who knows???'
End
Case can be used anywhere an expression can, including defining views that dynamically transform data
based on current conditions. For example we could use this technique to implement a 10 hour sale on all
products for the day after Thanksgiving in the Northwind database by creating a view called PriceLookUp
as follows:
CREATE VIEW PriceLookUp AS
Select ProductID, ProductName, CategoryID, UnitPrice,
Case when getdate() > '28-NOV-2003 08:00' and
getdate() < '28-NOV-2003 18:00'
then UnitPrice * .85
else UnitPrice
end as SalePrice
from Northwind..Products
go
select GetDate(), UnitPrice, SalePrice, * from PriceLookUp
We could have written a batch and scheduled it to run at the appropriate times to update the sale price,
but that would have added a dependency on the scheduler and on the amount of time it would take to
process all the rows (mature retailer product tables likely have hundreds of thousands of rows in them).
Joining tables
As the name "Relational" implies, information in well-designed databases is stored in more than a single
table, and much of our work is to develop means for accessing it effectively. When more than one table is
referenced in a query, they are said to be joined. A join is specific criteria which define how two or more
tables are related. Joins come in two flavors, inner and outer. The basic difference is that inner joins will
only return rows in which both tables have columns matching the join criteria.
[Diagram: the Suppliers table (SupplierID primary key; CmpyName and Country info columns) related to
the Products table (ProdID primary key; ProdName and Units info columns; SupplierID foreign key),
illustrating the inner, left outer, and right outer joins between them.]
An inner join returns the set of records in which the SupplierID column in the Suppliers table matches the
SupplierID column of the Products table. Records must exist in both tables in order to be part of the
resulting set of rows returned. Suppose the two tables contain the following sample data:

Suppliers
SupplierID  CmpyName     Country
1           ABC Widget   USA
2           Sprocket 4U  UK
3           Isa Cmpy     CA

Products
ProdID  ProdName   SupplierID  Units
1       WidgOne    1           4
2       WidgTwo    1           6
3       Spockit    2           3
4       WidgThree  1           1

The join criteria can be specified as part of the FROM clause or as part of the WHERE clause. The two
following statements return identical results:

SELECT *
FROM Suppliers as S, Products as P
WHERE S.SupplierID = P.SupplierID
AND P.Units > 2

SELECT *
FROM Suppliers as S JOIN Products as P ON S.SupplierID = P.SupplierID
WHERE P.Units > 2
Running these statements will result in the selection of rows from both tables which meet the join criteria
and will exclude rows which do not have a match. In our resultset we won't return Supplier 3, Isa Cmpy,
because they don't have any products, and we don't return WidgThree from the Products table because
its Units fall below the selection criteria.
SupplierID  CmpyName     Country  ProdID  ProdName  SupplierID  Units
1           ABC Widget   USA      1       WidgOne   1           4
1           ABC Widget   USA      2       WidgTwo   1           6
2           Sprocket 4U  UK       3       Spockit   2           3
A left join (also called a left outer join) would return rows from the first table and information from the other
if it exists. If it doesn't then nulls are returned in place of the missing data. The syntax for this would be:
-- SQL:89 Left Outer Join Syntax
SELECT *
FROM Suppliers S, Products P
WHERE S.SupplierID *= P.SupplierID
AND P.Units > 2
-- SQL:92 Left Outer Join Syntax
SELECT *
FROM Suppliers as S LEFT JOIN Products as P
ON S.SupplierID = P.SupplierID AND P.Units > 2
SupplierID  CmpyName     Country  ProdID  ProdName  SupplierID  Units
1           ABC Widget   USA      1       WidgOne   1           4
1           ABC Widget   USA      2       WidgTwo   1           6
2           Sprocket 4U  UK       3       Spockit   2           3
3           Isa Cmpy     CA       NULL    NULL      NULL        NULL
A Right Outer Join is the converse of the Left Outer Join except that the table on the Right is the driver.
Two other types of joins can be made, although you don't see them too often. The first is the Full Outer
Join, in which all rows from both tables are returned, and the second is the Cross Join (also called a
Cartesian join), in which every possible combination of rows from the two tables is returned. If you forget
to add the WHERE clause and you've specified more than one table, you are running a Cross Join.
Select * from Suppliers, Products
Our example would return 12 rows (3 rows in Suppliers x 4 rows in Products = 12 combinations of the
two), but if we were using a more realistic scenario involving hundreds of suppliers and thousands of
products would result in hundreds of thousands of combinations.
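A Full Outer Join over the same sample tables can be sketched as follows; it returns the matched pairs plus the unmatched rows from both sides, padded with NULLs:

SELECT *
FROM Suppliers as S FULL OUTER JOIN Products as P
ON S.SupplierID = P.SupplierID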
SQL Server supports both types of syntax, but not all databases do, so to ensure portability of your
code you may find the older syntax easier to work with. The ANSI SQL 92 standard recommends that the
join criteria be made part of the FROM clause in order to avoid confusion with the filter criteria of the
WHERE clause.
Union
A union combines the results of more than one query into a single record set. The only requirement is
that all the queries return the same number of columns, of compatible types, in the same order. For
example, if you wanted to return the names of tables and the number of rows in each, you would use the
following syntax:
select 'Categories' as TAB, count(*) as CNT from Categories UNION
select 'CustomerCustomerDemo', count(*) from CustomerCustomerDemo UNION
select 'CustomerDemographics', count(*) from CustomerDemographics UNION
select 'Customers', count(*) from Customers UNION
select 'Employees', count(*) from Employees UNION
select 'EmployeeTerritories', count(*) from EmployeeTerritories UNION
select 'Order Details', count(*) from [Order Details] UNION
select 'Orders', count(*) from Orders UNION
select 'Products', count(*) from Products UNION
select 'Region', count(*) from Region UNION
select 'ReplenishWork', count(*) from ReplenishWork UNION
select 'Shippers', count(*) from Shippers UNION
select 'Suppliers', count(*) from Suppliers UNION
select 'Territories', count(*) from Territories
The column header names are taken from the first resultset that the query engine sees, in this case TAB
and CNT.
Insert
We use the INSERT statement to add new information to the database. The standard SQL statement for
inserting a new row is:
INSERT INTO <table> (<column1>, <column2>, ...) VALUES (<value1>, <value2>, ...)
You can omit the column list only if you have a value for every column in the row.
Adding a new row to the Supplier table would look like this:
INSERT INTO Suppliers (SupplierID, CmpyName, Country)
Values (3, 'Isa Cmpy', 'CA')
SQL Server provides us with another form of the INSERT statement in which we can
use an existing table as the source of the new records. In this case we use the syntax:
INSERT INTO <table> (<column1>, <column2>, ...)
SELECT <column1>, <column2>, ... FROM <source table> WHERE ...
For example, supposing we were implementing a reporting database (also called a data
warehouse) in which we wanted to perform aggregation and summarization on our
vendors. We could define a new table to contain our calculations and provide some
level of isolation from the transactional system. Our new table might look like this:
ReplenishWork
SupplierID
CmpyName
ProductsOutOfStock
We can run a SQL batch to load the table using the second form of the INSERT statement as follows:
INSERT INTO ReplenishWork (SupplierID, CmpyName, ProductsOutOfStock)
SELECT P.SupplierID, S.CmpyName, Count(*)
FROM Suppliers as S INNER JOIN Products as P ON S.SupplierID = P.SupplierID
WHERE P.Units < 2
GROUP BY P.SupplierID, S.CmpyName
Update
We manage existing information in the database using the UPDATE statement. SQL Server allows for the
basic update statement that looks like this:
UPDATE <table> SET <column1> = <value1>, <column2> = <value2>
WHERE ...
But it also includes a variation in which you can update one table using another table as
your source. This syntax is an extension Microsoft includes to the ANSI standard that
won't necessarily run in other environments.
UPDATE R SET ProductsOutOfStock = P.UnitsInStock
FROM ReplenishWork as R JOIN Products as P ON R.SupplierID = P.SupplierID
Delete
To purge the database of unnecessary information, or to enforce relational integrity, we use the DELETE
statement. The basic syntax is:

DELETE FROM <table>
WHERE ...

The WHERE clause determines which rows will be affected. A word of caution when you are deleting
data: it takes a long time to type it back in, so you want to be sure you check what you are getting rid of.
You can do this by running a count of the rows that will be affected (SELECT count(*) FROM <table>
WHERE ...) before executing the delete. You can further protect yourself from unwanted aggravation by
using Transactions.
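Putting that advice into practice against the ReplenishWork table, a sketch might look like this (the filter column is our own choice):

select count(*) from ReplenishWork where ProductsOutOfStock = 0
-- satisfied with the count, remove those rows
delete from ReplenishWork where ProductsOutOfStock = 0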
Variables
Variables are used to temporarily hold values that may be useful in the processing of
statements. We declare them using the DECLARE keyword, specifying the name and
the type, and we assign a value to them with the SET statement. You can also assign a
value as part of a SELECT
statement, but using SET is listed as the preferred method.
We reference them as we would column data or constants in the processing logic of our
code using the @ character in front of the name to denote a variable. Global variables
reference the status of the system and the database engine and other options of the
server and are referenced with two @@ characters.
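For instance, a quick sketch showing a local variable and a global variable side by side:

DECLARE @Greeting varchar(30)
SET @Greeting = 'Hello from '
SELECT @Greeting + @@SERVERNAME -- @@SERVERNAME is a global variable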
All user declared variables are local in scope, meaning they remain accessible only until the current
batch of statements completes. For example, executing the following script would result in an error
because the variable @MyVar is not declared within the batch in which it is referenced.
DECLARE @MyVar varchar(15)
SET @MyVar = 'New Value'
Go
PRINT 'The value of MyVar is ' + @MyVar
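Keeping the declaration and the reference in the same batch (that is, removing the GO between them) avoids the error:

DECLARE @MyVar varchar(15)
SET @MyVar = 'New Value'
PRINT 'The value of MyVar is ' + @MyVar
Go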
Batches
Batches are a grouping of one or more SQL statements that together form a logical unit
of work. They are sent together to the SQL Server engine, which creates an execution
plan. If errors are generated in creating the plan, none of the statements are actually
executed. Errors that occur after the execution plan has been compiled will cause
succeeding statements to fail, but will not affect previous logic in the batch. This is true
unless the batch is part of a transaction in which all of the statements are rolled back
after the error.
For example, the following Transact SQL Code is comprised of 3 statements, which are
all part of the same batch:
DECLARE @Discount float
SET @Discount = 0.05
SELECT ProductID, UnitPrice, UnitPrice - @Discount * UnitPrice as SalePrice
FROM Products
ORDER BY SalePrice Desc
GO
Because the @Discount variable is used in the successive commands, if you try to run
either of the last two statements separately you will raise an error condition.
Creating objects such as views, defaults, procedures, rules and triggers cannot be
combined with other statements in a batch. A batch is completed by the inclusion of the
command GO. An important point to remember about GO is that it is not a Transact
SQL statement, rather it is recognized by Query Analyzer and ISQL as a marker to send
all statements from the previous GO to the engine and await the result set.
You can see the effect of this when the results of statements before the GO command
are visible in the results pane of Query Analyzer before the next batch of statements
executes. If you include GO in an ODBC call or from one of the database libraries such
as ADO or OLE DB, the SQL Server engine will flag a syntax error. Another point is that
local variables declared in a batch go out of scope when the batch ends, which is one
reason why you cannot include GO in stored procedure declarations.
Transactions
A transaction is a grouping of statements that must either succeed or fail together as
a logical unit of work. You may have heard of the ACID (Atomicity, Consistency,
Isolation and Durability) test for a transaction. These are the four properties which a
valid transaction must meet.
Atomicity means that all the statements that are part of the group must either succeed
or fail together. For example, if you were developing a banking application to
add new accounts, adding rows to the customer table and the account table might
form a logical unit of work in which our data model integrity constraints require
that records exist in both tables to be valid.
Consistency means that the database is left in a stable state at the end of the
transaction, and that all data integrity constraints are met.
Isolation means that other transactions see the data either as it was before our
transaction or as it is after; no one can see or alter data that is in an
intermediate state. The locking rules of the database prevent this from
happening.
Durability means that the effects of the transaction are persisted after the end
of the transaction, even in the event of a system failure. A system failure in the
middle of a transaction would roll back the state of the database to where it was
before the transaction began.
We manage transactions with the command BEGIN TRANSACTION, and end it with the
command COMMIT or ROLLBACK. When beginning a transaction, locks on the
database prevent other user sessions from altering the data.
For example, suppose we want to implement a sale price on our Products at Northwind.
BEGIN TRAN
DECLARE @Discount float
SET @Discount = 0.05
UPDATE Products
SET UnitPrice = UnitPrice - (@Discount * UnitPrice)
Go
-- Check the results...
Select ProductID, UnitPrice
FROM Products
ORDER BY UnitPrice Desc
If we don't like what we see, we can roll back the transaction and leave the database in
its previous state:
ROLLBACK
GO
SELECT
ProductID, UnitPrice -- See the original prices!
FROM Products
ORDER BY UnitPrice Desc
GO
scripts to complete the tasks involved. Enterprise Manager allows us to generate
these scripts, but we should still understand what the scripts are doing in case we need
to customize or change them.
Managing Tables
Our first task in making our data model usable is to create a base set of tables. We do this
using the CREATE TABLE statement, which specifies how information is stored in the
table. Each column definition includes a name, a data type, whether the column allows
null values to be stored or not, and other optional attributes. The basic syntax diagram
for CREATE TABLE is below:

CREATE TABLE <table name>
(<column definitions>)
A simple version of the ReplenishWork table can be created with the following
statement:
CREATE TABLE ReplenishWork
(SupplierID int NOT NULL,
CmpyName varchar(40) NOT NULL,
ProductsOutOfStock int NULL)
Business changes drive database changes and we don't always have the luxury of
recreating our schema in a production environment. Transact SQL allows us to expand
column sizes and add new columns to existing tables using the ALTER statement:
ALTER TABLE ReplenishWork
ADD ContactName varchar(20)
We can also change the size and type of columns, drop columns, manage constraints
and default values using the ALTER statement. If we didn't need to preserve existing
data and were making changes to the schema that prohibited the use of the ALTER
command we can DROP the table before recreating it. If we needed to reconfigure our
ReplenishWork table to support integration of legacy data that used a character based
supplier id we could do so with the following batch:
DROP TABLE ReplenishWork
GO
CREATE TABLE ReplenishWork
(SupplierID varchar(16) NOT NULL,
CmpyName varchar(40) NOT NULL,
ProductsOutOfStock int NULL,
ContactName varchar(20) NULL)
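The other ALTER TABLE capabilities mentioned above can be sketched as follows, widening and then dropping the ContactName column (the new size is our own choice):

ALTER TABLE ReplenishWork
ALTER COLUMN ContactName varchar(40)
GO
ALTER TABLE ReplenishWork
DROP COLUMN ContactName
GO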
A table can have at most one clustered index, and this index will determine the ordering of how the data
is physically stored. SQL Server supports up to 249 non-clustered indexes for any given
table. To define a clustered index on our ReplenishWork table at creation time, we could use the
following command:
CREATE TABLE ReplenishWork
(SupplierID int PRIMARY KEY CLUSTERED,
CmpyName varchar(40) NOT NULL,
ProductsOutOfStock int NULL)
By default SQL Server will use clustering on the primary key defined for a table. This is
important because without at least one clustered index on a large table, all the data
would be scattered throughout the physical storage, and performance on retrieval of
data may suffer. You can also create indexes on existing tables using the CREATE
command:
CREATE INDEX IDX_ReplenishWork
ON ReplenishWork (CmpyName)
On the other hand unnecessary indexes can impact performance as well. Think of an
index as a physical data structure that has a cost associated with maintaining it when
data changes. If the nature of your system involves capturing transactional information
as quickly as possible, you want to run with the least number of indexes. Each insert,
update and delete must be reflected in each of the indexes on the table, so the fewer
the indexes the less the overhead of those operations.
Systems which are primarily decision support and reporting based can afford to have
more indexes because once the data has been written it isn't updated as much as a
transactional system. You will tend to see more generous use of indexes in Data
Warehousing systems that support a lot of Ad-Hoc querying.
Create Views
Views allow us to provide an abstraction layer between the physical implementation of
the data and how users see the information. They are in essence virtual tables that we
can use to limit or expand how we are able to use the base set of tables in our
database. In SQL Server we are limited to a single SQL Select
statement to define the result set. The security architecture of SQL Server allows us to
grant users access to a view without giving access to tables (and views) on which the
view is based. For example, if we needed to make an employee phone list available to
users but needed to protect other employee information (such as salary, social security
number, birth date, etc) we could create a view of just the name & phone.
CREATE VIEW PhoneList AS
SELECT FirstName + ' ' + LastName as EmpName, HomePhone
FROM Employees
go
select * from PhoneList
The ALTER and DROP commands operate in the same way they do with tables. One
interesting feature of SQL Server is that it supports indexed and updateable views. This
means that you can define a view on some base objects, create indexes to optimize
access and even alter data stored in the underlying tables, although you need to be
aware that there is more overhead in updating a view than if you managed the data in
the underlying tables directly.
Schema Binding
In order to create an index on a view, the view must be created with the
SCHEMABINDING option. This option causes information about the underlying schema
to be stored in the system tables, and SQL Server will prevent changes that would
invalidate our view. Note that a schema-bound view must reference its base objects by
their two-part names, such as dbo.Employees.

CREATE VIEW PhoneList
WITH SCHEMABINDING
AS
SELECT FirstName + ' ' + LastName as EmpName, HomePhone
FROM dbo.Employees
go
objects. The term CRUD Matrix refers to the rights to Create, Read, Update, and/or
Delete information in a given table.
GRANT
Grant is used to give access to someone who needs to use the information
stored in the objects. SQL Server allows us to specify user based or group
based rights, and to apply them to objects or to groups of objects within a
database.
REVOKE
We use the REVOKE statement to reverse the effect of the GRANT statement.
DENY
The DENY statement will prevent a user from accessing objects in the
database.
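A sketch of all three statements, using the PhoneList view from earlier and a hypothetical database user named PhoneUser:

GRANT SELECT ON PhoneList TO PhoneUser
REVOKE SELECT ON PhoneList FROM PhoneUser
DENY SELECT ON Employees TO PhoneUser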
BIGINT:
True to its name, this is a very big whole number. You can use it to store values
in the range of -9223372036854775808 to 9223372036854775807. This is 19
digits of precision that can hold some extremely large values. If you are porting
data from platforms that include intelligent keys (like the first 4 digits is the
location, the next 6 is customer, the next 3 is the weight, etc.) it would probably
be better to normalize your schema and transform the data into multiple
columns.
UniqueIdentifier:
Columns of this type store system generated identifiers known as GUID's or
Globally Unique Identifier. The GUID is a 38 character value generated by
windows that is guaranteed to be unique across the world. It is takes the same
form as the CLSID in the COM world. The NEWID function is used to generate
it, and we can store them in the column type UniqueIdentifier.
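Generating one takes only a couple of lines; a quick sketch:

DECLARE @id uniqueidentifier
SET @id = NEWID()
PRINT 'New GUID: ' + convert(varchar(38), @id)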
While this makes the replication scheme possible, you should define a clustered primary
key on some other column set. Ordering the data according to a randomly generated
value will result in poor performance when processing the data, because the rows end up
scattered all over the disk in no useful order and seek time increases.
SQL_Variant:
Similar to the variant data type in Visual Basic, SQL_Variant is used to store data
values of all the supported data types not including text, ntext, image, and timestamp.
We can use it for column definitions, parameters, variables, and for the return value of
user defined functions. Microsoft provides the function SQL_VARIANT_PROPERTY to
return the base data type and other information about data stored.
One use for variants is in defining schema to hold attribute data. Suppose you need to
keep a variable number of optional attributes about a person, such as hair color,
weight, age, number of children, birthdate, favorite pet, etc. In our example we are
collecting what information we can, but because there are a lot of optional attributes,
we end up with a sparsely populated table.
PersonID  HairColor    Weight  Age    Sex       Birthdate   Pet
(int)     (varchar 8)  (int)   (int)  (char 1)  (datetime)  (varchar 6)
1         Brown        NULL    NULL   NULL      NULL        Cat
2         NULL         250     32     M         NULL        Dog
3         NULL         NULL    NULL   NULL      1/2/62      Bird
Using SQL Variant we could more efficiently store the same set of information using a
schema that looks like this:
PersonID  AttributeName  AttributeValue
(int)     (varchar 12)   (SQL_Variant)
1         HairColor      Brown
1         Pet            Cat
2         Weight         250
2         Age            32
2         Sex            M
2         Pet            Dog
3         Birthdate      1/2/62
3         Pet            Bird
Table:
True to its name, the table data type is a variable that temporarily stores a rowset in
memory for later processing that can be used like an ordinary table. It accomplishes this
using space in the TEMPDB, but because it has a well defined scope, it efficiently cleans up
after itself. It also requires less locking and logging than tables created in the TEMPDB,
which results in fewer recompilations of the execution plan. The Table datatype was added
in support of a powerful new feature of SQL Server, the User Defined Function.
You declare a table variable using the DECLARE statement. You can then manage the
content of that table variable, inserting, updating and deleting as needed. Since the scope
of variables is local to the current batch, other processes cannot access your table variable.
declare @t table(PersonID int, AttrNm varchar(12), AttrVal sql_variant)
insert @t values(1, 'HairColor', 'Brown')
insert @t values(1, 'Pet', 'Cat')
insert @t values(2, 'Weight', 250)
insert @t values(2, 'Age', 32)
insert @t values(2, 'Sex', 'M')
insert @t values(2, 'Pet', 'Dog')
insert @t values(3, 'Birthdate', GetDate())
insert @t values(3, 'Pet', 'Bird')
Select convert(varchar(15), SQL_VARIANT_PROPERTY(AttrVal, 'BaseType')), * from @t
Table Valued Functions return a table variable that can be built from one or more statements. If the body
is a single SELECT statement and the declaration doesn't specify the column set, then the function
doesn't require the begin/end wrapper and is considered to be Inline.
create function udfRegionOrders (@Region varchar(15))
returns TABLE
as
return (select * from Northwind..Orders where ShipRegion = @Region)
go
select * from Northwind.dbo.udfRegionOrders ('RJ')
Multi-statement UDF's allow us to do more complex processing in order to derive the result set. Like
scalar UDF's we can declare local variables, use flow control statements, manage data in locally defined
table variables and call stateless extended stored procedures.
Using our Northwind database for this example, suppose we wanted to create a function that returned
sales ranking information.
CREATE FUNCTION udfTopSales (@nCount int)
RETURNS @retTable TABLE
(Rank int, ProductID int, SaleAmt money)
AS
BEGIN
DECLARE @ProdID int, @Sales money, @Rank int
SET @Rank = 1
DECLARE curSales CURSOR FOR
SELECT productID, Sum(UnitPrice * Quantity * (1 - Discount)) as SaleAmt
FROM [Order Details]
GROUP BY ProductID
ORDER BY SaleAmt DESC
OPEN curSales
FETCH FROM curSales into @ProdID, @Sales
WHILE @@FETCH_STATUS = 0 BEGIN
IF @Rank <= @nCount
INSERT INTO @retTable
values (@Rank, @ProdID, @Sales)
FETCH NEXT FROM curSales into @ProdID, @Sales
SET @Rank = @Rank + 1
END
CLOSE curSales
DEALLOCATE curSales
RETURN
END
go
select * from dbo.udfTopSales (10)
go
Stored Procedures
While we can run ad-hoc SQL Statements within Query Analyzer, and we have the ability to use
the execute() function in ADO, frequently performed logic can be encapsulated into batches and saved as
Stored Procedures on the server. The benefits of this approach include not only reducing the work of
testing multiple programs and applications to ensure that the logic is correct, but also reusing the
execution plan generated on the server when the stored procedure is compiled and saved.
Stored Procedures are a collection of Transact SQL statements which perform a defined set of work. This
can be as simple as updating a single row or returning a configuration setting, to being as complex as
implementing a business rule that requires sending an e-mail alert containing crucial information specific
to the problem. They can take both input and output parameters, and always return an integer value.
There are several types of stored procedures including those that come with SQL Server, user defined
and extended stored procedures.
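As a minimal sketch of a user defined stored procedure (the procedure name and logic are our own invention), one that takes an input parameter and returns an integer status might look like this:

CREATE PROCEDURE usp_CountSupplierProducts @SupplierID int
AS
BEGIN
-- return the product count for one supplier
SELECT Count(*) as ProductCount
FROM Products
WHERE SupplierID = @SupplierID
RETURN 0 -- the integer return value, 0 for success
END
go
EXEC usp_CountSupplierProducts 1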
procedure, complete with parameters and a return code. XPs are implemented as calls to dynamic
link libraries (DLLs) that conform to the Extended Stored Procedure API. Because these calls are external
to SQL Server, you need to be careful that they are thoroughly tested and checked for aberrant behavior
before using them in a production situation.
If we had written a standard event management system to log information to the NT Event Log and
wanted to include it in our SQL environment, we would register it by calling the stored procedure
sp_addextendedproc:
USE Master
go
EXEC sp_AddExtendedProc xp_LogEvent 'c:\winnt\system32\EventLogger.dll'
Query Analyzer
Query Analyzer is the current ad-hoc user interface that comes with SQL Server. It is installed as part of
the client tools and is typically one of the first places a developer will begin working with SQL. By default it
opens to an edit screen in which we can type our code. It provides context based coloring which is a nice
enhancement from the previous version.
To run a batch of statements, you can click the Execute button and the results pane will display the
output. By default the entire script is run, but you can limit execution to portions of the script by
highlighting what you want to execute; then only the selected statements will be run.