Database Concepts
A database is a logically coherent collection of data with some inherent meaning, representing
some aspect of the real world, that is designed, built and populated with data for a specific
purpose.
A DBMS (database management system) is a collection of programs that enables users to create
and maintain a database. In other words, it is general-purpose software that lets users define,
construct and manipulate databases for various applications.
The three levels of data abstraction are:
1. Physical level: the lowest level of abstraction, which describes how data are stored.
2. Logical level: the next higher level of abstraction, which describes what data are stored in the
database and what relationships exist among those data.
3. View level: the highest level of abstraction, which describes only part of the entire database.
The two integrity rules are:
1. Entity integrity: a primary key cannot have a NULL value.
2. Referential integrity: a foreign key must either be NULL or match a
primary key value of the referenced relation.
Extension: the set of tuples present in a table at any instant. It is time dependent.
Intension: a constant value that gives the name and structure of the table and the constraints laid
on it.
System R was designed and developed between 1974 and 1979 at the IBM San Jose Research
Center. It was a prototype whose purpose was to demonstrate that it is possible to build a
relational system that can be used in a real-life environment to solve real-life problems, with
performance at least comparable to that of existing systems.
Its two major subsystems are:
- Research Storage System
- Relational Data System
Data independence means that the application is independent of the storage structure and
access strategy of data. In other words, the ability to modify the schema definition at one
level should not affect the schema definition at the next higher level.
1. Physical data independence: a modification at the physical level should not affect the logical
level.
2. Logical data independence: a modification at the logical level should not affect the view level.
A view may be thought of as a virtual table, that is, a table that does not really exist in its own
right but is instead derived from one or more underlying base tables. In other words, there is no
stored file that directly represents the view; instead, a definition of the view is stored in the data
dictionary.
Growth and restructuring of base tables are not reflected in views. Thus a view can insulate
users from the effects of restructuring and growth in the database, which accounts for logical
data independence.
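As a rough illustration of a view being only a stored definition, the sketch below uses Python's built-in sqlite3 module rather than SQL Server. The table and view names (employee, employee_names) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Asha', 5000), (2, 'Ravi', 7000);
CREATE VIEW employee_names AS SELECT id, name FROM employee;
""")

# Restructure the base table: users of the view are not affected.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")

view_rows = conn.execute(
    "SELECT * FROM employee_names ORDER BY id"
).fetchall()

# Only the view's definition is stored in the catalog, not its rows.
view_def = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'employee_names'"
).fetchone()[0]

print(view_rows)
print(view_def)
```

The view still returns the same two columns after the base table gains a new one, which is the insulation effect described above.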
A data model is a collection of conceptual tools for describing data, data relationships, data
semantics and constraints.
E-R model stands for Entity-Relationship model. This data model is based on a perception of the
real world that consists of basic objects called entities and of relationships among these objects.
Entities are described in a database by a set of attributes.
The object-oriented model is based on a collection of objects. An object contains values stored in
instance variables within the object. An object also contains bodies of code that operate on the object.
These bodies of code are called methods. Objects that contain the same types of values and the
same methods are grouped together into classes.
The collection of entities of a particular entity type is grouped together into an entity set.
An entity set may not have sufficient attributes to form a primary key. When its primary key
comprises its partial key and the primary key of its parent entity, it is said to be a weak
entity set.
A relation schema, denoted by R(A1, A2, ..., An), is made up of the relation name R and the list
of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be a relation
that contains the set of tuples (t1, t2, t3, ..., tn). Each tuple is an ordered list of n values
t = (v1, v2, ..., vn).
Relationship set: the collection (or set) of similar relationships.
Relationship type: defines a set of associations, or a relationship set, among
a given set of entity types.
Degree of a relationship type: the number of entity types participating.
A database schema is specified by a set of definitions expressed in a special language called
DDL (data definition language).
SDL (storage definition language) is used to specify the internal schema. It may also specify the
mapping between two schemas.
The storage structures and access methods used by a database system are specified by a set of
definitions in a special type of DDL called the data storage-definition language.
DML (data manipulation language) enables users to access or manipulate data as organised by the
appropriate data model.
- Procedural DML (low level): requires the user to specify what data are needed
and how to get those data.
- Non-procedural DML (high level): requires the user to specify what data are
needed without specifying how to get those data.
The DML compiler translates DML statements in a query language into low-level instructions that
the query evaluation engine can understand.
Record-at-a-time: the low-level or procedural DML can specify and retrieve each record from a set
of records. This style of retrieval is called record-at-a-time.
Set-at-a-time: the high-level or non-procedural DML can specify and retrieve many records in a
single DML statement. This style of retrieval is called set-at-a-time or set-oriented.
Relational algebra is a procedural query language. It consists of a set of operations that take one
or two relations as input and produce a new relation.
Relational calculus is an applied predicate calculus specifically tailored for relational databases,
proposed by E.F. Codd. Examples of languages based on it are DSL ALPHA and QUEL.
The tuple-oriented calculus uses tuple variables, i.e., variables whose only permitted values are
tuples of a given relation. E.g. QUEL.
The domain-oriented calculus has domain variables, i.e., variables that range over the
underlying domains instead of over relations. E.g. ILL, DEDUCE.
Normalization is the process of analysing the given relation schemas based on their functional
dependencies (FDs) and primary keys to achieve the following properties:
- Minimizing redundancy
- Minimizing insertion, deletion and update anomalies.
A functional dependency, denoted by X --> Y, between two sets of attributes X and Y that are
subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The
constraint is that for any two tuples t1 and t2 in r, if t1[X] = t2[X] then t1[Y] = t2[Y]. This
means that the value of the X component of a tuple uniquely determines the value of the Y component.
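The constraint can be checked mechanically on a given relation state. The following is a minimal sketch in Python; the function name fd_holds and the sample employees relation are made up for illustration. Note that a check like this only shows whether the dependency holds in one particular state, not that it holds for the schema in general.

```python
def fd_holds(rows, x, y):
    """Return True if the functional dependency X -> Y holds in `rows`.

    rows: list of dicts, one per tuple of the relation state r
    x, y: lists of attribute names (the sets X and Y)
    """
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        # Remember the first Y value seen for each X value; any later
        # tuple with the same X but a different Y violates X -> Y.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


employees = [
    {"emp_id": 1, "dept": "Sales", "city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "city": "Pune"},
    {"emp_id": 3, "dept": "HR", "city": "Delhi"},
    {"emp_id": 4, "dept": "HR", "city": "Pune"},
]

print(fd_holds(employees, ["emp_id"], ["dept"]))  # True
print(fd_holds(employees, ["dept"], ["city"]))    # False: HR maps to two cities
```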
A set of functional dependencies F is minimal if:
- Every dependency in F has a single attribute on its right-hand side.
- We cannot replace any dependency X --> A in F with a dependency Y --> A, where Y is a
proper subset of X, and still have a set of dependencies that is equivalent to F.
- We cannot remove any dependency from F and still have a set of dependencies that is
equivalent to F.
A multivalued dependency, denoted by X -->> Y and specified on relation schema R, where X and Y
are both subsets of R, specifies the following constraint on any relation r of R: if two tuples t1 and
t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the
following properties, where Z = R - (X ∪ Y):
- t3[X] = t4[X] = t1[X] = t2[X]
- t3[Y] = t1[Y] and t4[Y] = t2[Y]
- t3[Z] = t2[Z] and t4[Z] = t1[Z]
The lossless join property guarantees that spurious tuples are not generated when the relation
schemas obtained by decomposition are joined back together.
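Spurious tuples are easiest to see on a tiny example. The sketch below is an illustration in Python; the helper names project and natural_join and the relation R(A, B, C) are made up. It decomposes one relation in two ways and joins the pieces back.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, returning a set of tuples."""
    return {tuple(r[a] for a in attrs) for r in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two sets of tuples on their common attributes."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = attrs1 + [a for a in attrs2 if a not in attrs1]
    result = set()
    for t1 in r1:
        d1 = dict(zip(attrs1, t1))
        for t2 in r2:
            d2 = dict(zip(attrs2, t2))
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                result.add(tuple(merged[a] for a in out_attrs))
    return result


rows = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 2, "B": "x", "C": "q"},
]
original = project(rows, ["A", "B", "C"])

# Decomposing on B loses information: B does not determine A or C.
lossy = natural_join(project(rows, ["A", "B"]), ["A", "B"],
                     project(rows, ["B", "C"]), ["B", "C"])

# Decomposing on A is lossless here: A is a key of the relation.
lossless = natural_join(project(rows, ["A", "B"]), ["A", "B"],
                        project(rows, ["A", "C"]), ["A", "C"])

print(sorted(lossy - original))  # the spurious tuples
print(lossless == original)
```

The first decomposition yields four tuples, two of which were never in the original relation. The second one reproduces the original exactly.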
A relation is in 1NF (first normal form) if the domain of every attribute includes only atomic
(simple, indivisible) values.
A relation schema R is in 3NF if it is in 2NF and, for every FD X --> A, either of the following is
true:
- X is a superkey of R.
- A is a prime attribute of R.
In other words, every non-prime attribute is non-transitively dependent on the primary key.
A relation schema R is in BCNF if it is in 3NF and satisfies the additional constraint that for
every nontrivial FD X --> A, X must be a superkey of R.
A relation schema R is said to be in 4NF if, for every multivalued dependency X -->> Y that
holds over R, one of the following is true:
- Y is a subset of X, or XY = R (the dependency is trivial).
- X is a superkey of R.
A relation schema R is said to be in 5NF if, for every join dependency {R1, R2, ..., Rn} that holds
over R, one of the following is true:
- Ri = R for some i.
- The join dependency is implied by the set of FDs over R in which the left side is a key of
R.
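1. What are DDL, DML, DCL, TCL and DSPL in the RDBMS world?
DDL: Data Definition Language statements.
DML: Data Manipulation Language statements.
DCL: Data Control Language statements.
TCL: Transaction Control Language statements.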
DSPL: Database stored procedure languages came to relational databases relatively late in
the game, and thus the languages used for triggers, event handlers, and stored procedures are
completely different among the database vendors. Oracle's PL/SQL is quite different even in
statement syntax from SQL Server's Transact-SQL, which in turn differs again from DB2's
stored procedure language. And of course, given the underlying differences in DDL, DML, and
DCL, it is inevitable that the stored procedure languages would vary in content as well as
syntax.
2. Define candidate key, alternate key, composite key.
A candidate key is one that can identify each row of a table uniquely. Generally a candidate key
becomes the primary key of the table. If the table has more than one candidate key, one of them
will become the primary key, and the rest are called alternate keys.
A key formed by combining two or more columns is called a composite key.
3. How to Reset the Identity Values?
1. DBCC CHECKIDENT ('table_name', RESEED, 0)
2. Truncate the table.
GRANT
Creates an entry in the security system that allows a user in the current database to work with
data in the current database or execute specific Transact-SQL statements.
Syntax
Statement permissions:
Object permissions:
GRANT
{ ALL [ PRIVILEGES ] | permission [ ,...n ] }
{
[ ( column [ ,...n ] ) ] ON { table | view }
| ON { table | view } [ ( column [ ,...n ] ) ]
| ON { stored_procedure | extended_procedure }
| ON { user_defined_function }
}
TO security_account [ ,...n ]
[ WITH GRANT OPTION ]
[ AS { group | role } ]
REVOKE
Removes a previously granted or denied permission from a user in the current database.
Syntax
Statement permissions:
Object permissions:
DENY
Creates an entry in the security system that denies a permission from a security account in the
current database and prevents the security account from inheriting the permission through its
group or role memberships.
Syntax
Statement permissions:
Object permissions:
DENY
{ ALL [ PRIVILEGES ] | permission [ ,...n ] }
{
[ ( column [ ,...n ] ) ] ON { table | view }
| ON { table | view } [ ( column [ ,...n ] ) ]
| ON { stored_procedure | extended_procedure }
| ON { user_defined_function }
}
TO security_account [ ,...n ]
[ CASCADE ]
Stored procedures accept input parameters so that a single procedure can be used over the
network by several clients using different input data. A stored procedure can also return
output parameters. Stored procedures reduce network traffic and improve performance. Stored
procedures can be used to help ensure the integrity of the database.
E.g. sp_help, sp_helpdb, sp_renamedb, sp_depends etc. (Alt + F1 runs sp_help in Query Analyzer.)
A stored procedure can take up to 1024 input parameters and return up to 1024 output parameters;
the exact limit depends on the SQL Server version.
6. What is a Trigger?
Triggers are used to enforce business rules in the RDBMS. A trigger is a SQL procedure that
initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored
in and managed by the RDBMS. Triggers are used to maintain the referential integrity of data
by changing the data in a systematic fashion. A trigger cannot be called or executed directly. The
RDBMS automatically fires the trigger as a result of a data modification to the associated table.
Triggers can be viewed as similar to stored procedures in that both consist of procedural logic
that is stored at the database level. Stored procedures, however, are not event-driven and are not
attached to a specific table as triggers are. Stored procedures are explicitly executed by
invoking a CALL to the procedure, while triggers are implicitly executed. In addition, triggers
can also execute stored procedures.
Types of triggers:
1. After Trigger
2. Instead of Trigger
Nested Trigger: Like stored procedures, triggers can be nested up to 32 levels. A trigger
can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is
fired because of a data modification it can also cause another data modification, thereby firing
another trigger. A trigger that contains data modification logic within itself is called a nested
trigger. For nested triggers, the user has to define the execution order of the triggers.
A trigger has access to two magic tables, inserted and deleted, which have the same structure as
the table on which the trigger executes.
7. What is a View?
A view is a virtual table that stores only its SELECT query, not the data. Users can
perform Insert/Update/Delete operations through a view (subject to certain restrictions). A view
can provide better security. Users can define indexes on views. An Instead of trigger can fire on a view.
8. What is an Index?
Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve
data more quickly.
There are clustered and nonclustered indexes. A clustered index is a special type of index that
reorders the way records in the table are physically stored. Therefore a table can have only one
clustered index. The leaf nodes of a clustered index contain the data pages.
A nonclustered index is a special type of index in which the logical order of the index does not
match the physical stored order of the rows on disk. The leaf nodes of a nonclustered index do
not consist of the data pages. Instead, the leaf nodes contain index rows. SQL Server 2000 allows
up to 249 nonclustered indexes per table.
Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See SQL Server Books Online for
more information.
Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network
round trip, whereas a normal SELECT query makes only one round trip, however large the
result set is. Cursors are also costly because they require more resources and temporary storage
(which results in more IO operations). Further, there are restrictions on the SELECT statements that
can be used with some types of cursors.
11. What is the use of DBCC commands?
DBCC stands for database consistency checker. We use these commands to check the
consistency of databases, i.e., for maintenance, validation tasks and status checks.
E.g. DBCC CHECKDB: ensures that the tables in the database and their indexes are correctly linked.
DBCC CHECKALLOC: checks that all pages in a database are correctly allocated.
DBCC CHECKFILEGROUP: checks all tables in a filegroup for any damage.
12. What is a Linked Server?
Think of a Linked Server as an alias on your local SQL Server that points to an external data
source. This external data source can be Access, Oracle, Excel or almost any other data system
that can be accessed by OLE DB or ODBC, including other MS SQL Servers. An MS SQL linked
server is similar to the MS Access feature of creating a "Link Table."
TRUNCATE
TRUNCATE is faster and uses fewer system and transaction log resources than DELETE.
TRUNCATE removes the data by deallocating the data pages used to store the table's data, and
only the page deallocations are recorded in the transaction log.
TRUNCATE removes all rows from a table, but the table structure and its columns, constraints,
indexes and so on remain. TRUNCATE resets the identity value.
Inline UDFs can be thought of as views that take parameters and can be used in JOINs and
other rowset operations.
We cannot write configuration statements in UDFs. UDFs can return only one value, whereas
stored procedures can return multiple output parameters (up to 1024).
17. When is the UPDATE STATISTICS command used?
This command is basically used after a large amount of data processing has occurred. If a large
number of deletions, modifications or bulk copies into the tables have occurred, the statistics on
the indexes have to be updated to take these changes into account. UPDATE STATISTICS refreshes
the index statistics on these tables accordingly.
18. What types of Joins are possible with SQL Server?
Joins are used in queries to express how different tables are related. Joins also let us select data
from a table depending upon data from another table.
Types of joins: SELF JOINs, MERGE JOINs, INNER JOINs, OUTER JOINs, CROSS JOINs.
OUTER JOINs are further classified as LEFT OUTER JOINs, RIGHT OUTER JOINs and
FULL OUTER JOINs.
19. What is the difference between a HAVING clause and a WHERE clause?
HAVING specifies a search condition for a group or an aggregate. HAVING can be used only with the
SELECT statement and is typically used with a GROUP BY clause. When GROUP BY is
not used, HAVING behaves like a WHERE clause. The WHERE clause is applied to each row before
the rows become part of a group, whereas the HAVING clause is applied to the groups after
aggregation. It is good practice to use a WHERE clause together with GROUP BY to filter rows
early, for better performance.
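The difference is easy to see with a small table. The sketch below uses Python's built-in sqlite3 module instead of SQL Server, and the orders table with its data is made up; the two queries themselves are plain SQL that should behave the same way in T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount INTEGER);
INSERT INTO orders VALUES
    ('A', 10), ('A', 200), ('B', 50), ('B', 60), ('C', 500);
""")

# WHERE filters individual rows BEFORE grouping:
# the order ('A', 10) is dropped, then the remaining rows are summed.
where_rows = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    WHERE amount >= 50
    GROUP BY customer
    ORDER BY customer
""").fetchall()

# HAVING filters whole groups AFTER aggregation:
# customer B (total 110) is dropped because the group total is under 200.
having_rows = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) >= 200
    ORDER BY customer
""").fetchall()

print(where_rows)   # [('A', 200), ('B', 110), ('C', 500)]
print(having_rows)  # [('A', 210), ('C', 500)]
```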
20. What is SQL Profiler?
It is a tool that helps us profile activity at the database level. It is good practice to
run the profiler from a different machine rather than on the production machine.
SQL Profiler is a graphical tool that allows system administrators to monitor events in an
instance of Microsoft SQL Server. We can capture and save data about each event to a file or
SQL Server table to analyze later. For example, you can monitor a production environment to
see which stored procedures are hampering performance by executing too slowly.
Use SQL Profiler to monitor only the events in which you are interested. If traces are becoming
too large, you can filter them based on the information you want, so that only a subset of the
event data is collected. Monitoring too many events adds overhead to the server and the
monitoring process and can cause the trace file or trace table to grow very large, especially
when the monitoring process takes place over a long period of time.
21. What are User-Defined Functions?
User-Defined Functions allow users to define their own T-SQL functions that can accept 0 or more
parameters and return a single scalar data value or a table data type.
22. Which TCP/IP port does SQL Server run on? How can it be changed?
SQL Server runs on port 1433 by default. It can be changed from the Network Utility: TCP/IP
properties -> Port number, on both the client and the server.
23. What are the authentication modes in SQL Server? How can they be changed?
Windows mode and mixed mode (SQL & Windows).
24. Where are SQL Server user names and passwords stored?
They are stored in the master database, in the sysxlogins table.
25. Which command using Query Analyzer will give you the version of SQL Server and
operating system?
SELECT SERVERPROPERTY('productversion'), SERVERPROPERTY('productlevel'),
SERVERPROPERTY('edition')
SELECT @@VERSION
In transactional replication, an initial snapshot of data is applied at Subscribers, and then when
data modifications are made at the Publisher, the individual transactions are captured and
applied to Subscribers.
Merge replication is the process of distributing data from Publisher to Subscribers, allowing the
Publisher and Subscribers to make updates while connected or disconnected, and then merging
the updates between sites when they are connected.
What are the OS services that the SQL Server installation adds?
MS SQL Server service, SQL Agent service, DTC (Distributed Transaction Coordinator).
32 What does it mean to have quoted_identifier on? What are the implications of having it off?
When SET QUOTED_IDENTIFIER is ON, identifiers can be delimited by double quotation
marks, and literals must be delimited by single quotation marks. When SET
QUOTED_IDENTIFIER is OFF, identifiers cannot be quoted and must follow all Transact-
SQL rules for identifiers.
33. What is the STUFF function and how does it differ from the REPLACE function?
The STUFF function overwrites existing characters at a given position. In the syntax
STUFF(string_expression, start, length, replacement_characters), string_expression is the string
that will have characters substituted, start is the starting position, length is the number of
characters in the string that are substituted, and replacement_characters are the new characters
inserted into the string.
The REPLACE function replaces every occurrence of a given substring. In the syntax
REPLACE(string_expression, search_string, replacement_string), every occurrence of
search_string found in string_expression is replaced with replacement_string.
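To make the contrast concrete, here is a simplified Python emulation of the two functions. It is only a sketch: stuff and replace below are illustrative Python helpers, not the T-SQL built-ins, and they ignore details such as NULL results for out-of-range arguments and collation-dependent case sensitivity.

```python
def stuff(string_expression, start, length, replacement_characters):
    """Position-based: delete `length` characters beginning at the
    1-based position `start`, then insert the replacement there."""
    i = start - 1  # convert the 1-based SQL position to a 0-based index
    return (string_expression[:i]
            + replacement_characters
            + string_expression[i + length:])


def replace(string_expression, search_string, replacement_string):
    """Content-based: substitute every occurrence of search_string."""
    return string_expression.replace(search_string, replacement_string)


print(stuff("abcdef", 2, 3, "XYZ12"))  # aXYZ12ef
print(replace("abcabc", "bc", "X"))    # aXaX
```

STUFF works on a position and touches the string exactly once; REPLACE works on content and touches every match.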
34. Using Query Analyzer, name 3 ways to get an accurate count of the number of records in a
table?
SELECT COUNT(*) FROM table1
SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table1') AND indid < 2
35. What are the basic functions of the master, msdb, model and tempdb databases?
The master database stores information about the SQL Server configuration, databases, users
etc.
The msdb database stores information regarding database backups, SQL Agent information,
DTS packages, backup and restore history, SQL Server jobs, and some replication information
such as for log shipping.
The tempdb database holds temporary objects such as global and local temporary tables and stored
procedures.
The model database is essentially a template used in the creation of any new user database
created in the instance.
36 What are primary keys and foreign keys?
Primary keys are the unique identifiers for each row. They must contain unique values and
cannot be null. Due to their importance in relational databases, Primary keys are the most
fundamental of all keys and constraints. A table can have only one Primary key.
Foreign keys are both a method of ensuring data integrity and a manifestation of the
relationship between tables.
A FOREIGN KEY constraint prevents any actions that would destroy links between tables with
the corresponding data values. A foreign key in one table points to a primary key in another
table. Foreign keys prevent actions that would leave rows with foreign key values when there
are no primary keys with that value. Foreign key constraints are used to enforce referential
integrity.
A CHECK constraint is used to limit the values that can be placed in a column. Check
constraints are used to enforce domain integrity.
A NOT NULL constraint enforces that the column will not accept null values. Like check
constraints, not null constraints are used to enforce domain integrity.
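The three constraints can be watched in action with a short sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so details differ (SQLite only enforces foreign keys after PRAGMA foreign_keys = ON), and the department and employee tables and the try_insert helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: off by default
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,                             -- NOT NULL constraint
    age     INTEGER CHECK (age >= 18),                 -- CHECK constraint
    dept_id INTEGER REFERENCES department(dept_id)     -- FOREIGN KEY constraint
);
INSERT INTO department VALUES (1, 'Sales');
""")


def try_insert(sql):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


statements = [
    "INSERT INTO employee VALUES (1, 'Asha', 30, 1)",     # valid row
    "INSERT INTO employee VALUES (2, 'Ravi', 30, 99)",    # no department 99
    "INSERT INTO employee VALUES (3, 'Mina', 12, 1)",     # fails CHECK
    "INSERT INTO employee VALUES (4, NULL, 30, 1)",       # fails NOT NULL
    "INSERT INTO employee VALUES (5, 'Omar', 30, NULL)",  # NULL foreign key is allowed
]
results = [try_insert(s) for s in statements]
print(results)
```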
38 What is Identity?
Identity (or AutoNumber) is a column that automatically generates numeric values. A start and
increment value can be set.
39. How do you load large data into a SQL Server database?
Bulk Copy (BCP) is a tool used to copy huge amounts of data from tables. The BULK INSERT command
imports a data file into a database table or view in a user-specified format.
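40. How to know which index a table is using?
SELECT table_name, index_name FROM user_constraints
(user_constraints is an Oracle data dictionary view. In SQL Server, EXEC sp_helpindex 'table_name'
lists the indexes defined on a table.)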
41. How to copy the tables, schema and views from one SQL Server to another?
Microsoft SQL Server 2000 Data Transformation Services (DTS) is a set of graphical tools and
programmable objects that let users extract, transform, and consolidate data from disparate
sources into single or multiple destinations.
42. What is a Self Join?
This is the particular case in which a table joins to itself, using one or two aliases to avoid
confusion. A self join can be of any type, as long as the joined tables are the same. A self join is
rather unique in that it involves a relationship with only one table. The common example is
when a company has a hierarchical reporting structure whereby one member of staff reports to
another.
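The reporting-structure example can be sketched as follows. The code runs against SQLite through Python's built-in sqlite3 module, and the staff table and its rows are made up; the query itself is ordinary SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO staff VALUES
    (1, 'Asha', NULL),   -- top of the hierarchy, reports to nobody
    (2, 'Ravi', 1),
    (3, 'Mina', 1),
    (4, 'Omar', 2);
""")

# The same table appears twice under two aliases:
# e is the employee row, m is that employee's manager row.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM staff AS e
    JOIN staff AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(pairs)  # [('Ravi', 'Asha'), ('Mina', 'Asha'), ('Omar', 'Ravi')]
```

Because this is an inner join, Asha does not appear as an employee: her manager_id is NULL and matches no row. A LEFT OUTER self join would keep her with a NULL manager.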
43. What is a Cross Join?
A cross join that does not have a WHERE clause produces the Cartesian product of the tables
involved in the join. The size of a Cartesian product result set is the number of rows in the first
table multiplied by the number of rows in the second table. The common example is when a
company wants to combine each product with a pricing table to analyze each product at each
price.
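A quick way to confirm the row-count rule is to run a cross join on two tiny tables. This sketch again uses Python's built-in sqlite3 module, with made-up product and price_tier tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (name TEXT);
CREATE TABLE price_tier (price INTEGER);
INSERT INTO product VALUES ('Pen'), ('Book'), ('Bag');
INSERT INTO price_tier VALUES (10), (20);
""")

# Every product is paired with every price: 3 rows x 2 rows = 6 rows.
rows = conn.execute("""
    SELECT p.name, t.price
    FROM product AS p
    CROSS JOIN price_tier AS t
""").fetchall()

print(len(rows))  # 6
```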
45. What is an execution plan? When would you use it? How would you view the execution
plan?
An execution plan is basically a road map that graphically or textually shows the data retrieval
methods chosen by the SQL Server query optimizer for a stored procedure or ad-hoc query. It
is a very useful tool for a developer to understand the performance characteristics of a query or
stored procedure, since the plan is the one that SQL Server will place in its cache and use to
execute the stored procedure or query. Within Query Analyzer there is an option called "Show
Execution Plan" (located on the Query drop-down menu). If this option is turned on, the
query execution plan is displayed in a separate window when the query is run again.
46. What is BCP? When is it used?
Bulk Copy (BCP) is a tool used to copy huge amounts of data from tables and views. BCP does not
copy the structures from source to destination.