Surrogate Key
A Surrogate Key in SQL Server is a unique identifier for each row in a table. It is just a key: using it,
we can identify a unique row. A Surrogate Key carries no business meaning. This type of key is either
generated by the database or generated by another application (it is not supplied by the user).
A Surrogate Key is just a unique identifier for each row, and it may be used as a Primary Key. There is only
one requirement for a surrogate Primary Key: each row must have a unique value for that
column. A Surrogate Key is also known as an artificial key or identity key. It can be used in data
warehouses.
A Surrogate Key has the following features:
It holds a unique value
The key is generated by the system, in other words it is generated automatically
The key is not visible to the user (it is not a part of the application)
It is not composed of multiple keys
The key has no semantic meaning
Generally, a Surrogate Key is a sequential unique number generated by SQL Server or the database itself.
The purpose of a Surrogate Key is to act as the Primary Key. There is a slight difference between a
Surrogate Key and a Primary Key. Ideally, every row has both a Primary Key and a Surrogate Key. The
Primary Key identifies the unique row in the database while the Surrogate Key identifies a unique entity in
the model.
Note that Surrogate Keys are never used with any business logic other than simple Create, Read, Update
and Delete (CRUD) operations.
A Surrogate Key can be implemented with an auto-incremented key. SQL Server supports an
IDENTITY column to provide the auto-increment feature. It generates a unique number
when a new record is inserted into the database table.
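As a minimal sketch of the IDENTITY approach, assuming a hypothetical EmployeeName column on the EmployeeMaster table (the column list is illustrative only), the key is generated on insert and can be read back with SCOPE_IDENTITY():

```sql
-- Hypothetical table definition; only the IDENTITY column matters here.
CREATE TABLE EmployeeMaster
(
    EmployeeId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- seed 1, increment 1
    EmployeeName VARCHAR(100) NOT NULL
);

-- EmployeeId is left out of the column list; SQL Server generates it.
INSERT INTO EmployeeMaster (EmployeeName) VALUES ('Alice');

-- SCOPE_IDENTITY() returns the last identity value generated in the current scope.
SELECT SCOPE_IDENTITY() AS NewEmployeeId;
```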
A Surrogate Key can be implemented with a manually incremented key. Using the MAX() function we
can find the maximum value of a column and increment it by one. This approach
suffers from a performance problem when a table has a large amount of data, and two concurrent
sessions can compute the same value unless the read is protected by a lock.
--Example
DECLARE @newId INT;
-- ISNULL covers an empty table, where MAX() returns NULL.
SELECT @newId = ISNULL(MAX(EmployeeId), 0) + 1 FROM EmployeeMaster;
PRINT @newId;
-- The variable @newId can be used as the identifier of the newly inserted row.
GUID is Microsoft's implementation of the Universally Unique Identifier (UUID) standard. Using the NEWID()
function we can generate a new GUID in SQL Server. The value is 16 bytes long.
--Example
DECLARE @newId UNIQUEIDENTIFIER;
SET @newId = NEWID();
-- The variable @newId can be used as the identifier of the newly inserted row.
NEWSEQUENTIALID() can only be used with a DEFAULT constraint on a table column of type
uniqueidentifier. We cannot reference the NEWSEQUENTIALID() function directly in queries.
A UUID is a 128-bit value that is created from a hash of the Ethernet card ID and the current date and time
of the SQL Server machine.
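The DEFAULT-constraint usage of NEWSEQUENTIALID() can be sketched as follows. The OrderMaster table, its columns, and the constraint names are hypothetical and serve only as an illustration:

```sql
-- Hypothetical table; NEWSEQUENTIALID() is only valid inside a DEFAULT constraint.
CREATE TABLE OrderMaster
(
    OrderId   UNIQUEIDENTIFIER NOT NULL
              CONSTRAINT DF_OrderMaster_OrderId DEFAULT NEWSEQUENTIALID()
              CONSTRAINT PK_OrderMaster PRIMARY KEY,
    OrderDate DATETIME NOT NULL
);

-- OrderId is omitted, so the default supplies a sequential GUID.
INSERT INTO OrderMaster (OrderDate) VALUES (GETDATE());

-- The OUTPUT clause returns the generated key when the caller needs it.
INSERT INTO OrderMaster (OrderDate)
OUTPUT inserted.OrderId
VALUES (GETDATE());
```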
A Surrogate Key offers the following advantages:
A Surrogate Key does not change, so the application cannot lose its reference to a row in the
database.
If a business key value changes, the related foreign keys do not have to change across the database
because the Surrogate Key is used as the reference key. In other words, the Surrogate Key value
never changes, so the foreign key values stay stable.
A Surrogate Key is most often a compact data type such as an integer. A Surrogate Key is less
expensive in a join than a compound key.
No business logic is embedded in this key.
A table always has a uniform Surrogate Key, so some tasks can be easily automated by writing
table-independent code.
There is no locking contention because it is a unique identifier.
One cost to weigh against these benefits: a Surrogate Key requires an extra column, and usually an extra index, which takes additional space in the database.
The relationship between any two tables is simple and consistent in SQL code expressions.
Object Relational Mapping (ORM) frameworks such as Entity Framework, NHibernate, and so on
are designed to work optimally with Surrogate Keys. Surrogate Keys are much simpler to map than
composite keys.
It allows for a higher degree of normalization, so data is not duplicated within the database.
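The foreign key stability described above can be sketched as follows, assuming a hypothetical Customer and CustomerOrder schema in which CustomerCode is the business key and CustomerId is the Surrogate Key. None of these names come from the original text:

```sql
-- Hypothetical schema: CustomerCode is the natural (business) key,
-- CustomerId is the surrogate key that other tables reference.
CREATE TABLE Customer
(
    CustomerId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerCode VARCHAR(20)  NOT NULL UNIQUE,  -- business key is still enforced as unique
    CustomerName VARCHAR(100) NOT NULL
);

CREATE TABLE CustomerOrder
(
    OrderId    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerId INT NOT NULL REFERENCES Customer (CustomerId),
    OrderDate  DATETIME NOT NULL
);

-- Changing the business key touches one row in Customer;
-- no CustomerOrder rows need to be updated.
UPDATE Customer
SET CustomerCode = 'ACME-NEW'
WHERE CustomerCode = 'ACME-OLD';
```

Keeping the UNIQUE constraint on the business key is a design choice in this sketch: it prevents duplicate business records even though the Surrogate Key is the Primary Key.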
The alternative to a Surrogate Key is a Natural Key. A Natural Key is a true unique identifier in the
database. It is a single value or composite value that has business meaning. The Natural Key can be one or
more columns of any data type. When a table uses a Natural Key, there is no need to create an
additional unique index or sequence just for a Surrogate Key, which reduces administrative overhead.
A Natural Key has the following disadvantages:
A query join may become complex because the Natural Key can have one or more columns.
It leads to a less normalized design.
It is difficult and time consuming to use with an ORM because ORMs are designed to work best
with Surrogate Keys.
The key type is not consistent.
More work is required to change a Natural Key when the foreign key relationship has been built
by a Natural Key.
A Natural Key is larger than a Surrogate Key.
A Natural Key can be any data type, so it might require a long execution time in a "join" query.
For example, if there is a VARCHAR data type as a Natural Key type then the join between two
tables may take more time to produce output.
A Natural Key is assigned by the application, so the key alone does not show whether a record is new
or an existing record.
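The join and key-change costs listed above can be illustrated with a composite Natural Key. This is only a sketch; the Enrollment and Grade tables and all of their columns are hypothetical:

```sql
-- Hypothetical schema keyed by a composite natural key.
CREATE TABLE Enrollment
(
    StudentNumber VARCHAR(10) NOT NULL,
    CourseCode    VARCHAR(10) NOT NULL,
    Term          CHAR(6)     NOT NULL,
    CONSTRAINT PK_Enrollment PRIMARY KEY (StudentNumber, CourseCode, Term)
);

CREATE TABLE Grade
(
    StudentNumber VARCHAR(10)  NOT NULL,
    CourseCode    VARCHAR(10)  NOT NULL,
    Term          CHAR(6)      NOT NULL,
    Score         DECIMAL(5,2) NOT NULL,
    CONSTRAINT FK_Grade_Enrollment
        FOREIGN KEY (StudentNumber, CourseCode, Term)
        REFERENCES Enrollment (StudentNumber, CourseCode, Term)
        ON UPDATE CASCADE  -- a key change must be propagated to every child row
);

-- Every join has to repeat all three key columns.
SELECT e.StudentNumber, e.CourseCode, e.Term, g.Score
FROM Enrollment AS e
JOIN Grade AS g
  ON  g.StudentNumber = e.StudentNumber
  AND g.CourseCode    = e.CourseCode
  AND g.Term          = e.Term;
```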
Conclusion
A Surrogate Key is unique within its database table; it acts as an artificial alternative to a natural Primary
Key, which may be alphanumeric or a composite key. A Surrogate Key is always unique
per table.
Surrogate Keys offer many benefits. Their simplicity, consistency, and stability make them well suited to use
with an ORM. We can use a Natural Key instead of a Surrogate Key when the Natural Key is small and
is never updated.
I googled a lot, but I did not find the exact straight forward answer with an example.
asktonishant
asked Apr 21 '16 at 14:43
Dom
6 Answers
50
The primary key is a unique key in your table that you choose that best uniquely identifies a record in
the table. All tables should have a primary key, because if you ever need to update or delete a record
you need to know how to uniquely identify it.
A surrogate key is an artificially generated key. They're useful when your records essentially have no
natural key (such as a Person table, since it's possible for two people born on the same date to have
the same name, or records in a log, since it's possible for two events to happen such that they carry
the same timestamp). Most often you'll see these implemented as integers in an automatically
incrementing field, or as GUIDs that are generated automatically for each record. ID numbers are
almost always surrogate keys.
Unlike primary keys, not all tables need surrogate keys, however. If you have a table that lists the
states in America, you don't really need an ID number for them. You could use the state abbreviation
as a primary key code.
The main advantage of surrogate keys is that they're easy to guarantee as unique. The main
disadvantage is that they don't have any meaning. Nothing about "28" tells you it means Wisconsin, for
example, but when you see 'WI' in the State column of your Address table, you know what state
you're talking about without needing to look up which state is which in your State table.
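A minimal sketch of that State and Address example, with hypothetical column names, might look like this. The state abbreviation is the natural Primary Key of the State table, and the Address table stores it directly:

```sql
-- Hypothetical schema; the abbreviation itself is the primary key of State.
CREATE TABLE [State]
(
    StateCode CHAR(2)     NOT NULL PRIMARY KEY,  -- natural key, e.g. 'WI'
    StateName VARCHAR(50) NOT NULL
);

CREATE TABLE [Address]
(
    AddressId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- surrogate key
    Street    VARCHAR(100) NOT NULL,
    City      VARCHAR(50)  NOT NULL,
    StateCode CHAR(2)      NOT NULL REFERENCES [State] (StateCode)
);

-- The state is readable without joining to the State table.
SELECT Street, City, StateCode
FROM [Address];
```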
Bacon Bits
I think the main disadvantage is that sometimes when people use an autogenerated key
(and integers are often used instead of the natural key not just when no natural key
exists), they often forget to put unique indexes on the natural key that they didn't choose
as the PK. This often allows duplicates to get into the system which can create
problems. The two main advantages of autogenerated keys are that they generally
increase performance in the joins (if integers, not GUIDs) and they prevent mass
updating of lots of child records when the Natural Key changes. – HLGEM Apr 21 '16 at 15:22
@HLGEM Sure, I'll buy those. I think I focus on lack of meaning because I've just
worked in hypernormalized systems where essentially every field was its own table. It
made it impossible to tell where data entry errors had occurred, and very difficult to
apply business rules to locate problems. – Bacon Bits Apr 21 '16 at 18:02
I love normalization but you can indeed take it too far. – HLGEM Apr 21 '16 at 18:13
@HLGEM: Interesting, tell me more. You knew your love for normalization had gone too
far when...? – onedaywhen Oct 13 '16 at 15:27
@onedaywhen Imagine a database which never allowed null values for any field. As
such, nearly every field is in its own table, and nearly every join is an outer join because
you still don't have complete data. So, you haven't actually eliminated nulls, you've just
eliminated storing them. Trying to validate such a system with business rules after the
fact is virtually impossible because it's so difficult to compare records. Since every
potential relation might be many-to-one, you have partial cross joins appearing. And the
performance impact of every query having dozens of joins? – Bacon Bits Mar 16 '17 at 17:55
6
A surrogate key is a made up value with the sole purpose of uniquely identifying a row. Usually,
this is represented by an auto incrementing ID.
Example code:
tobypls
3
All keys are identifiers used as surrogates for the things they identify. E.F.Codd explained the
concept of system-assigned surrogates as follows [1]:
Database users may cause the system to generate or delete a surrogate, but they have no control over
its value, nor is its value ever displayed to them.
This is what is commonly referred to as a surrogate key. The definition is immediately problematic
however because Codd was assuming that such a feature would be provided by the DBMS. DBMSs
in general have no such feature. The keys are normally visible to at least some DBMS users as, for
obvious reasons, they have to be. The concept of a surrogate has therefore morphed slightly in usage.
The term is generally used in the data management profession to mean a key that is not exposed and
used as an identifier in the business domain. Note that this is essentially unrelated to how the key is
generated or how "artificial" it is perceived to be. All keys consist of symbols invented by humans or
machines. The only possible significance of the term surrogate therefore relates to how the key is used,
not how it is created or what its values are.
[1] Extending the database relational model to capture more meaning, E.F.Codd, 1979
nvogel
1
This is a great treatment describing the various kinds of keys:
http://www.agiledata.org/essays/keys.html
answered Apr 21 '16 at 14:46
n8wrl
1
A surrogate key is typically a numeric value. Within SQL Server, Microsoft allows you to define a
column with an identity property to help generate surrogate key values.
The PRIMARY KEY constraint uniquely identifies each record in a database table. Primary keys
must contain UNIQUE values. A primary key column cannot contain NULL values. Most tables
should have a primary key, and each table can have only ONE primary key.
http://www.databasejournal.com/features/mssql/article.php/3922066/SQL-Server-Natural-Key-Verses-Surrogate-Key.htm
answered Apr 21 '16 at 15:00
Bishoy Frank
-1
I think Michelle Poolet describes it in a very clear way: