There is a fundamental difference between the two. SSN, VIN, ISBN, UPC, etc are verifiable, *standardized*, industry-recognized identifiers. An IDENTITY/GUID may as well be a physical locator: it is unverifiable and has absolutely no meaning in the real
world.
That's exactly the point. An IDENTITY/GIUD has no meaning outside the scope of the system, so it can be completely controlled *by* the system and isn't subject to the variances, outliers, and rules of application that exist in the real world.
Alex Papadimoulis
Tell me which rent-a-car agency has pre-1980 cars. How many companies do you know that lease pre-1980 vehicles? Cars and Antique Cars are two different types of entities in the business world and are generally treated as such.
Case in point. Database designs aren't broken by surrogate keys, they're broken by designers who don't *think* about their designs, and the data that they choose for keys, surrogate or natural. If, for instance, you were building an application for managing
college student housing, and you've picked the VIN number to serve as a primary key on the table where you're tracking their cars, your system would be broken the first time some kid with an antique car he got for graduation drives onto campus. The guy who
build a system using a surrogate key may not have a database that's as relationally pure, but his still works while yours is broken, waiting for you to come back in and change your schema to account for conditions that you hadn't accounted for because you
relied on natural data that turned out to be less consistant than you expected. This isn't just about database systems for the automotive industry, where they have reason to seperate cars between "modern" and "collectible", but in every system that will
track automobiles where that delineation doesn't make sense, and isn't even going to recognized by someone less familiar with VINs.
Alex Papadimoulis
As aforementioned, SSN is not a pointer. I concede it's not good to use everywhere but as a US Employer, all of your employees *must* have a SSN (foreigners need to apply for a 9xx TIN). You can't give them a paycheck otherwise.
I disagree. It *is* a pointer, just a pointer that's assigned by an SSA database, instead of yours. Industry-standardization doesn't make at any more intrinstic to a row describing a person in an SSA database table, than an incremented integer does to a row
describing a employee in yours. Think of an SSN as a short version of a GUID used as an identifer in that SSA database table, and the whole house of cards starts to come down. Just 'cause it gets printed on a card doesn't mean much...I can print my identity
values on a piece of paper too.
Alex Papadimoulis
Yes, they do recycle SSN. But I'm not aware of circumstances where you pay dead employees either.
And people who *have* had recycled SSNs have had very difficult lives becuase of system designers who made faulty assumptions about the uniqueness of an SSN and it's suitabily as a key. They were designers who wanted to use natural keys and unwittingly designed
broken systems that did so. *These* are the systems that end up needing consultants to come in and clean up the debris, and rebuild properly.
Points well made. I think natural keys are nice in theory but not can lead to many problems in real world apps. Synthetic key(s) do not make the system any less maintainable and often make business rule changes easier.
I could be wrong but I think IDENTITY and SEQUENCE which is in ANSI SQL 2003 and implemented in Oracle are not artificial while GUID and Surrogate keys are artificial. Try the links below for more info. Hope this helps.
I could be wrong but I think IDENTITY and SEQUENCE which is in ANSI SQL 2003 and implemented in Oracle are not artificial while GUID and Surrogate keys are artificial. Try the links below for more info. Hope this helps.
That is not correct. Identity values, sequences, auto-numbers, GUIDs...These are all examples of surrogate(artificial) keys.
ANSI SQL 2003 page 57 says IDENTITY column comes from BT( base table) and there can only be one per table so though all auto generated IDENTITY and Surrogate keys are not the same. Run a search for Surrogate key in SQL Server BOL (books online) and you
get four pages all datawarehouse related like the link posted. ANSI SQL 2003 also added IDENTITY are integer based but GUID in SQL Server is 16 bytes binary datatype like the second link I posted. Now I am saying based on SQL Server documentation and ANSI
SQL 2003 they are not the same. The text below is from SQL Server BOL (books online). So I was correct IDENTITY and Surrogate keys are not the same albeit both auto generated.
(The implementation and management of surrogate keys is the responsibility of the data warehouse. OLTP systems are rarely affected by these situations, and the purpose of these keys is to accurately track history in the data warehouse. Surrogate keys are maintained
in the data preparation area during the data transformation process.)
I did not misunderstand the meaning of surrogate key, the only thing a surrogate key have in common with IDENTITY is auto generation. The Oracle FAQ info is obsolete because we are in Oracle 10g and the info was for Oracle 6. And SQL Server is not Sybase
from version 7.0 and we are now in version 2000 and 2005 almost live.
The word Surrogate key is not part of ANSI SQL, from the time it was created it has always been some thing used in ETL(extraction transformation and loading) in Datawarehouse.
mru22
Member
419 Points
134 Posts
Re: Question on design with Identity columns
Aug 29, 2005 08:29 PM|LINK
Scroll down to the August 19th posting.
http://spaces.msn.com/members/drsql/
Xanderno
Contributor
6040 Points
1200 Posts
Re: Question on design with Identity columns
Aug 29, 2005 08:49 PM|LINK
That's exactly the point. An IDENTITY/GIUD has no meaning outside the scope of the system, so it can be completely controlled *by* the system and isn't subject to the variances, outliers, and rules of application that exist in the real world.
Case in point. Database designs aren't broken by surrogate keys, they're broken by designers who don't *think* about their designs, and the data that they choose for keys, surrogate or natural. If, for instance, you were building an application for managing college student housing, and you've picked the VIN number to serve as a primary key on the table where you're tracking their cars, your system would be broken the first time some kid with an antique car he got for graduation drives onto campus. The guy who build a system using a surrogate key may not have a database that's as relationally pure, but his still works while yours is broken, waiting for you to come back in and change your schema to account for conditions that you hadn't accounted for because you relied on natural data that turned out to be less consistant than you expected. This isn't just about database systems for the automotive industry, where they have reason to seperate cars between "modern" and "collectible", but in every system that will track automobiles where that delineation doesn't make sense, and isn't even going to recognized by someone less familiar with VINs.
I disagree. It *is* a pointer, just a pointer that's assigned by an SSA database, instead of yours. Industry-standardization doesn't make at any more intrinstic to a row describing a person in an SSA database table, than an incremented integer does to a row describing a employee in yours. Think of an SSN as a short version of a GUID used as an identifer in that SSA database table, and the whole house of cards starts to come down. Just 'cause it gets printed on a card doesn't mean much...I can print my identity values on a piece of paper too.
And people who *have* had recycled SSNs have had very difficult lives becuase of system designers who made faulty assumptions about the uniqueness of an SSN and it's suitabily as a key. They were designers who wanted to use natural keys and unwittingly designed broken systems that did so. *These* are the systems that end up needing consultants to come in and clean up the debris, and rebuild properly.
mru22
Member
419 Points
134 Posts
Re: Question on design with Identity columns
Aug 29, 2005 09:13 PM|LINK
Caddre
All-Star
26581 Points
5308 Posts
Re: Question on design with Identity columns
Aug 29, 2005 09:33 PM|LINK
I could be wrong but I think IDENTITY and SEQUENCE which is in ANSI SQL 2003 and implemented in Oracle are not artificial while GUID and Surrogate keys are artificial. Try the links below for more info. Hope this helps.
http://www.intelligententerprise.com/showArticle.jhtml?articleID=54200319
http://forums.asp.net/967574/ShowPost.aspx
http://www.firstsql.com/inulls4.htm
Gift Peddie
Xanderno
Contributor
6040 Points
1200 Posts
Re: Question on design with Identity columns
Aug 29, 2005 10:30 PM|LINK
That is not correct. Identity values, sequences, auto-numbers, GUIDs...These are all examples of surrogate(artificial) keys.
Caddre
All-Star
26581 Points
5308 Posts
Re: Question on design with Identity columns
Aug 30, 2005 04:04 AM|LINK
ANSI SQL 2003 page 57 says IDENTITY column comes from BT( base table) and there can only be one per table so though all auto generated IDENTITY and Surrogate keys are not the same. Run a search for Surrogate key in SQL Server BOL (books online) and you get four pages all datawarehouse related like the link posted. ANSI SQL 2003 also added IDENTITY are integer based but GUID in SQL Server is 16 bytes binary datatype like the second link I posted. Now I am saying based on SQL Server documentation and ANSI SQL 2003 they are not the same. The text below is from SQL Server BOL (books online). So I was correct IDENTITY and Surrogate keys are not the same albeit both auto generated.
(The implementation and management of surrogate keys is the responsibility of the data warehouse. OLTP systems are rarely affected by these situations, and the purpose of these keys is to accurately track history in the data warehouse. Surrogate keys are maintained in the data preparation area during the data transformation process.)
Gift Peddie
Xanderno
Contributor
6040 Points
1200 Posts
Re: Question on design with Identity columns
Aug 30, 2005 04:29 AM|LINK
Caddre
All-Star
26581 Points
5308 Posts
Re: Question on design with Identity columns
Aug 30, 2005 04:46 AM|LINK
I did not misunderstand the meaning of surrogate key, the only thing a surrogate key have in common with IDENTITY is auto generation. The Oracle FAQ info is obsolete because we are in Oracle 10g and the info was for Oracle 6. And SQL Server is not Sybase from version 7.0 and we are now in version 2000 and 2005 almost live.
The word Surrogate key is not part of ANSI SQL, from the time it was created it has always been some thing used in ETL(extraction transformation and loading) in Datawarehouse.
Gift Peddie
Xanderno
Contributor
6040 Points
1200 Posts
Re: Question on design with Identity columns
Aug 30, 2005 04:55 AM|LINK
Wrong. If you looked at it, the date on the Oracle FAQ was 12-Aug-2005. Just over two weeks ago.
And that has what to do with the definition of the term "surrogate key"? Go back and read it again. Carefully this time.
Xanderno
Contributor
6040 Points
1200 Posts
Re: Question on design with Identity columns
Aug 30, 2005 05:02 AM|LINK