Please start any new threads on our new site at http://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

Our new SQL Server Forums are live! Come on over! We've restricted the ability to create new threads on these forums.

SQL Server Forums
Profile | Active Topics | Members | Search | Forum FAQ
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Simple Data modelling question....
 Reply to Topic
 Printer Friendly
Previous Page
Author Previous Topic Topic Next Topic
Page: of 3

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/03/2003 :  20:23:38  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
LOL !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

very good, Rob!!!!

(yes, I'm drunk now !)

- Jeff
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 01/03/2003 :  20:26:23  Show Profile  Reply with Quote
So Jeff, what do you think of the one-to-one design? (Use your surrogate if you like)

Advantages? Disadvantages?

DavidM

"SQL-3 is an abomination.."
Go to Top of Page

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/03/2003 :  20:39:42  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
Which one-to-one? for transaction locations?

Sure, why not.

I personally think it makes sense to store it with the transaction, as I said even if 99% don't have a location you still need a join to determine that, but i think it's half of one, six dozen of the other.*

Like I mentioned about 10,000 words ago (long day!), if the location is completely dependant on the individual transaction and not anything else, it probably would make sense to have it in each transaction record, but vertically partitioning this table would probably be fine. And you could always create a view that puts the two transaction tables together so you would never even notice.

quote:

(Use your surrogate if you like)



Somebody didn't read the only post in which I actually discussed surrogate vs natural keys, and whether a "client_id" field had to be surrogate.....


(* Yes, I know I reversed the expression. Something I like to do. most people never notice!)

Edited by - jsmith8858 on 01/03/2003 20:52:07
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 01/03/2003 :  23:47:41  Show Profile  Reply with Quote
Jeff,

I really thought we had closure there but you edited your last post with a reference to your "definition" of a natural key.

Please tell me you made that up and it is not in a text book.

It looks like we will have to go back to the beginning.....

<Rant>
In the beginning, there was no data model. Developers built the first information system on no sound theoritical foundation. The most popular was IBM's IMS. Based on a fuzzy concept of "networks", they made a system that relied on redundant data elements to "help" the system understand the data it contained. These redundant data elemenets (pointers) make traveseral across the data sets extremely comlpicated. It was only later that the hierarchy model of data was defined (Interestingly enough it was Dr Codd who did this while showing the superiority of the Relational Model).

One day a man (Dr E F Codd) is sitting in his darkened IBM research office pondering the meaning of data... A light bulb moment occurs when he realises that data can be treated as a special kind of set (relation) and rules enforced by 2-value logic (Do you know that every data type can be constructed from the boolean data type which no SQL based product as ever had?).. So in 1969 the first tentative paper was written. It was for Internal Eyes Only and the general public had to wait until a year later for the now legendary "first" paper to appear.

One of the outstanding things to come from this paper was the fact that the VALUE of the data was enough to identify it. And thus pointers to data where not needed.

A key (natural, primary, candidate) is one in which some data element (or elements) within the data set itself is the uniquely identifing value according to your functional dependencies. (FD) It does not matter how the key is generated, as long at it is part of the data set it identifies.

So to answer your questions..

A phone number is a natural key.
A street address is a natural key.
A Persons Name could be part of a natural key consisting of time of birth, X, Y and Z coordinates of the location of the birth.
A Business Name could be a natural key depending on your requirements.
State names are natural keys. In realtity a State would be a Domain value and therefore is inherently unique.

There is a big difference between a surrogate key and a natural key and that is MEANING. A surrogate key starts to look supisciously like a pointer and hence we are moving further away from the ideals of the relational model.

I have and will use surrogate keys in various situations. It some instances we simply don't have enough information to build a natural key. The Person is a good example, X, Y and Z cooridinates are not usually known for a persons birth. So in desperation (because it is indeed desperate as we are now adding a meaningless attribute to our set) we add an artifical key. Because of this we find we have to do additional integrity checks to ensure that our data is correct. Of course most practioners are unaware of this fact and "lies" creep into our database.

</Rant>

DavidM

"SQL-3 is an abomination.."
Go to Top of Page

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/04/2003 :  00:03:31  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
So, then, you agree that a client_id or client_number or client_code or whatever you want to call it, which has meaning when assigned as part of a business process which is used to consistently identify a client within a business entity, is a natural key.

I made no attempt to define what a natural key is, just that if you consider any of my examples natural keys -- which we all do -- then also a client_id or code or number or ref# or whatever you wish to call it, which is assigned consistently as part of a standard business process, is a natural key as well. (again, not talking IDENTITY or GUID's here)

No biggie, we are still on the same page! Thanks for the history lesson; just remember, it is now what you KNOW and recite by memory, but what you UNDERSTAND that is important.

Again, if you cannot see the usefulness of creating client codes or id's or numbers or whatever to consistently reference client's within your business enterprise (both in terms of data storage and data entry) then I can't do much to convince you otherwise.

(Do I know that all data is made up of 1 and 0's? *chuckle* )

- Jeff
Go to Top of Page

r937
Posting Yak Master

Canada
112 Posts

Posted - 01/04/2003 :  00:14:43  Show Profile  Visit r937's Homepage  Reply with Quote
david, that wasn't a rant, that was calm, reasoned exposition

i, too, prefer natural keys, but use surrogates whenever warranted

what i object to are DBAs who assign surrogates to every table the modellers send their way, regardless, because that's the way they've always done it

(not that people on this forum would do that)

i've seen an implementation similar to rob's State table, for canadian provinces --
ID(PK) Prov   ProvName
1 ON Ontario
2 PQ Quebec
3 BC British Columnbia

where the DBA creates that gorgeous numeric primary key ("more efficient, dontcha know") and overlooks declaring the unique constraint(s) that would have applied to the Prov code had it been the PK

... with the result that we got rows like
14     PQ      Province du Quebec
16 NT Nunavut
17 NT Northwest Territories


so yeah, rant away, just as natural keys can be abused, surrogates can be too

rudy

Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 01/04/2003 :  00:28:28  Show Profile  Reply with Quote
quote:

So, then, you agree that a client_id or client_number or client_code or whatever you want to call it, which has meaning when assigned as part of a business process which is used to consistently identify a client within a business entity, is a natural key.



No. The important word here is "assigned", you, not the data set, have added an additional attribute and thus in not a natural key.

quote:

I made no attempt to define what a natural key is.



Wheren't you saying that if a human has anything to do with they key then it is a surrogate? You are going round in circles.

quote:

(Do I know that all data is made up of 1 and 0's? *chuckle* )



Wow! A new definition of the "truth" values. I was taught they where "true" and "false". Again you have confused logical versus physical.

Maybe a good text book is in order. May I recommend "An Introduction to Database Systems" by Chris Date (Version 7 is the latest I think)


DavidM

"SQL-3 is an abomination.."
Go to Top of Page

Onamuji
Aged Yak Warrior

USA
504 Posts

Posted - 01/04/2003 :  08:30:28  Show Profile  Visit Onamuji's Homepage  Send Onamuji an AOL message  Reply with Quote
You know, until Jeff made the point about client_id ... I was always looking for a better PK for my Projects table than ProjectID ... well you know what, there isn't one, because of the business rules that define how a project number is created (incrementally). So the IDENTITY column works well for this. Which take put substance to Jeff's argument that a number can be a natural key, and in this case and incremental number (IDENTITY).

Go to Top of Page

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/04/2003 :  11:29:43  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
Well, David, unfortunately once agin you are missing my points completely. We may have to get that puppet show up and running! Those alternate forms of communication seemed be the most effective!

quote:

Wheren't you saying that if a human has anything to do with they key then it is a surrogate? You are going round in circles.



I am saying the 100% complete opposite of what you got out of my post. Try reading it one more time:

quote:

What do you think of this statement:

"A standard business process that assigns a unique, consistently formated client_id for each client results in a natural key."

If you disagree, then:



(at this point, it is up to you if you agree with that first statement or not. What I am saying is that if you DISAGREE with that first statement, if that statement is considered FALSE, then logically you must agree to the following)

quote:

For a table of all phones in the world, are phone numbers natural keys? Somewhere, some business process produced a "next" phone number and assigned it to each phone. So it is natural or not? I guess not.

Are Social Security numbers's natural keys? Unique, but some process came up with them. So, apparently not.

Street addresses? Nope, someone used some process to assign street #'s just based on an ascending or descending even/odd pattern per side of the streets. not natural.

People's Names? Nope, the last name is derived from the parents, and a "business process" (the parent's deciding) determines the full name. And they're not even unique at all!

Company Names? Nope, a "business process" meeting was held to determine that name, and again, they're not unique.

States or Abbreviations? Abbreviations are derived from the state name pretty consistently, so those are out. State names didn't come "naturally" out of the soil, some human got together to come up with those, so state names or abbreviations aren't natural keys either.

And so on.



That's it. All keys come from someone, somewhere. Keys that are thought out, reviewed, agreed upon and then consistently used to reference an item are natural keys. If you disagree, then your name is not a natural key, nor you address, nor phone numbers, etc. There are probably very few natural keys, then, out there if you disagree with what I am trying to get accross (maybe finger prints? snowflake patterns?).

Transactions with identity keys or websites that track users and blindly assign ID's to each visitor -- not natural keys, clearly surrogate. A table of states in which a random ascending int "state_id" is added as the key -- not a natural key.

Still with me?

A client signs a contract with their proposed 4-digit member code on it, it gets reviewed, updated, approved, and eventually posted to a finalize table in your system of clients and from that point on "1234" completely defines that client throughout your business enterprise, clients call up and give the billing staff their code for reference, people code invoices, timesheets, expense reports,etc to that "1234" code -- that is as natural key as can exist, believe it or not.

No different than a phone number assinged to a phone, an address assigned to a location, a name assigned to a person, a social security #, etc.

Readly slowly, don't skip words, try to understand my points, any sentences that are confusing or unclear let me know I'll try to make it simplier and more direct.

- Jeff

Edited by - jsmith8858 on 01/04/2003 14:39:30
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 01/04/2003 :  16:01:54  Show Profile  Reply with Quote
Jeff,

I wish you would practice what you preach and read my posts..
If you did, then you would know that I answered your question. For your reference it is the bit about "assigned".

If you could answer this question then I think I could understand where you are coming from...

What is the difference between "assigned" and "blindly assigned" that makes one a natural key and the other a surrogate?

DavidM

"SQL-3 is an abomination.."
Go to Top of Page

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/04/2003 :  16:45:59  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
Here's what you wrote about assigned:

quote:

No. The important word here is "assigned", you, not the data set, have added an additional attribute and thus in not a natural key.



Let's try this one last time. I feel I have written about 3 or 4 sentences total in this thread, but have re-worded the same thoughts about 50-60 times each.

What is a "client"? what is a "project"? what is an "employee"? what is a "department"? YOU define what those are in your company, your database.

If your company policy is that "all clients shall have a 4-digit numeric code" and there is a process which assigns these codes, that is a 100% natural key. a client isn't just a name or a person or a company; it is a name, a join date, attributes specific to YOUR company, a contract potentially, a collection of invoices and receipts, etc. A client number is a way that you, upon creation of that "client" object, decide to reference that client object.

If your company policy is that "all employee have a 3 digit code" and there is a business process that comes up with those codes, that 3-digit code is a 100% natural key. An employee is not just a person, not just a name; it is a combination of a person, a salary, a job, a title, etc, attributes specific to YOUR company. An employee ID is a way that you, upon the creation of that "employee" object, decide to reference that employee object.

What about non-physical things? Projects? Contracts? What is a natural key for those things? Should the key to the Projects table be "consulting for John Q. Smith in which we provide advice regarding ... etc ... " ??? Should the key for the contract be the entire contract's text? If the company has a standard coding convention for these things, then that code is the natural key.

Now, if we are "blindy assigning" codes to these objects I mentioned, the database just pumps out identities or people can make up there own codes, there's no process, no reviewing, no rhyme or reason, they have no meaning, then they are NOT natural keys. Go ahead and call those surrogate keys if you like, I agree with you 100% on that.

Think of the U.S. telephone directory database table -- not a table in which I am storing phone numbers, but THE U.S. telephone directory table. What is the natural key of that table? Phone number? Well, phone numbers are assigned, methodically and based on a process, and are used to reference individual phones throughout the country. A system was designed to generate these phone numbers, in within the context of a specific system -- the phone system -- these numbers define each phone that is connected.

Just as using a client_id will "lead" you to that client's information in the right systems, using a phone number in the right system (in the phone system) will "lead" you to the correct phone at the other end.

So if one is a natural key, the other must be as well. And I hope we both agree that "phone number" in the database table of U.S. phone numbers, is the perfect natural key. Both are generated in exactly the same way.

I think I understand your confusion in that your text books and "database 101" has a neat little definition of a natural key. You are concerned that our enterprise is assigning or making up client ID's and therefore they don't come naturally with the data from our client. What if, when we sign up that client, we say "We like to give our client's numbers for easy reference. How does 1234 sound?" and they say OK, and we say "OK, right that down on your application, along with the rest of your info", and then the data entry person enters all of the data together from that sheet into the system. If we add those steps to our process, then now that client_id is natural, but before it wasn't?

Please make may day by telling me you are getting this!

Thanks

Edited by - jsmith8858 on 01/04/2003 16:58:21
Go to Top of Page

robvolk
Most Valuable Yak

USA
15732 Posts

Posted - 01/04/2003 :  17:58:58  Show Profile  Visit robvolk's Homepage  Reply with Quote
Sorry to do this to you Jeff, but I disagree with your reasoning as far as "the business process" determining if a key is natural or not. It doesn't.

Employee IDs, great example. Suppose you are employed by a company and get an Employee ID. Then you leave the company. Then you come back, same position as before, same boss, same department, same desk, etc. but are given another, different Employee ID than before. Are you a different person? No. Are your other circumstances different? No. Your name is the same. But you are now identified under another identifier that is GENERATED BY A DATABASE SYSTEM, AND NOTHING ELSE. That is not a natural key at all, because it does not utilize any of your other existing attributes to determine if you are a unique employee or not. The business process that determines your identity in the database does not make its keys natural. Think about it. The same thing about signing up a client: if they don't sign up right away, what happens to 1234? Does it go to someone else? If not, what happens if the same client signs up next week? Do they get 1234, or another number? And does 1234 identify that client in ANOTHER database outside of your company?

In fact, by saying "If your company policy is that "all clients shall have a 4-digit numeric code" you've automatically indicated the key is NOT natural, because changing that business rule would change the key of the table WITHOUT changing any other attributes about the employee. A natural key would not be affected by a (an arbitrary) business rule; it is unique regardless.

In the IRS database, the SSN is a surrogate key; it is generated by their database system. However, in YOUR employee database, if SSN was used as primary key, then it is a NATURAL key: it is an attribute associated with an employee (who exists whether they work for your company or not) and that attribute exists OUTSIDE of your particular database. It is NOT generated by your system in which it is being recorded. You could put this same attribute in two different tables or databases with entirely different keys and the attribute's value would still be the same. The same applies to phone numbers, state codes, zip codes, and the like. The process of how those codes and attributes were initially generated do not affect how they are used in your system; they are merely data that you are storing in a table. These kinds of data are immutable in your system: you cannot arbitrarily change someone's phone number or SSN, because they would no longer be true facts about that person.

I don't mean to be critical, but you're spending a lot of time and effort to explain a mentality that is at odds with the definition of natural keys. I'm not saying you need to pick up a book on elementary database design, but I've never seen your arguments made in ANY database book I've ever read, and my own experience contradicts your statements too. Whatever keys you use in your databases are up to you, but your definition of your examples as "natural" keys is misleading and inaccurate. Sorry.

Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 01/04/2003 :  18:06:12  Show Profile  Reply with Quote
Jeff,

I will make your day and say I understand you. But unfortunately you don't understand me.

I was hoping we wouldn't have to bring metaphysics into it..

A Phone Number is not a Phone Number until it is created. Therefore you don't "assign" Phone Numbers, they are created.

The Client already exists before you "made up" a number.

Please make may day by telling me you understand this important logical difference.


DavidM

"SQL-3 is an abomination.."
Go to Top of Page

jsmith8858
Dr. Cross Join

USA
7423 Posts

Posted - 01/04/2003 :  18:31:45  Show Profile  Visit jsmith8858's Homepage  Reply with Quote
First -- who says a client or an employee Id has to change when someone comes back? Doesn't at my company, by the way -- and it happens here occasionally. Remember, EMPLOYEE ID is not just some SILLY IDENTITY that just COMES OUT WHEN RECORDS ARE ADDED TO A TABLE!!! Arggg! I guess I am still not being clear on that.

Alright, one more try.

We have a table of US states; no data is defined about those states within our database, we are just storing U.S. States and some public domain info about those states (state bird, capital, etc) in a table.

state_id identity field? surrogate key
state_name or state_abbreviation? natural key

With me so far?

Next, a table of names of people. All outside data, their father, their mother, favorite color, age, etc. We are storing nothing but data that lives in the outside world, like states or colors or whatever, with no custom attributes, in our system.

Person_Name ? natural key (though maybe not a good one)
Person_id? surrogate key

Everyone still there?

Ok, we have a table of employees in our company. For each employee, we store their hire_date, dept, pay_rate, title, an employee #, job description, office, and other data like their name, ss#, address etc.

Person_name? natural key, probably not unique though
Person_id (identity)? surrogate key
employee_#? natural

DOn't you see? that employee# is just as important, and EXACTLY IN THE SAME CATEGORY as the other, company-specific information for this employee, such as hire date, pay_rate, title, etc.

It is JUST AN ATTRIBUTE that your COMPANY IS DEFINING about that EMPLOYEE, along with pay rate, dept, etc. The EXACT SAME THING. It is NATURALLY STORED IN THE DATA -- NOT ADDED JUST TO SERVE AS A KEY TO YOUR SILLY LITTLE TABLE, but as an important and USEFUL attribute of each EMPLOYEE, again EXACTLY like job, office, dept, etc. And the company's rules say that this employee# must be UNIQUE, so that is the PERFECT NATURAL KEY for your employee table!

(exhales .... )

- Jeff

Edited by - jsmith8858 on 01/04/2003 18:42:56
Go to Top of Page
Page: of 3 Previous Topic Topic Next Topic  
Previous Page
 Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.1 seconds. Powered By: Snitz Forums 2000