Author |
Topic |
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 10:03:04
|
I know that it is much quicker to search a database file using integers as the primary key but should a autonumber be used if the only distinct field value of a record would be a name?Take for example, I have a contractor table that lists all the contractors I deal with. The information of course is the Name, address, city, province/state, etc... I created a autonumber field that is the primary key for the table but I am not sure if this is the right way to go. The reason I did this is because this information is retrieved an awful lot in in my application! How would you layout such a table?Mike B |
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 10:28:39
|
quote: I don't like artificial/contrived/surrogate/uniqueifier/whateveryoucallit keys
Ok, so instead of having an autonumber field, you would prefer to make the contractor name the primary key?Would this create any overhead in time to execute a query worth worrying about?Or to take it a step further:Two table, ContractorInformation, ContractorContactsContractorInformationContractorName (PK) | Address | City | etc....ContractorContactsContractorName | ContactFirst | ContactSurname | PhoneWhere in ContractorContact, the combination of ContractorName, ContactFirst, and ContactSurname would be the PK?This would be the prefered way?Mike BEdited by - MikeB on 03/21/2003 10:31:52 |
 |
|
JimL
SQL Slinging Yak Ranger
1537 Posts |
Posted - 2003-03-21 : 11:13:14
|
Useing the Autonumber Field prevents an accidental change from braking the accociated links in subsiquent Table relationships that are only set at Query level.I.E. if you made a change to your contractors name you would loose the accociated data in the sub-Tables unless you had the relationships Locked in at the table level in SQL. (If you do set locked relationships in SQL you may have problems with replication Merges) |
 |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 11:20:37
|
quote: Useing the Autonumber Field prevents an accidental change from braking the accociated links in subsiquent Table relationships that are only set at Query level.
Very valid point!! So you are pro-Autonumber PK?Mike B |
 |
|
Andraax
Aged Yak Warrior
790 Posts |
Posted - 2003-03-21 : 11:28:16
|
In this case I think identity would be better because:A) Data integrity as specified. What happens if the contractor changes their name?B) Does the name have to be unique?C) If many tables are related to the "main" table, there would be an increase of storage space used, and decreased performance if the name field supports large strings. |
 |
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2003-03-21 : 11:46:07
|
quote: In this case I think identity would be better because:A) Data integrity as specified. What happens if the contractor changes their name?
Wrong. Data integrity has nothing to do with it. If a contractor changes their name they are a new contractor. If that is not a valid model of the business rules, then name is a poor choice for primary key.EDIT: Let me explain this. When I say "Data integrity has nothing to do with it.", by "it" I mean the contractor changing his name. Data integrity has everything to do with this discussion. And I think by not using an identity, you do integrity a favor. Changing a contractors name (or whatever the key should be) should violate a constraint. If you use an identity, you are relying on a program to manage integrity rather than building it into you data model.quote: B) Does the name have to be unique?
Wrong. If name does not have to be unique per the business rules, then name is a poor choice for a primary key. (No one ever said database design was easy...)quote: C) If many tables are related to the "main" table, there would be an increase of storage space used, and decreased performance if the name field supports large strings.
Wrong (maybe) and wrong again (maybe).With an identity you are storing the (pk)identity + (fact)name + (fk)identity ... with the natural key you are storing (pk)name + (fk)name. Depending on the lenght of name the natural key may actually take less space. But even if it doesn't, how much does space cost? And is that cost outweighted by the benefits of using the natural key. Maybe the identity will take more space!And then faster...unless you have tested performance with the natural key vs the identity and can provide me with metrics for this exact environment, I am going to rely on the testing that I have done. I say (in 90some% of cases) any performance difference is nearly un-measurable and certainly un-noticable. You tell me? Will the extra width you add to each row (you identity column) decrease the number or rows per page enough to spread the data out far enough to increase the number of pages fetched per query enough to make a differnence? Maybe the identity will actually be slower!I love this debate.... Jay White{0}Edited by - Page47 on 03/21/2003 12:48:16 |
 |
|
JimL
SQL Slinging Yak Ranger
1537 Posts |
Posted - 2003-03-21 : 12:55:31
|
Jay You Have a vaid Point....... However It is a little easyer to manage when useing a newer system.I used to avoid Autonumber because of file import-export-append problems. Howerver DTS has made that so easy now it is of little concern. Autonumber in Script is still a pain but in access XP its easy as Pie. |
 |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 13:25:20
|
I think I see what Andraax means by the following: quote: A) Data integrity as specified. What happens if the contractor changes their name?
In the following tables:ContractorInformationContractorID (PK, Autonumber) | Name | Address | City | etc....ContractorContactContactID | ContractorID | FirstName | SurnameProjectInformationProjectID | ContractorID |In the above table structures, the field ContractorID being a the primary key of the contractor information allows the contractor to be virtually any one who could be specified by the user in the application. Changing the name does not break the relationship between the three tables. If however, the ContractorsName is used as the PK, then what if the name of the contractor is entered wrong, or the contractor changes? You would have to change the value in all three tables. This is where I think he means the integrity of the data can be lossed. I am not a database guru by any means, but I feel more comfortable with using the Autonumber field then contractor name for example. But I am also not an experienced programmer, so I am trying to learn the ropes of consistent and stable database design!Any additional thoughts?Mike B |
 |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 13:29:49
|
This is starting to feel like I am beating a dead hoarse but....What do I do with the contractor contacts for example?ContractorContactContractorName | ContactFirst | ContactSurname |There can easily be two Tom Jones working for the same contractor (although highly doubtful). How would you fit a primary key into this?Mike B |
 |
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2003-03-21 : 13:49:22
|
Mike I do understand Andraax's argument: I've had this discussion many times.  quote: Changing the name does not break the relationship between the three tables. If however, the ContractorsName is used as the PK, then what if the name of the contractor is entered wrong, or the contractor changes? You would have to change the value in all three tables.
But this is exactly my point ... except I'm coming to the opposite conclusion. If the name is the primary key it can't change. Infact, you should build into your data model such that it can't change. If you've got a recipe and you change the recipe, it's a different recipe. If you've got a phone number and you change your phone number, you've got a different phone number. If you have a contractor and the contractor changes their name, you've got a different contractor. I don't want to go around changing my phone number but then telling people I've got the same old phone number, do I?Well, what if its a mistake? Thats a good point. First, you should have been more careful (or made your app more careful) . But second ... so, you have to change it in three tables. So do it. No biggie, is it?Now, I don't think a name is a good primary key. In that case, I'd look for that is unique and occurs naturally in the data. I know in the US every business has a Tax ID given by the fed...that would work.quote: What do I do with the contractor contacts for example? ContractorContact ContractorName | ContactFirst | ContactSurname |
I've got a wild idea for you. Only store 1 row!! If all you know about a contact is the first and surname, then what's the point of trying to differential between two different Tom Jones's?But you say, "No, dude, I know the email too" ... to which I reply ... "There you go!"Jay White{0} |
 |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 14:02:20
|
quote: Now, I don't think a name is a good primary key. In that case, I'd look for that is unique and occurs naturally in the data. I know in the US every business has a Tax ID given by the fed...that would work.
Well, in the application I am developing, this is not known by the user and will not be known by the user. Besides, it is not important to our business so realistically, it would be just a useful as the Autonumber.quote: Only store 1 row!! If all you know about a contact is the first and surname, then what's the point of trying to differential between two different Tom Jones's?
hehehe, I thought about that right after I pressed "Post New Reply". This is not a business type software, just something for tracking project developement. The name of the contractor is the only uniqueness between records, that as you say, should not change! I don't think it is a good Primary key however, because a name can contain 50 + characters. Doing queries with this would probably be much slower then with an integer such as a primary key?What would you do if you were I Page47? Mike B quote:
quote:
|
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2003-03-21 : 14:18:47
|
Jay -- when you get married and change your name, do you become a different person? This argument tends to go over the heads of people a bit in these forums, but I just want to remind everyone:A "ContractorID" is not always a surrogate key.You may decide, as a business decision (not a database decision!), that it makes sense to assign a unique 4-digit code of some sort to each contractor to print on invoices or for phone call referals or for coding costs or materials or whatever. Pretend you don't even know or care what a database is, it is just your businesses policy to assign a code to each contractor. It's pretty common on the real world to do this, of course, and it was done in the days before databases even!Now, suppose you decide you would like to store all of the info for all of your contractors info into a database. Since we are storing ALL of the info, that includes the unique 4-digit code. Then, that unique compact 4-digit code is a perfect candidate for a natural key for your contractors table, since it will never change, it's small, and unique for each contractor.Having said all that, if your computer is just creating autonumber ContractorID's fields every time a data entry person puts an entry into your "Contractors" table, then that would NOT be a natural key.Makes sense?Please ... if you can't follow the argument (if you are thinking "so you are saying an identity field is a natural key?" or "what if a customer leaves and comes back and the identity field gives them a new number?" or thinking anything about database related identity or autonumber fields) please don't reply .... - JeffEdited by - jsmith8858 on 03/21/2003 14:19:57 |
 |
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2003-03-21 : 14:33:54
|
quote: I don't think it is a good Primary key however, because a name can contain 50 + characters. Doing queries with this would probably be much slower then with an integer such as a primary key?
Reread my post above where I talk about a varchar being slower. I think with a varchar(50) your queries will be fine. You owe it to Codd to at least try the natural key first and if performance is unacceptable, then look to a numeric key. I've heard this myth so many times, but I've never seen good studies to prove it. I've tested very large datasets with different datatyped keys and found I could get acceptable performance out of many things.quote: so realistically, it would be just a useful as the Autonumber.
I disagree here too. A contractor identified by "47" (a auto-generated number) is 100% meaningless. You have no way of verifying it in reality or verifying it in itself. On top of that, it isn't an attribute of the contractor. Go ask your contractor, "What is you Contractor Autonumber?" With a Tax ID there will never be a question about who you are talking about. If you data gets corrupted, or you suspect a problem, I'd bet your contractor will know his Tax ID. Sometimes, in order to model data properly, you need to go outside the given domain ...quote: What would you do if you were I Page47?
I wouldn't use Access for one thing. ... Really though, that's an unfair question. You should be able to infer my persuasion in what I have written. But without having a full understanding of your business rules and requirements, I'd be a fool to tell you "what I would do" ... If, however, you'd like me to gain a full understanding of your domain and answer your question professionally, I'd be happy to do that offline for a competative hourly rate.Jay White{0} |
 |
|
JimL
SQL Slinging Yak Ranger
1537 Posts |
Posted - 2003-03-21 : 14:40:17
|
Jays Tryin to drum up business. I still Like ID Numbers and Sub-Id numbers But Then my tables can go 5 Levels easy and my users never see the Id's because I use Sub forms In access that Autoset the Associated ID number to the Sub-Table . |
 |
|
MikeB
Constraint Violating Yak Guru
387 Posts |
Posted - 2003-03-21 : 14:46:32
|
quote: Please ... if you can't follow the argument (if you are thinking "so you are saying an identity field is a natural key?" or "what if a customer leaves and comes back and the identity field gives them a new number?" or thinking anything about database related identity or autonumber fields) please don't reply ....
Man, you guys are too smart for me! I have a hard time following what you are talking about! lol This is primarily my argument! If I assign the contractor a number to be identified by on all materials related to that contractor then why would it not be a natural key? Sure it does suck to say "Yeah, 1441 is constructing that building" but if it makes for quick reference to all material related and doesn't present possible clashes, then why not! I use the Autonumber field for project errors, notes, anything I believe does not have a good primary key. A contractor name could be good if it was definately unique but it is not. There can be "Good Contractors 4 U" in both Detroit, Michigan and in Orlando, Florida, There could also very well be 2 of the same in the same city.It is not that I am arguing simply to justify my use of Autonumber for primary key, I just want to understand completely the pros/cons of it.Mike B |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2003-03-21 : 15:54:24
|
Mike -- My little plea probably came off the wrong way, wasn't meant for you ... there was another discussion may moons ago on these forums where I repeated the same argument over and over and over and over again and again and it get very tiresome because no one was getting what I was saying.In retrospect, I realize it was probably due to the way I presented it, not being clear enough and all that. So I tried to be as clear as possible in my little post above so hopefully I would not get stuck in the same trap again ...(if you do a search you may find the above-mentioned thread ... it's a pretty good read, has good points made by all, but my repeated posts saying the same thing would probably be a tiresome read !)- Jeff |
 |
|
|