Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
sqlfresher2k7
Aged Yak Warrior
623 Posts |
Posted - 2007-11-26 : 21:58:54
|
| There was an existing database design, which contains 4 tables.The three table contains the duplicate data.Table A contains 100 columnsTable B contains 80 columnsTable C contains 50 columnsThe data of 3 tables(1,2,3) will be de-duplicated,cleansed and loaded to Table D which contains 138 columns. My question is Table1,Table2,Table3 doesn't have a primary key defined.Table4 has a primary key.What is impact of not having a primary keys.What are the potential problem to have table4 colums of 138.Performance issues. I proposed to denormalize the database.since they don't want at this stage.I also proposed to have primary keys for the three tables..Please give me the valuable suggestions..... |
|
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2007-11-26 : 23:42:14
|
| A primary key is a unique value that is used to identify a specific record. When you create a primary key, a clustered index is created on that column allowing you to quickly referance that record. Simply having a primary key on a table does not increase performance in any way, unless you are referancing/joining a row using that key value, this is where a lot of people make mistakes in designing their database.As a basic rule, I only define primary keys when that table holds a value that is unique to that specific table, but referanced in other tables.Otherwise you should just create index's on the fields that you will be referancing the records with.For your situation, if there is no Key/ID that exists to identify records, then you need to use what columns you have to try and find the duplicates, but make sure you use one in table 4 if you are loading the records from scratch. It actually seems like a better idea to normalize your data from what I have read, rather then have the 3 tables if they all basically are holding overlapping data. Keep in mind that a identity column is often used as a primary key, but is rarly the best avenue to choose from (It is the easiest and probably quickest though), look into newID(), NewSequentialID(), and/or creating your own primary key that is unique to that record. |
 |
|
|
|
|
|
|
|