Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Data Structure question

Author  Topic 

nic
Posting Yak Master

209 Posts

Posted - 2007-10-04 : 15:28:58
Hi,
I have a question on the best way to lay out a table structure. Although I like structure 1 from a relational point of view, from a shear performance point of view structure 2 might be better. I am very comfortable working with 5-10 million record tables but with option 1 the tables could be easily 50 million, so I wanted to get another opinion.

This is releated to insurance rating and vehicle VINs. Each year/make/model has a unique VIN. Each insurance company may assign special "info" about that vehicle. There could be 1-6 pieces of special information per VIN (this is based per insurance company)

Structure 1

Table 1
companyID (int - compound key)
VIN (varchar - compound key)
infoType (varchar - compound key)
infoValue (varchar)

sample records

Company VIN InfoType InfoValue
1 1212121 BI 01
1 1212121 PD A7
1 1212121 COMP 07
2 1212121 BI 04
3 1212121 COMP 32
3 1212121 COLL 3U


Since we are using name/value pairs we have complete flexibility with InfoTypes and don't have null records
Downside, due to the shear number of record/combinations this table could get to be 50 million+ records


Structure 2

Table 2
companyID (int - compound key)
VIN (varchar - compound key)
BI (varchar)
PD (varchar)
COMP (varchar)
COLL (varchar)

sample records

Company VIN BI PD COMP COLL
1 1212121 01 A7 07 null
2 1212121 04 null null null
3 1212121 null null 32 3U


Since we have specific columns for each type of info, the table strucure is more rigid but will be much smaller (2 million records). The problem is we now have multiple columns with lots of nulls (indicating poor design). Another problem is some companies have realy crazy info types. Instead of adding more columns we could have 1-2 additional columns which would store this random info on a per company basis and this column becomes more of a catchholder which is necessary since the InfoTypes are predefined.

In the end, I like the flexibility of structure 1, but due to the size, structure 2 might be better. The queries to this table need to be very quick. From a speed/size point of view (with millions of records), what is better?

What are people's thoughts?

Note: we will need to have joins to this table etc so size could be more of a problem.

Nic

nic
Posting Yak Master

209 Posts

Posted - 2007-10-04 : 17:44:41
On second thought, it is more a possibility between ~4 million records (option 2) and ~20 million records (option 1). 50 million was an exageration. Any takers???

Nic
Go to Top of Page
   

- Advertisement -