Data Structure question - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Data Structure question

Author

Topic

nic
Posting Yak Master

209 Posts

Posted - 2007-10-04 : 15:28:58

Hi,
I have a question on the best way to lay out a table structure. Although I like structure 1 from a relational point of view, from a shear performance point of view structure 2 might be better. I am very comfortable working with 5-10 million record tables but with option 1 the tables could be easily 50 million, so I wanted to get another opinion.

This is releated to insurance rating and vehicle VINs. Each year/make/model has a unique VIN. Each insurance company may assign special "info" about that vehicle. There could be 1-6 pieces of special information per VIN (this is based per insurance company)


Structure 1

Table 1
companyID (int - compound key)
VIN (varchar - compound key)
infoType (varchar - compound key)
infoValue (varchar)

sample records

Company		VIN	InfoType	InfoValue 
1		1212121 BI		01
1		1212121	PD		A7
1		1212121 COMP		07
2		1212121	BI		04
3		1212121	COMP		32
3		1212121	COLL		3U

Since we are using name/value pairs we have complete flexibility with InfoTypes and don't have null records
Downside, due to the shear number of record/combinations this table could get to be 50 million+ records


Structure 2

Table 2
companyID (int - compound key)
VIN (varchar - compound key)
BI (varchar)
PD (varchar)
COMP (varchar)
COLL (varchar)

sample records

Company		VIN	BI	PD	COMP 	COLL
1		1212121 01	A7	07	null
2		1212121	04	null	null	null		
3		1212121	null	null	32	3U

Since we have specific columns for each type of info, the table strucure is more rigid but will be much smaller (2 million records). The problem is we now have multiple columns with lots of nulls (indicating poor design). Another problem is some companies have realy crazy info types. Instead of adding more columns we could have 1-2 additional columns which would store this random info on a per company basis and this column becomes more of a catchholder which is necessary since the InfoTypes are predefined.

In the end, I like the flexibility of structure 1, but due to the size, structure 2 might be better. The queries to this table need to be very quick. From a speed/size point of view (with millions of records), what is better?

What are people's thoughts?

Note: we will need to have joins to this table etc so size could be more of a problem.

Nic

nic
Posting Yak Master

209 Posts

Posted - 2007-10-04 : 17:44:41

On second thought, it is more a possibility between ~4 million records (option 2) and ~20 million records (option 1). 50 million was an exageration. Any takers???

Nic

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources