Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Database Structure Question

Author  Topic 

gloosemo
Starting Member

1 Post

Posted - 2013-09-06 : 00:36:00
I don't know the best way to structure my database.

I have a data feed that comes in a CSV format, its financial data about 65,000 different mutual funds with daily prices. In each CSV i get, each mutual fund is listed once with the price for the current day.

What i do first is i import the CSV. But then what i want is a table for each mutual fund with prices and dates inside each table. this means i have to create 65,000 tables in a separate database, then every day read from the imported CSV file and write to the tables. The problem is this ajax query takes almost an hour each time (for each imported CSV or in other words each day)

The logic behind this is when i want to forecast data on an individual fund later, i can just go to that fund's table, query the data out, and do what i want with the prices for that fund over a range of time which is the whole point.

The alternative approach i think is instead of creating 65,000 tables, just leave the imported CSV tables and draw data from there on demand. To do that though i would have to join together all my imported CSV with each query before grabbing fund data, because i'd want data over a course of many months even years for a particular fund.

Which approach is the most efficient at runtime? The way i have it now with the 65,000 tables i have an index on the date column in each case. I'm thinking if i did the other option and just queried the imports, i'd need some sort of dynamic way to index them once they're joined so i can use the date column effectively. I would need that i think. I'd need to have them all indexed by fund name and then by date in order to efficiently grab data from a query that would literally be billions or maybe trillions of rows. Is there a dynamic way to do indexes when you join tables?

Your thoughts are much appreciated. Thanks.

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2013-09-06 : 01:13:04
You need one table with a fund identifier column.

There's probably no performance problem with doing that but if you do encounter one you can consider partitions withing fund or date.
No offence, but some of your comments indicate you need to study relational databases a bit better before you jump into this. It's a really easy problem and a good fit once you know your way around.
Go to Top of Page
   

- Advertisement -