
 ideas and hints for huge data needed.


dpais
Yak Posting Veteran

60 Posts

Posted - 2008-08-19 : 10:03:45
Hi

I am trying to insert a huge CSV into SQL Server 2000. The CSV contains 34 million rows with 10 columns.

Here are my questions:

1. My CSV is 2 GB in size, but when I DTS it into SQL Server the backend data file (.mdf) is 15 GB. Please explain? Also, the log file is 4 GB, which is HUGE! Or am I doing something wrong?

2. Is there any other way I can import the data? How different is a bulk insert from a regular DTS import? Will my insertion time improve? Right now it takes 30 minutes to import one 34-million-row file.

3. Running a SELECT statement against a table like this is ridiculous. I tried a simple SELECT to query 80,000 records and it took 4 minutes to run.

4. This might be the bummer of it all: I am running SQL Server on a Pentium 4 3.2 GHz machine with 1 GB of RAM. Sorry, no time to upgrade; I will do that in a few months.

As you may have noticed, I have very little experience as an amateur bulk data handler, so any direction is appreciated.

I have to import 6 other CSVs of more than 34 million records each, so that's a total of about 170 million rows.

Thanks in advance for the help. Negative criticism will be accepted.

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2008-08-19 : 10:09:52
1) Yes, a huge insert will grow the transaction log file. If you don't need point-in-time recovery, switch the database to the simple recovery model.
2) Yes, a bulk insert is faster because it loads the rows in bulk rather than transforming them one at a time.
3) After inserting, add an index if the table doesn't already have one.
4) It looks like you need more RAM.

A sketch of points 1) to 3) follows below.
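Here is a minimal T-SQL sketch of those first three points. The database name MyDatabase, the table dbo.MyBigTable, the column SomeKeyColumn, and the file path are hypothetical placeholders, not names from the thread; substitute your own.

    -- Switch to the simple recovery model so the load doesn't bloat the log
    -- (only if point-in-time recovery isn't needed).
    ALTER DATABASE MyDatabase SET RECOVERY SIMPLE

    -- Load the CSV with BULK INSERT instead of a row-by-row DTS transform.
    BULK INSERT dbo.MyBigTable
    FROM 'C:\data\file1.csv'
    WITH (
        FIELDTERMINATOR = ',',  -- column delimiter in the CSV
        ROWTERMINATOR = '\n',   -- row delimiter
        FIRSTROW = 2,           -- skip a header row, if the file has one
        TABLOCK                 -- table lock allows a minimally logged load
    )

    -- Build the index after the load; creating it first slows the insert down.
    CREATE CLUSTERED INDEX IX_MyBigTable_SomeKey ON dbo.MyBigTable (SomeKeyColumn)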


rmiao
Master Smack Fu Yak Hacker

7266 Posts

Posted - 2008-08-19 : 23:13:34
1. If the table has several indexes, 2 GB of raw data can produce a much larger table. You can control log size by loading the data in smaller batches and backing up the log between batches.
2. Adjust ROWS_PER_BATCH in the bulk insert; you will need to find the right value to speed up the process.
3. A SELECT without a WHERE clause results in a table scan, even if the table has an index.
4. Keep in mind that SQL Server 2000 Standard Edition can only use 2 GB of memory.

A sketch of points 1 to 3 follows below.
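To illustrate those first three points, here is a rough sketch. The table dbo.MyBigTable, the LoadDate column, and the file and backup paths are again hypothetical placeholders.

    -- Commit the load in batches so the log can be backed up and reused as you go;
    -- tune BATCHSIZE (or the ROWS_PER_BATCH hint) for your hardware.
    BULK INSERT dbo.MyBigTable
    FROM 'C:\data\file2.csv'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        BATCHSIZE = 500000      -- each 500,000-row batch commits as its own transaction
    )

    -- Under the full recovery model, back up the log between batches or files
    -- so it doesn't keep growing:
    BACKUP LOG MyDatabase TO DISK = 'D:\backups\MyDatabase_log.bak'

    -- Index the column you filter on, then query with a WHERE clause so the
    -- engine can seek instead of scanning all 34 million rows:
    CREATE NONCLUSTERED INDEX IX_MyBigTable_LoadDate ON dbo.MyBigTable (LoadDate)

    SELECT col1, col2
    FROM dbo.MyBigTable
    WHERE LoadDate >= '20080801' AND LoadDate < '20080901'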