Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Text file to Sql Server

Author  Topic 

tiger68
Starting Member

6 Posts

Posted - 2004-09-10 : 14:52:29
Hi everyone,

I haven't done much of this, so bear with me please.

We have a few text files (reports that are generated from a legacy system) that we would like to bring data into our SQL server database, the files are in a format like this:

Date 12/03/03 LINE01 Page 1

--------------------------------------------------------------------------------
ITEM LINE NEXT FIRST LAST SHIP
NUMBER SEQ LINE COMMENT COMMENT DATE
--------------------------------------------------------------------------------


h178838 001 0000 2337 2337 00/00/00
A162151 001 0000 0000 0000 00/00/00
A457117 001 0000 0000 0000 00/00/00
C227907 001 0000 2167 2167 00/00/00
Date 12/03/03 LINE01 Page 545

--------------------------------------------------------------------------------
ITEM LINE NEXT FIRST LAST SHIP
NUMBER SEQ LINE COMMENT COMMENT DATE
--------------------------------------------------------------------------------

H585602 001 0000 5400 5400 00/00/00
H585736 001 0000 0000 0000 00/00/00
H585854 001 0000 0000 0000 00/00/00
H585952 001 0000 0000 0000 00/00/00


Well, of course each page has more records, I am using 3,4 records as an example here (and each file has a few hundred pages) and hope you know what I mean. I understand you can use DTS for data import, but how do I get rid of those lines that we don't need (column headers, page numbers, dots, and boxes???), I got errors when I tried it.

(Notes: When I did a preview, I found out the format is not exactly like what we have. In our files, it's more like fixed field other than the labels, and there are some funny squares/boxes at the end of the page numbers and dots)

Any help would be appreciated, thanks in advance.

Kim

X002548
Not Just a Number

15586 Posts

Posted - 2004-09-10 : 15:25:56
Do you have books online?

Your data should go in to a staging table that has 1 column varchar(8000)

You then need to use sql statement to parse out the data.

Text has a very different meaning in sql then it does in access....

I'd bcp the data in, but dts will be easier for you to start...

first create a table in sql server, then dts the whole file to that...

Once you get that done, let us know...

I'll subscribe to this thread



Brett

8-)
Go to Top of Page

tiger68
Starting Member

6 Posts

Posted - 2004-09-10 : 15:46:40
Brett,

First of all, thank you so much for the prompt response.

Yes, I do have books online, and I think I just did what you have instructed - creating a staging table that has 1 column, I then used dts to import the entire file, including all the labels and rows (I chose fixed field, is that correct?), looks like it brought eveything into the table. I guess I am ready the next step...Master.

Student,
Kim :-)
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2004-09-10 : 15:58:45
see
http://www.nigelrivett.net/ImportTextFiles.html

I think that does everything you need.
Just delete the rows you don't want or put a filter on the insert.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-09-10 : 16:13:25
Yeah, what Nigel said....

I guess I can blog my Fixed width file loader routine that removes headers and trailers...It also archives files and looks to a event table for which files to load...

I guess that might might be far enough away from plagarism

Brett

8-)
Go to Top of Page

tiger68
Starting Member

6 Posts

Posted - 2004-09-10 : 16:16:41
NR,

Thanks for the reply, I'm gonna have to spend some time play with this and see how it works (Hope I can figure out each of the steps).

:-)

Kim
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-09-10 : 16:19:14
To recap

1. table varchar(8000)
(Nigel shows a nice method with BULK INSERT and dynamic SQL...DTS is such a pain sime times)
2. Remove the unwanted data...in this case the filename, or a header
3. Move the data to it's final destination...with substring commands

Problems...LARGE Volumes...the INSERTS will be logged

A clean file on the way in is the best...but you have to make do with what you get...

For example though, if a large mainframe process is spitting out the data, it might just be better to ask them to remove it...

Although this sql sh-t is fun....



Brett

8-)
Go to Top of Page

tiger68
Starting Member

6 Posts

Posted - 2004-09-10 : 16:34:35
Brett,

I haven't got everything to work yet, but as I am getting into this, it seems there is a lot you can do. I hope I can find my way out with the help you guys provide, and I sure appreciate every effort.

I will definitely spend some time this weekend trying to make it happen.

Thanks,
Kim
Go to Top of Page

tiger68
Starting Member

6 Posts

Posted - 2004-09-13 : 11:35:02
Hey Brett/Nigel,

I tried your approach and it definitely help me to have a much better understanding how data import works, however, is there any way for me to get rid of those headers/labels and stuff, especially the ones in between? here is how my files look like:



Date 12/03/03 LINE01 Page 1

--------------------------------------------------------------------------------
ITEM LINE NEXT FIRST LAST SHIP
NUMBER SEQ LINE COMMENT COMMENT DATE
--------------------------------------------------------------------------------


h178838 001 0000 2337 2337 00/00/00
A162151 001 0000 0000 0000 00/00/00
A457117 001 0000 0000 0000 00/00/00
C227907 001 0000 2167 2167 00/00/00
Date 12/03/03 LINE01 Page 2

--------------------------------------------------------------------------------
ITEM LINE NEXT FIRST LAST SHIP
NUMBER SEQ LINE COMMENT COMMENT DATE
--------------------------------------------------------------------------------

D857722 001 0000 0000 0000 00/00/00
D857748 001 0000 0000 0000 00/00/00
D857749 001 0000 0000 0000 00/00/00
D857785 001 0000 0000 0000 00/00/00
D857808 001 0000 0000 0000 00/00/00
Date 12/03/03 LINE01 Page 545

--------------------------------------------------------------------------------
ITEM LINE NEXT FIRST LAST SHIP
NUMBER SEQ LINE COMMENT COMMENT DATE
--------------------------------------------------------------------------------

H585602 001 0000 5400 5400 00/00/00
H585736 001 0000 0000 0000 00/00/00
H585854 001 0000 0000 0000 00/00/00
H585952 001 0000 0000 0000 00/00/00
* GRAND TOTAL
* Average
* Record Count = 26680
* Average
*** TOTAL RECORD COUNT =26680


Obviously, there is a lot more data to it, but hope you get the idea (and what are those boxes at the end of the dot lines) ...

Thanks,
Kim
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-09-13 : 12:53:53
Assume the insert is the load


USE Northwind
GO

SET NOCOUNT ON
CREATE TABLE myTable99(Col1 int IDENTITY(1,1), Data varchar(7500))
GO

INSERT INTO myTable99(Data)
SELECT 'Date 12/03/03 LINE01 Page 1 ' UNION ALL
SELECT '' UNION ALL
SELECT '--------------------------------------------------------------------------------' UNION ALL
SELECT 'ITEM LINE NEXT FIRST LAST SHIP ' UNION ALL
SELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALL
SELECT '-------------------------------------------------------------------------------- ' UNION ALL
SELECT '' UNION ALL
SELECT '' UNION ALL
SELECT ' h178838 001 0000 2337 2337 00/00/00' UNION ALL
SELECT 'A162151 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'A457117 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'C227907 001 0000 2167 2167 00/00/00 ' UNION ALL
SELECT 'Date 12/03/03 LINE01 Page 2 ' UNION ALL
SELECT '' UNION ALL
SELECT '--------------------------------------------------------------------------------' UNION ALL
SELECT 'ITEM LINE NEXT FIRST LAST SHIP ' UNION ALL
SELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALL
SELECT '-------------------------------------------------------------------------------- ' UNION ALL
SELECT '' UNION ALL
SELECT 'D857722 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'D857748 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'D857749 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'D857785 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'D857808 001 0000 0000 0000 00/00/00 ' UNION ALL
SELECT 'Date 12/03/03 LINE01 Page 545 ' UNION ALL
SELECT '' UNION ALL
SELECT '--------------------------------------------------------------------------------' UNION ALL
SELECT 'ITEM LINE NEXT FIRST LAST SHIP' UNION ALL
SELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALL
SELECT '-------------------------------------------------------------------------------- ' UNION ALL
SELECT '' UNION ALL
SELECT 'H585602 001 0000 5400 5400 00/00/00' UNION ALL
SELECT 'H585736 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'H585854 001 0000 0000 0000 00/00/00' UNION ALL
SELECT 'H585952 001 0000 0000 0000 00/00/00' UNION ALL
SELECT ' * GRAND TOTAL' UNION ALL
SELECT '* Average' UNION ALL
SELECT '* Record Count = 26680 ' UNION ALL
SELECT '* Average' UNION ALL
SELECT '*** TOTAL RECORD COUNT =26680'
GO

-- Understand, I think we'll be lucky if the order of rows are the same, especially wih a load
SELECT * FROM myTable99 ORDER BY Col1
GO

SELECT *
FROM myTable99
WHERE SUBSTRING(Data,1,4) <> 'Date'
AND SUBSTRING(Data,1,4) <> '----'
AND SUBSTRING(Data,1,4) <> 'ITEM'
AND SUBSTRING(Data,1,4) <> 'NUMB'
AND SUBSTRING(Data,1,4) <> '* Av'
AND SUBSTRING(Data,1,4) <> '* Re'
AND SUBSTRING(Data,1,9) <> '*** TOTAL'
AND SUBSTRING(Data,7,13) <> '* GRAND TOTAL'
AND Data <> ''
AND Data IS NOT NULL
ORDER BY Col1
GO

-- Wondering what I'm doing wrong here...but you ge the idea...
SELECT ASCII(SUBSTRING(Data,1,1)) FROM myTable99 WHERE Col1 = 9
SELECT * FROM myTable99 WHERE ASCII(SUBSTRING(Data,1,1)) = 12
SELECT Col1, REPLACE(Data,ASCII(12),'') FROM myTable99
GO

SET NOCOUNT ON
DROP TABLE myTable99
GO




Brett

8-)
Go to Top of Page

tiger68
Starting Member

6 Posts

Posted - 2004-09-15 : 00:44:30
Brett / Nigel,

Thank you both so much for your help. Finally I am able to get the data into my tables. Basically I was just following the instructions in Nigel's link, create the table, then SP that import and strip the data, and run the SP, all done in Query Analyzer, this is great. I do have a little problem, I have a YES/NO flag field that I define in SQL Server as char(3) because there are 2 possible values YES and NO coming from the text file, when I do the import, there is always a square (box) after the NO value, I assume this is from the text file, is there any way that I can get rid of that? I tried RTrim but it didnt seem to work.

My next step is trying to make this an automated nightly process, I guess I would need to write a script or something and I could just create the SP one time without worrying about recreating it every time, but my knowledge of SQL Server is so limited, would someone please tell me, from your experience, what the best way is to accomplish this?

Thanks,
Kim
Go to Top of Page
   

- Advertisement -