Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
tiger68
Starting Member
6 Posts |
Posted - 2004-09-10 : 14:52:29
|
Hi everyone,I haven't done much of this, so bear with me please.We have a few text files (reports that are generated from a legacy system) that we would like to bring data into our SQL server database, the files are in a format like this:Date 12/03/03 LINE01 Page 1--------------------------------------------------------------------------------ITEM LINE NEXT FIRST LAST SHIP NUMBER SEQ LINE COMMENT COMMENT DATE--------------------------------------------------------------------------------h178838 001 0000 2337 2337 00/00/00A162151 001 0000 0000 0000 00/00/00A457117 001 0000 0000 0000 00/00/00C227907 001 0000 2167 2167 00/00/00Date 12/03/03 LINE01 Page 545--------------------------------------------------------------------------------ITEM LINE NEXT FIRST LAST SHIPNUMBER SEQ LINE COMMENT COMMENT DATE--------------------------------------------------------------------------------H585602 001 0000 5400 5400 00/00/00H585736 001 0000 0000 0000 00/00/00H585854 001 0000 0000 0000 00/00/00H585952 001 0000 0000 0000 00/00/00Well, of course each page has more records, I am using 3,4 records as an example here (and each file has a few hundred pages) and hope you know what I mean. I understand you can use DTS for data import, but how do I get rid of those lines that we don't need (column headers, page numbers, dots, and boxes???), I got errors when I tried it. (Notes: When I did a preview, I found out the format is not exactly like what we have. In our files, it's more like fixed field other than the labels, and there are some funny squares/boxes at the end of the page numbers and dots)Any help would be appreciated, thanks in advance.Kim |
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-09-10 : 15:25:56
|
Do you have books online?Your data should go in to a staging table that has 1 column varchar(8000)You then need to use sql statement to parse out the data.Text has a very different meaning in sql then it does in access....I'd bcp the data in, but dts will be easier for you to start...first create a table in sql server, then dts the whole file to that...Once you get that done, let us know...I'll subscribe to this threadBrett8-) |
|
|
tiger68
Starting Member
6 Posts |
Posted - 2004-09-10 : 15:46:40
|
Brett,First of all, thank you so much for the prompt response.Yes, I do have books online, and I think I just did what you have instructed - creating a staging table that has 1 column, I then used dts to import the entire file, including all the labels and rows (I chose fixed field, is that correct?), looks like it brought eveything into the table. I guess I am ready the next step...Master.Student,Kim :-) |
|
|
nr
SQLTeam MVY
12543 Posts |
Posted - 2004-09-10 : 15:58:45
|
seehttp://www.nigelrivett.net/ImportTextFiles.htmlI think that does everything you need.Just delete the rows you don't want or put a filter on the insert.==========================================Cursors are useful if you don't know sql.DTS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-09-10 : 16:13:25
|
Yeah, what Nigel said....I guess I can blog my Fixed width file loader routine that removes headers and trailers...It also archives files and looks to a event table for which files to load...I guess that might might be far enough away from plagarismBrett8-) |
|
|
tiger68
Starting Member
6 Posts |
Posted - 2004-09-10 : 16:16:41
|
NR,Thanks for the reply, I'm gonna have to spend some time play with this and see how it works (Hope I can figure out each of the steps).:-)Kim |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-09-10 : 16:19:14
|
To recap1. table varchar(8000)(Nigel shows a nice method with BULK INSERT and dynamic SQL...DTS is such a pain sime times)2. Remove the unwanted data...in this case the filename, or a header3. Move the data to it's final destination...with substring commandsProblems...LARGE Volumes...the INSERTS will be loggedA clean file on the way in is the best...but you have to make do with what you get...For example though, if a large mainframe process is spitting out the data, it might just be better to ask them to remove it...Although this sql sh-t is fun....Brett8-) |
|
|
tiger68
Starting Member
6 Posts |
Posted - 2004-09-10 : 16:34:35
|
Brett,I haven't got everything to work yet, but as I am getting into this, it seems there is a lot you can do. I hope I can find my way out with the help you guys provide, and I sure appreciate every effort.I will definitely spend some time this weekend trying to make it happen.Thanks,Kim |
|
|
tiger68
Starting Member
6 Posts |
Posted - 2004-09-13 : 11:35:02
|
Hey Brett/Nigel,I tried your approach and it definitely help me to have a much better understanding how data import works, however, is there any way for me to get rid of those headers/labels and stuff, especially the ones in between? here is how my files look like:Date 12/03/03 LINE01 Page 1--------------------------------------------------------------------------------ITEM LINE NEXT FIRST LAST SHIP NUMBER SEQ LINE COMMENT COMMENT DATE--------------------------------------------------------------------------------h178838 001 0000 2337 2337 00/00/00A162151 001 0000 0000 0000 00/00/00A457117 001 0000 0000 0000 00/00/00C227907 001 0000 2167 2167 00/00/00Date 12/03/03 LINE01 Page 2--------------------------------------------------------------------------------ITEM LINE NEXT FIRST LAST SHIP NUMBER SEQ LINE COMMENT COMMENT DATE--------------------------------------------------------------------------------D857722 001 0000 0000 0000 00/00/00D857748 001 0000 0000 0000 00/00/00D857749 001 0000 0000 0000 00/00/00D857785 001 0000 0000 0000 00/00/00D857808 001 0000 0000 0000 00/00/00Date 12/03/03 LINE01 Page 545--------------------------------------------------------------------------------ITEM LINE NEXT FIRST LAST SHIPNUMBER SEQ LINE COMMENT COMMENT DATE--------------------------------------------------------------------------------H585602 001 0000 5400 5400 00/00/00H585736 001 0000 0000 0000 00/00/00H585854 001 0000 0000 0000 00/00/00H585952 001 0000 0000 0000 00/00/00 * GRAND TOTAL* Average* Record Count = 26680* Average*** TOTAL RECORD COUNT =26680 Obviously, there is a lot more data to it, but hope you get the idea (and what are those boxes at the end of the dot lines) ...Thanks,Kim |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2004-09-13 : 12:53:53
|
Assume the insert is the loadUSE NorthwindGOSET NOCOUNT ONCREATE TABLE myTable99(Col1 int IDENTITY(1,1), Data varchar(7500))GOINSERT INTO myTable99(Data)SELECT 'Date 12/03/03 LINE01 Page 1' UNION ALLSELECT '' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT 'ITEM LINE NEXT FIRST LAST SHIP ' UNION ALLSELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT '' UNION ALLSELECT '' UNION ALLSELECT 'h178838 001 0000 2337 2337 00/00/00' UNION ALLSELECT 'A162151 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'A457117 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'C227907 001 0000 2167 2167 00/00/00' UNION ALLSELECT 'Date 12/03/03 LINE01 Page 2' UNION ALLSELECT '' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT 'ITEM LINE NEXT FIRST LAST SHIP ' UNION ALLSELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT '' UNION ALLSELECT 'D857722 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'D857748 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'D857749 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'D857785 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'D857808 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'Date 12/03/03 LINE01 Page 545' UNION ALLSELECT '' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT 'ITEM LINE NEXT FIRST LAST SHIP' UNION ALLSELECT 'NUMBER SEQ LINE COMMENT COMMENT DATE' UNION ALLSELECT '--------------------------------------------------------------------------------' UNION ALLSELECT '' UNION ALLSELECT 'H585602 001 0000 5400 5400 00/00/00' UNION ALLSELECT 'H585736 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'H585854 001 0000 0000 0000 00/00/00' UNION ALLSELECT 'H585952 001 0000 0000 0000 00/00/00' UNION ALLSELECT ' * GRAND TOTAL' UNION ALLSELECT '* Average' UNION ALLSELECT '* Record Count = 26680' UNION ALLSELECT '* Average' UNION ALLSELECT '*** TOTAL RECORD COUNT =26680'GO-- Understand, I think we'll be lucky if the order of rows are the same, especially wih a loadSELECT * FROM myTable99 ORDER BY Col1GOSELECT * FROM myTable99 WHERE SUBSTRING(Data,1,4) <> 'Date' AND SUBSTRING(Data,1,4) <> '----' AND SUBSTRING(Data,1,4) <> 'ITEM' AND SUBSTRING(Data,1,4) <> 'NUMB' AND SUBSTRING(Data,1,4) <> '* Av' AND SUBSTRING(Data,1,4) <> '* Re' AND SUBSTRING(Data,1,9) <> '*** TOTAL' AND SUBSTRING(Data,7,13) <> '* GRAND TOTAL' AND Data <> '' AND Data IS NOT NULLORDER BY Col1GO-- Wondering what I'm doing wrong here...but you ge the idea...SELECT ASCII(SUBSTRING(Data,1,1)) FROM myTable99 WHERE Col1 = 9SELECT * FROM myTable99 WHERE ASCII(SUBSTRING(Data,1,1)) = 12SELECT Col1, REPLACE(Data,ASCII(12),'') FROM myTable99GOSET NOCOUNT ONDROP TABLE myTable99GO Brett8-) |
|
|
tiger68
Starting Member
6 Posts |
Posted - 2004-09-15 : 00:44:30
|
Brett / Nigel,Thank you both so much for your help. Finally I am able to get the data into my tables. Basically I was just following the instructions in Nigel's link, create the table, then SP that import and strip the data, and run the SP, all done in Query Analyzer, this is great. I do have a little problem, I have a YES/NO flag field that I define in SQL Server as char(3) because there are 2 possible values YES and NO coming from the text file, when I do the import, there is always a square (box) after the NO value, I assume this is from the text file, is there any way that I can get rid of that? I tried RTrim but it didnt seem to work.My next step is trying to make this an automated nightly process, I guess I would need to write a script or something and I could just create the SP one time without worrying about recreating it every time, but my knowledge of SQL Server is so limited, would someone please tell me, from your experience, what the best way is to accomplish this?Thanks,Kim |
|
|
|
|
|
|
|