Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 cleaning file data using Tsql

Author  Topic 

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-23 : 05:30:19
Hi All,

I have a file as below



C|D|2008-09-15|A|F|DELIV|A1B1C1
C|D|2008-09-15|A|F|DELIV|
X1
Y1
C1
C|D|2008-09-15|A|F|DELIV|A1A2
A3
C|D|2008-09-15|A|F
|DELIV|PP1PP2PP3



I have to clean this file with such records above to as below
I have taken all the file data as a single column in a table
trying to clean it, can some one help me in this regard

C|D|2008-09-15|A|F|DELIV|A1B1C1
C|D|2008-09-15|A|F|DELIV| X1 Y1 C1
C|D|2008-09-15|A|F|DELIV|A1A2 A3
C|D|2008-09-15|A|F |DELIV|PP1PP2PP3


aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-23 : 05:32:46
What I have done is as below

Take entire file as a single column in a table
delimiter count remains same for all the corrected data, can we do some thing based on delimiter count to correct the file
the acutal file has 3Million records.

Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2010-03-23 : 05:34:21
I would really not do that using SQL.
I would try something like Perl to get the file right and then import it.


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-23 : 07:18:47
Can we write a procedure for this, using loop to loop through all the records one by one and cleaning and putting it into a table
Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-23 : 07:20:00
quote:
Originally posted by webfred

I would really not do that using SQL.
I would try something like Perl to get the file right and then import it.


No, you're never too old to Yak'n'Roll if you're too young to die.



Thanks Webfred,

I am not aware of perl, do you have any such script?

Regards,
aak
Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-24 : 07:12:19
Hi Webfred,

Below is the code which does this.


DECLARE @X as table(rn int identity(1,1),txt varchar(max))

INSERT INTO @X(txt)
select 'ABC|AVC1|1234|' union all
select 'xyz|abc' union all
select 'ABC1|AVC1|12334|' union all
select 'x' union all
select 'y' union all
select 'z' union all
select '|ABC' union all
select 'ABC1|AVC1|12334|xyz|ABC' union all
select 'XYZ'

DECLARE @max bigint,@c bigint
SELECT @max = MAX(rn),@c = 1 from @X
DECLARE @targ varchar(max)

WHILE @max <> 0
BEGIN
SELECT @targ = txt FROM @X WHERE rn = @max

IF (@targ not like '%|%|%|%|%')
BEGIN
UPDATE @X SET txt = txt+@targ WHERE rn = @max - 1
DELETE FROM @X WHERE rn = @max
END
SET @max = @max - 1
END

SELECT * FROM @X
Go to Top of Page
   

- Advertisement -