Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 help needed for correcting file data usingsqltable

Author  Topic 

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-18 : 07:24:42
Hi all,

I have loaded a file content in to a single table with single column,the file is having data as below



Record 1. ABC|AVC1|1234|
xyz|abc
Record 2. ABC1|AVC1|12334|
xyz
|ABC
Record 3. ABC1|AVC1|12334|xyz|ABC



I need to correct the above data in the table, the records 3. is the correct format which I want record1 and record2 to be like.

Note Record1..3. text I have added above. ( this is just an e.g. actual table has 5million rec)

Can any one of you help me in correcting this file using plsql or sql query by creating another table from this one, which has corrected data

Logic I am think is

Every records must have 4 delimiter if less then bring the next line record to the previous line e.g.

Record 1. ABC|AVC1|1234|xyz|abc

Even if there are no delimiter on the line then also bring the next line record to the previous line e.g.

Record 2. ABC1|AVC1|12334| xyz
|ABC


Like wise cleaning the entire table and placing it in a new table, to export.

I have sql 2005

Regards,
aak

vaibhavktiwari83
Aged Yak Warrior

843 Posts

Posted - 2010-03-18 : 07:35:47
In record1 and record2 there are 3 and 2 rows respectively or single row data

Vaibhav T
Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-18 : 07:54:08
actual output should have only 3 rows record1 record2 and record3, other rows are just data has come in the next line as a new row.

yes it should be single row data, but has splitted into different rows while extraction, need to correct it at our end using sql or tsql
Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-18 : 08:43:28
Oracle does this as below, can any one help me in converting the below to sql


1 with t as (select 1 as rn, 'ABC|AVC1|1234|' as txt from dual union all
2 select 2, 'xyz|abc' from dual union all
3 select 3, 'ABC1|AVC1|12334|' from dual union all
4 select 4, 'xyz' from dual union all
5 select 5, '|ABC' from dual union all
6 select 6, 'ABC1|AVC1|12334|xyz|ABC' from dual)
7 -- END OF TEST DATA
8 select rn, txt
9 from (
10 select rn, regexp_substr(txt, '^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+') as txt
11 ,case when s < nvl(lag(s) over (order by rn),999) then 1 else 0 end as s
12 from (
13 select rn, replace(txt||'|'||lead(txt) over (order by rn)
14 ||'|'||lead(txt,2) over (order by rn)
15 ||'|'||lead(txt,3) over (order by rn)
16 ||'|'||lead(txt,4) over (order by rn),'||','|') as txt
17 ,mod(sum(length(regexp_replace(txt,'[^|]'))) over (order by rn),5) as s
18 from t
19 )
20 )
21 where s = 1
22* order by rn
SQL> /

RN TXT
---------- ------------------------------
1 ABC|AVC1|1234|xyz|abc
3 ABC1|AVC1|12334|xyz|ABC
6 ABC1|AVC1|12334|xyz|ABC

SQL>



ignoring the WITH clause the query (which is just test data and saves me creating a table on my database) and replace "t" in the main query with the name of table


Go to Top of Page

aakcse
Aged Yak Warrior

570 Posts

Posted - 2010-03-18 : 10:47:04
can any one help me in writing a procedure to carry out this.

Logic would be as below

1. Insert into cleaned table if the no of delimiter are ok ( say per line we have X no of delimiter we can pass this as a in parameter)

a. in loop count the delimiter in line if ok, then count the delimiter in the next line if ok then insert this record into cleaned table


2. if the above condition fails, than bring the next line record to the previous line until the delimiter count are ok

like wise continue for all the records in the table

Regards,
aak.
Go to Top of Page
   

- Advertisement -