| Author |
Topic |
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 03:16:21
|
| I have a complicated .xml file which I would like to pull the data out of.There are places where it has <![CDATA[ ]]>And so I can not use sql server 2005 xml task...So I have decided to use openxml to go through the xml contents and pull the data out and place in appropriate tables...Is this a good way to approach my requirement? |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 03:54:40
|
| Any problems in using new xml datatype available in sql 2005? |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 04:10:30
|
| I have placed the xml file into the xml field of a table.This is a sample of the xml file I am using (See below)I am trying to pull all the data out and place into relevant tables...This is the kind of sql I am working on:Can you see what I am doing wrong because I do not get the values out, thanks--sqldeclare @XMLdoc xmldeclare @idoc intdeclare @Form varchar(20)declare @Issuer varchar(100)select @XMLdoc = dataXML from #tblImportXMLexec sp_xml_preparedocument @idoc output, @XMLdocselect @Form = Formfrom openxml (@idoc, '/profilelist/profile', 2) with ( Form varchar(20) )select @Form as Formselect @Issuer = Issuerfrom openxml (@idoc, '/profilelist/profile', 2) with ( Issuer varchar(100) )select @Issuer as Issuer ----xml<profilelist> <profile> <Form>Input</Form> <Issuer>top of the range company</Issuer> <ProgrammeTitle>US$ 20,000,000,000 Global Commercial Paper Programme `</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <CD /> <MTN /> <Option144A /> <IssuerSector> <value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector>.........</profile><profile> <Form>Input</Form> <Issuer>another company</Issuer> <ProgrammeTitle>some value here</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> </profile></profilelist> |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 04:21:19
|
you need only thisdeclare @XMLdoc xmlselect @XMLdoc = '<profilelist><profile><Form>Input</Form> <Issuer>top of the range company</Issuer> <ProgrammeTitle>US$ 20,000,000,000 Global Commercial Paper Programme `</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <CD /> <MTN /> <Option144A /> <IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector></profile><profile><Form>Input</Form> <Issuer>another company</Issuer> <ProgrammeTitle>some value here</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> </profile></profilelist>'select v.p.value('./Form[1]','varchar(50)') AS Form,v.p.value('./Issuer[1]','varchar(50)') AS Issuer,v.p.value('./ProgrammeTitle[1]','varchar(50)') AS PgmTitle,v.p.value('./CombinedProgramme[1]','varchar(50)') AS CombinedPgm,v.p.value('./ECP[1]','varchar(50)') AS ECP,....from @XMLdoc.nodes('/profilelist/profile')v (p)output----------------------------------------------------Form Issuer PgmTitle CombinedPgm ECP---------------------------------------------------------------------Input top of the range company US$ 20,000,000,000 Global Commercial Paper Program No Input another company some value here No |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 04:29:37
|
| What if there is something like:<IssuerCountry> (see blow). There are two values.Thanks<profile><Form>Input</Form> <Issuer>top of the range company</Issuer> <ProgrammeTitle>US$ 20,000,000,000 Global Commercial Paper Programme `</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <CD /> <MTN /> <Option144A /> <IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector></profile><profile><Form>Input</Form> <Issuer>another company</Issuer> <ProgrammeTitle>some value here</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <IssuerSector> <value>Finance Vehicles</value> <value>Communications Media & Technology\Telecommunications</value> </IssuerSector></profile></profilelist> |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 04:56:38
|
you need the xml to be like this. there is / character inside the data which will break the xml, so you need to return them as character data inside CDATA sectionselect @XMLdoc ='<profilelist><profile><Form>Input</Form> <Issuer>top of the range company</Issuer> <ProgrammeTitle>US$ 20,000,000,000 Global Commercial Paper Programme `</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <CD /> <MTN /> <Option144A /> <IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector></profile><profile><Form>Input</Form> <Issuer>another company</Issuer> <ProgrammeTitle>some value here</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <IssuerSector><value>Finance Vehicles</value> <value><![CDATA[Communications Media & Technology\Telecommunications]]></value> </IssuerSector></profile></profilelist>'select v.p.value('./Form[1]','varchar(50)') AS Form,v.p.value('./Issuer[1]','varchar(50)') AS Issuer,v.p.value('./ProgrammeTitle[1]','varchar(50)') AS PgmTitle,v.p.value('./CombinedProgramme[1]','varchar(50)') AS CombinedPgm,v.p.value('./ECP[1]','varchar(50)') AS ECP,v.p.value('./IssuerSector[1]/value[1]','varchar(50)') AS val1,v.p.value('./IssuerSector[1]/value[2]','varchar(50)') AS val2from @XMLdoc.nodes('/profilelist/profile')v (p)output---------------------------------------------------------------Form Issuer PgmTitle CombinedPgm ECP val1 val2--------------------------------------------------------------------Input top of the range company US$ 20,000,000,000 Global Commercial Paper Program No Finance Vehicles super turboInput another company some value here No Finance Vehicles Communications Media & Technology\Telecommunicatio |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 05:02:52
|
| I tried it and did not have to use CDATAThe only issue seems to remain is that what if the number of values are not set.For example there may be two values in one and further down in the xml, there may be three values. The way it is right now as you suggested is to hard code the number of lines i.e. one per value.Is there another way to be flexible?Thanks |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 05:04:51
|
quote: Originally posted by arkiboys I tried it and did not have to use CDATAThe only issue seems to remain is that what if the number of values are not set.For example there may be two values in one and further down in the xml, there may be three values. The way it is right now as you suggested is to hard code the number of lines i.e. one per value.Is there another way to be flexible?Thanks
you mean the number of value nodes can vary dynamically? |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 05:12:39
|
| For example<IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector>may appear as<IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> <value>another value</value> </IssuerSector>OR EVEN<IssuerSector>I am here as just one value</IssuerSector> |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 06:52:12
|
| Any suggestions please? |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 06:56:17
|
quote: Originally posted by arkiboys Any suggestions please?
i think you might require to use FLOWR expressions here.refer below:-http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1309059,00.html |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 07:02:47
|
| There is no mention of FLOWR in that link. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 07:56:59
|
quote: Originally posted by arkiboys There is no mention of FLOWR in that link.
thats method used in it.see the below toohttp://www.stylusstudio.com/xquery_flwor.html |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 08:14:30
|
quote: Originally posted by visakh16 you need the xml to be like this. there is / character inside the data which will break the xml, so you need to return them as character data inside CDATA sectionselect @XMLdoc ='<profilelist><profile><Form>Input</Form> <Issuer>top of the range company</Issuer> <ProgrammeTitle>US$ 20,000,000,000 Global Commercial Paper Programme `</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <CD /> <MTN /> <Option144A /> <IssuerSector><value>Finance Vehicles</value> <value>super turbo</value> </IssuerSector></profile><profile><Form>Input</Form> <Issuer>another company</Issuer> <ProgrammeTitle>some value here</ProgrammeTitle> <CombinedProgramme>No</CombinedProgramme> <ECP /> <IssuerSector><value>Finance Vehicles</value> <value><![CDATA[Communications Media & Technology\Telecommunications]]></value> </IssuerSector></profile></profilelist>'select v.p.value('./Form[1]','varchar(50)') AS Form,v.p.value('./Issuer[1]','varchar(50)') AS Issuer,v.p.value('./ProgrammeTitle[1]','varchar(50)') AS PgmTitle,v.p.value('./CombinedProgramme[1]','varchar(50)') AS CombinedPgm,v.p.value('./ECP[1]','varchar(50)') AS ECP,v.p.value('./IssuerSector[1]/value[1]','varchar(50)') AS val1,v.p.value('./IssuerSector[1]/value[2]','varchar(50)') AS val2from @XMLdoc.nodes('/profilelist/profile')v (p)output---------------------------------------------------------------Form Issuer PgmTitle CombinedPgm ECP val1 val2--------------------------------------------------------------------Input top of the range company US$ 20,000,000,000 Global Commercial Paper Program No Finance Vehicles super turboInput another company some value here No Finance Vehicles Communications Media & Technology\Telecommunicatio
How is it possible to get the content of CDATA out.Such as:<ScannedOCR>- <![CDATA[ 1993 O/C ]]> <attachment>123.PDF</attachment> <attachment>456.pdf</attachment> </ScannedOCR> |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 08:29:00
|
| it doesnt have a node. just comes under root node ScannedOCR? |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 08:33:52
|
| Correct.I tried:v.p.value('./ScannedOCR[1]','varchar(50)') AS ScannedOCRIt does not return anything |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 08:38:09
|
| If there is:<ScannedOCR>- <![CDATA[ testValue ]]> <attachment>123.pdf</attachment> </ScannedOCR>I use:v.p.value('./ScannedOCR [1]','varchar(50)') AS ScannedOCR,v.p.value('./ScannedOCR [1]/value[1]','varchar(50)') AS ScannedOCRIt returns the attachment value BUT NOT the Test value |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 09:36:45
|
what does this return?select v.p.value('.','varchar(500)') from @XMLdoc.nodes('/ScannedOCR')v (p) |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 09:41:22
|
| I see.Is it possible to include this to the existing sql?select v.p.value('.','varchar(500)') from @XMLdoc.nodes('/ScannedOCR')v (p)to select v.p.value('./Form[1]','varchar(50)') AS Form,v.p.value('./Issuer[1]','varchar(50)') AS Issuer,v.p.value('./ProgrammeTitle[1]','varchar(50)') AS PgmTitle,v.p.value('./CombinedProgramme[1]','varchar(50)') AS CombinedPgm,v.p.value('./ECP[1]','varchar(50)') AS ECP,v.p.value('./IssuerSector[1]/value[1]','varchar(50)') AS val1,v.p.value('./IssuerSector[1]/value[2]','varchar(50)') AS val2from @XMLdoc.nodes('/profilelist/profile')v (p) |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-10-13 : 09:42:36
|
| but is ScannedOCR coming inside same xml? |
 |
|
|
arkiboys
Master Smack Fu Yak Hacker
1433 Posts |
Posted - 2008-10-13 : 10:31:29
|
| yes |
 |
|
|
Next Page
|