Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 FOR XML Problems

Author  Topic 

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2008-07-22 : 08:00:35
Hi All,

I've written a script to search all the tables containing TEXT data in my database and perform a regular expression on them to find any html tags (using a udf calling 'VBScript.RegExp' to do the pattern matching).

It works great - I get my temp table with the databasename, schema, table, column, Id and text value fine.

However, I want to output this as a file so I can give this to our engineers who are trying to find out the impact of a proposed White-list change.

I though "FOR XML" Perfect! But it doesn't seem to be working as I expected.

I am dumping my temp table using the following...

[CODE]
SELECT * FROM #found FOR XML RAW
[/CODE]

This produces an XML file and most of the rows in my table are there.

However, it seems to have truncated some rows. I get broken XML statements (about 10 out of 500+ rows) that look something like this..

[CODE]
.......;/div>"/><row databaseNam
gt;
<li>Special leave.........
[CODE]

Where most of my XML statements in the resulting file are fine and look like this....

[CODE]
..../><row databaseName="[CS_Flex233]" tableName="[BenefitGroup]" columnName="[holidayText]" Id="36" value="<d.................
[/CODE]

The truncation doesn't happen at the same place every time.

I've set my Maximum Characters Received for XML Data at Unlimited.

Can anyone shed some light on this?

-------------
Charlie

VGuyz
Posting Yak Master

121 Posts

Posted - 2008-07-23 : 11:03:45
follow this link and search with google.

http://www.programmersheaven.com/2/SQL-server-2005-school-lesson-5
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2008-07-23 : 12:03:27
VGuys.

Thanks for your rather abrupt reply.

Unfortunately all the information on that link doesn't apply to me -- its all enhancements included in 2005. If you have some specific information that will help me then I'd be most grateful for it.


To Give a little more information....

I've found that, for some reason in 2000, If I'm selecting a TEXT data-type (that is a container for an HTML document, but probably for any sufficient size of text) for XML output I seem to get newlines inserted into my XML at some point..

For example:

I have a table

CREATE TABLE #found (
[DATABASE] NVARCHAR(255)
, [SCHEMA] NVARCHAR(255)
, [TABLE] NVARCHAR(255)
, [COLUMN] NVARCHAR(255)
, [ID VALUE] INT
, [TEXT] NTEXT
)
[/CODE]

Which is populated with data in the form...

[code]
<DATABASENAME>, dbo, <tableName>, <columnName>, <Id>, <A block of HTML text>


What's happening when I do a (with output to a file)

[CODE]
SELECT * FROM #found FOR XML <AUTO / RAW>
[/CODE]

Is that, at some point in the resulting file a newline is inserted (probably repeatedly after a set number of characters). Because SQL server is converting all of the HTML tags into acceptable XML (for instance <p> becomes &-l-t-;-p-&-g-t-;) (minus the -), if this newline is inserted in the middle of a tag (&l<NEWLINE>t;p>) then it fails parsing later on when the software engineers want to interrogate it.

Does anyone have any specific ideas?

Charlie.
Go to Top of Page
   

- Advertisement -