Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Stripping .rtf tags out

Author  Topic 

obezyanka
Starting Member

24 Posts

Posted - 2007-09-25 : 15:58:46
Does anyone know how to get rid of rtf tags that are stored in the table? I need to filter out the data and wondering if there is a utility on the SQL Server that can do it.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-09-25 : 16:02:03
Please post sample data of what you mean.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-25 : 16:03:30
CHARINDEX
PATINDEX
REPLACE
STUFF



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2007-09-26 : 01:58:53
Or, do like I do... beat the heck out of the vendor that provided the data in that format to begin with...

--Jeff Moden
Go to Top of Page

obezyanka
Starting Member

24 Posts

Posted - 2007-09-26 : 09:11:48
Tara, here's the sample:
'{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Tahoma;}}

{\colortbl ;\red0\green0\blue0;}

{\*\generator Riched20 5.50.99.2014;}\viewkind4\uc1\pard\cf1\f0\fs18\lang1033 ~200 LF streambank stabilization to provide structural protection\par

}
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-26 : 09:18:01
And you didn't read about the functions I posted earlier?

UPDATE Table1
SET Col1 = REPLACE(Col1, '{\colortbl ;\red0\green0\blue0;}', '')



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

obezyanka
Starting Member

24 Posts

Posted - 2007-09-26 : 09:19:53
Peso, thank you I did. I've never used these functions before.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-26 : 09:23:58
That's why I suggested them to you.

I think now is the time for you to READ the SQL Server help file called Books Online.
It has everytinhg you need to now to work with SQL Server. It will rarely tell you how to solve problems with it certainly tells you have to use the functions at hand.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 09:47:03
I think it will be a hard struggle indeed to remove the tags from RTF. Lots of them relate to content that needs to be inserted, rather than just to formatting.
Go to Top of Page

obezyanka
Starting Member

24 Posts

Posted - 2007-09-26 : 09:59:16
Kristen, you're right. What Peso posted didn't work because it stripped everything out not just .rtf tags.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-26 : 10:47:10
It was just an EXAMPLE!
You haven't told us exactly what you want to be done yet...



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-09-26 : 11:31:21
If you are using sql 2005, you can use a CLR function to leverage regular expressions to do the replacement, as demonstrated here:

http://weblogs.sqlteam.com/jeffs/archive/2007/04/27/SQL-2005-Regular-Expression-Replace.aspx

However, I do not recommend this approach for a beginner, that's for sure.

An alternative is to write a regular UDF that parses the string and removes the encoding. I am not sure 100% of RTF encoding, is it as simple as removing everything between { and } ? Does some stuff need to be replaced as well, to maintain some formatting? I.e., if {cr} means a line break, would it have to be replaced with char(10)+char(13) ?? What about normal { characters -- are they escaped somehow? Are they any existing libraries that will do this for us? (there is a rtf text box control, it may have a method to retreive the text only.) Things to think about ....

obezyanka -- before any code or function can be written to do what you ask, you need to carefully examine the data and come up with exact specs for what that function needs to do, taking into account the questions I just asked ... only when the spec is clear can the code be written.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-09-26 : 11:36:09
UPDATE: If you are using .NET, the RichTextBox class has a Text property that will return the text of a RTF string.

In .NET, you should be able to do something like this to convert a RTF string to a regular Text string:

RichTextBox rtb = new RichTextBox();
rtb.Rtf = // some value from your table
return rtb.Text; // just the text

You could write some code in .NET to loop through your table and update your data if you have that skill set in-house.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 12:00:32
"is it as simple as removing everything between { and }"

nah, afraid not, although that might do.

The { } can be nested, which makes the Find & Replace a bit more tricky, and any extended characters will use some RTF syntax, which would be find but it will be "optimised" to get included with a bunch of formatting stuff at the same time - so you might get something like:

{Choosefont/InsertCharacter/SetCharacterSpacing/... regular text}

which makes it hard not to miss the /InsertCharacter directive

For me RTF is a write-only language!

Kristen
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 12:01:04
Sorry, meant to say that your .NET RTF control looks promising.

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-26 : 12:17:24
See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=90034



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2007-09-26 : 19:50:22
Ummm... write to text file... import RTF to MS Word, export to text file, import to data base...???

--Jeff Moden
Go to Top of Page
   

- Advertisement -