Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2006-05-17 : 09:48:23
|
MGD writes "We have taken a straight copy of an OLTP database which has a web front end and we use the copy database for reporting. We have noticed that some of the fields of type long text include a load of html tags ( all manner of tags eg <>). What is the most efficient way of removing the html tags out of a long text field.Usually reporting is summary info but in this particular case it is necessary to report on the detail - dull as dishwater reports but essential for operations. How on earth do you remove all the unwanted characters which might appear anywhere in the text which might be over 10000 characters long?!" |
|
madhivanan
Premature Yak Congratulator
22864 Posts |
Posted - 2006-05-18 : 02:28:16
|
http://www.nigelrivett.net/SQLTsql/ReplaceText.htmlMadhivananFailing to plan is Planning to fail |
|
|
RichT
Starting Member
1 Post |
Posted - 2006-06-08 : 11:31:10
|
How about a function:/*R.Tolliver 6/7/2006Function to strip HTML from Text*/ALTER FUNCTION [dbo].[fn_StripHTML] (@strHTML varchar(300)) RETURNS VarChar(300) AS /*STRIP OUT HTML TAGSTRY AND LEAVE GREATER THAN > and LESS THEN tags aloneassums Greater than and less than will be preceeded byand followed by a space, and that HTML tags will nothave a space immediatly following the <*/BEGIN DECLARE @iTagStart INTDECLARE @iTagEnd INTDECLARE @iLT INTSET @iTagStart = 0WHILE (Charindex('<',@strHTML)) > 0 AND (Charindex('>',@strHTML)) > 0 BEGIN SET @iTagStart = Charindex('<',@strHTML,@iTagStart) IF @iTagStart = 0 BREAK --No tags SET @iLT = Charindex(' < ',@strHTML,@iTagStart) If (@iLT + 1 = @iTagStart) AND (@iLT > 0) --It's a < symbol, not the start of a tag BEGIN SET @iTagStart = @iTagStart + 1 CONTINUE END SET @iTagEnd = Charindex('>',@strHTML,@iTagStart + 1 ) IF @iTagEnd = 0 BREAK --No end tage, get out IF SUBSTRING(@strHTML,@iTagStart + 1, 1) <> ' ' AND (Charindex('>',@strHTML)) > 0 BEGIN SET @strHTML = LEFT(@strHTML,@iTagStart -1) + SUBSTRING ( @strHTML,@iTagEnd + 1, 300 ) END ELSE BEGIN SET @iTagStart = @iTagStart + 1 SET @iTagStart = Charindex('<',@strHTML,@iTagStart) - 1 END IF Charindex('<',@strHTML) = 0 BREAK ELSE CONTINUE END RETURN @strHTML END--test --SELECT dbo.fn_StripHTML('<Font color="Red"><B> Test if 5 > 6 or 3 < 2 or 7 > 8 > And <SPAN> annother tag</Span></B></Font>') |
|
|
|
|
|
|
|