Please start any new threads on our new site at http://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

Our new SQL Server Forums are live! Come on over! We've restricted the ability to create new threads on these forums.

SQL Server Forums
Profile | Active Topics | Members | Search | Forum FAQ
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 T-SQL Statement to strip out HTML characters
 Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

ljp099
Yak Posting Veteran

79 Posts

Posted - 07/31/2007 :  16:16:48  Show Profile  Reply with Quote
I have a column which has some data stored with HTML markup. I want to strip out the HTML tags and peform a query like this:

SELECT D.Title from Documents D WHERE Or D.Title Like 'foo'"

So the what I need to do is use a sql function on the column D.Title so that it removes all HTML characters to perform the Like clause on foo.

Thanks for any help

Edited by - spirit1 on 07/31/2007 16:30:56

spirit1
Cybernetic Yak Master

Slovenia
11752 Posts

Posted - 07/31/2007 :  16:18:41  Show Profile  Visit spirit1's Homepage  Reply with Quote
you can use this and remove the tags with a regex
http://www.codeproject.com/cs/library/SqlRegEx.asp


or you can try to put your html in an xml type column and search it's values

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp

Edited by - spirit1 on 07/31/2007 16:19:19
Go to Top of Page

ljp099
Yak Posting Veteran

79 Posts

Posted - 07/31/2007 :  16:28:11  Show Profile  Reply with Quote
Oops. Maybe I posted to the wrong forum. This is using a SQL 2000 db. Can I use similar SQL RegEx in sql server 2000?

thanks for your help
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11752 Posts

Posted - 07/31/2007 :  16:30:47  Show Profile  Visit spirit1's Homepage  Reply with Quote
http://www.sqlteam.com/article/regular-expressions-in-t-sql

note that this is slow compared to any kind of T-SQL

also moved from Transact-SQL (2005) forum. Please post in the correct forum

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp

Edited by - spirit1 on 07/31/2007 16:31:31
Go to Top of Page

ljp099
Yak Posting Veteran

79 Posts

Posted - 07/31/2007 :  17:22:34  Show Profile  Reply with Quote
The function below answers my original question (is there a t-sql function to strip out html)


CREATE FUNCTION [dbo].[udf_StripHTML]
(@HTMLText VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT

SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1

WHILE @Start > 0 AND @End > 0 AND @Length > 0
BEGIN
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END

GO Test above function like this :

SELECT dbo.udf_StripHTML('<b>UDF at sqlteam.com </b><br><br><a href="http://www.sqlteam.com">sqlteam.com</a>'
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11752 Posts

Posted - 07/31/2007 :  17:27:15  Show Profile  Visit spirit1's Homepage  Reply with Quote
or you can do it like that
nice.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page
  Previous Topic Topic Next Topic  
 Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.03 seconds. Powered By: Snitz Forums 2000