SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 Script Library
 Function: HTMLDecode
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

aiken
Aged Yak Warrior

USA
525 Posts

Posted - 10/15/2002 :  15:54:50  Show Profile  Send aiken an ICQ Message  Reply with Quote
This will take character data that's been HTML encoded and put it back to normal text. At present, it gets any and all encoding in the form &#abc; as well as some common HTML-named characters. It would be easy to expand to get any additional HTML names. It also replaces <P> with CRLF, which I needed but which would be easy to remove.

Cheers
-b

CREATE FUNCTION dbo.f_HTMLDecode (@vcWhat varchar(8000))
RETURNS varchar(8000) AS
BEGIN
DECLARE @vcResult varchar(8000)
DECLARE @vcCrLf varchar(2)
DECLARE @siPos smallint,@vcEncoded varchar(7),@siChar smallint

set @vcCrLF=char(13) + char(10)

select @vcResult=@vcWhat
select @siPos=PatIndex('%&#___;%',@vcResult)
WHILE @siPos>0
BEGIN
select @vcEncoded=substring(@vcResult,@siPos,6)
select @siChar=cast(substring(@vcEncoded,3,3) as smallint)
select @vcResult=replace(@vcResult,@vcEncoded,nchar(@siChar))
select @siPos=PatIndex('%&#___;%',@vcResult)
END

select @siPos=PatIndex('%&#____;%',@vcResult)
WHILE @siPos>0
BEGIN
select @vcEncoded=substring(@vcResult,@siPos,7)
select @siChar=cast(substring(@vcEncoded,3,4) as smallint)
select @vcResult=replace(@vcResult,@vcEncoded,nchar(@siChar))
select @siPos=PatIndex('%&#____;%',@vcResult)
END

select @vcResult=replace(@vcResult,'"','"')
select @vcResult=replace(@vcResult,'&','&')
select @vcResult=replace(@vcResult,'©','©')
select @vcResult=replace(@vcResult,'«','«')
select @vcResult=replace(@vcResult,'»','»')
select @vcResult=replace(@vcResult,'¼','¼')
select @vcResult=replace(@vcResult,'½','½')
select @vcResult=replace(@vcResult,'¿','¿')

select @vcResult=replace(@vcResult,'<P>',@vcCrLf)

return @vcResult
END


Arnold Fribble
Yak-finder General

United Kingdom
1961 Posts

Posted - 10/15/2002 :  16:33:25  Show Profile  Reply with Quote
quote:

select @vcResult=replace(@vcResult,'"','"')
select @vcResult=replace(@vcResult,'&','&')
select @vcResult=replace(@vcResult,'©','©')
select @vcResult=replace(@vcResult,'«','«')
select @vcResult=replace(@vcResult,'»','»')
select @vcResult=replace(@vcResult,'¼','¼')
select @vcResult=replace(@vcResult,'½','½')
select @vcResult=replace(@vcResult,'¿','¿')


I think that was supposed to say

select @vcResult=replace(@vcResult,'&quot;','"')
select @vcResult=replace(@vcResult,'&amp;','&')
select @vcResult=replace(@vcResult,'&copy;','©')
select @vcResult=replace(@vcResult,'&laquo;','«')
select @vcResult=replace(@vcResult,'&raquo;','»')
select @vcResult=replace(@vcResult,'&frac14;','¼')
select @vcResult=replace(@vcResult,'&frac12;','½')
select @vcResult=replace(@vcResult,'&iquest;','¿')


Go to Top of Page

SamC
White Water Yakist

USA
3464 Posts

Posted - 10/15/2002 :  17:43:34  Show Profile  Reply with Quote
I'd like to know what kind of problem this is solving?

My guess: Incoming HTML email is stored as plain text in DB?

What else?

Sam

Go to Top of Page

aiken
Aged Yak Warrior

USA
525 Posts

Posted - 10/29/2002 :  14:21:39  Show Profile  Send aiken an ICQ Message  Reply with Quote
Thanks, Arnold. Yes, that was what it was supposed to say.

The problem it's for is indeed to correct HTML that makes its way into the database. This can happen for a variety of reasons, but the most common one I see seems to be related non-english folks; probably do to different encoding or something, their entries in text fields often end up at least partially HTML-encoded (single quotes become &#XXXX; etc.

I've also used this to update an app that used to convert input to HTML before storing it in the db, which is an incredibly bad idea (what if you want to use that data in a non-browser environment?). Change the app to store raw data and HTML-ify it on presentation, use this function to update all of the old data, and voila.

Cheers
-b

Go to Top of Page

SamC
White Water Yakist

USA
3464 Posts

Posted - 11/18/2002 :  20:41:18  Show Profile  Reply with Quote
Hi Aiken,

I gave this function a try with

declare @test varchar (1000)
set @test= '<p>Testing > " .</p>'

Print @test
Print dbo.f_HTMLDecode (@test)

The function didn't convert anything at all. Did I do something wrong or is there a bug?

Sam

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 09/18/2007 :  08:33:03  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Also see http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=88926



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

danp129
Starting Member

1 Posts

Posted - 11/21/2008 :  17:57:09  Show Profile  Reply with Quote
Had to change nchar(@siChar) to char(@siChar) for #153; (™) and othersto work, thanks for sharing this though, it was a time saver.

You can run this to compare the difference between nchar() and char():

print dbo.f_HTMLDecode('
128 € = #128;
129  = #129;
130 ‚ = #130;
131 ƒ = #131;
132 „ = #132;
133 … = #133;
134 † = #134;
135 ‡ = #135;
136 ˆ = #136;
137 ‰ = #137;
138 Š = #138;
139 ‹ = #139;
140 Π= #140;
141  = #141;
142 Ž = #142;
143  = #143;
144  = #144;
145 ‘ = #145;
146 ’ = #146;
147 “ = #147;
148 ” = #148;
149 • = #149;
150 – = #150;
151 — = #151;
152 ˜ = #152;
153 ™ = #153;
154 š = #154;
155 › = #155;
156 œ = #156;
157  = #157;
158 ž = #158;
159 Ÿ = #159;
160   = #160;
161 ¡ = #161;
162 ¢ = #162;
163 £ = #163;
164 ¤ = #164;
165 ¥ = #165;
166 ¦ = #166;
167 § = #167;
168 ¨ = #168;
169 © = #169;
170 ª = #170;
171 « = #171;
172 ¬ = #172;
173 ­ = #173;
174 ® = #174;
175 ¯ = #175;
176 ° = #176;
177 ± = #177;
178 ² = #178;
179 ³ = #179;
180 ´ = #180;
181 µ = #181;
182 ¶ = #182;
183 · = #183;
184 ¸ = #184;
185 ¹ = #185;
186 º = #186;
187 » = #187;
188 ¼ = #188;
189 ½ = #189;
190 ¾ = #190;
191 ¿ = #191;
192 À = #192;
193 Á = #193;
194 Â = #194;
195 Ã = #195;
196 Ä = #196;
197 Å = #197;
198 Æ = #198;
199 Ç = #199;
200 È = #200;
201 É = #201;
202 Ê = #202;
203 Ë = #203;
204 Ì = #204;
205 Í = #205;
206 Î = #206;
207 Ï = #207;
208 Ð = #208;
209 Ñ = #209;
210 Ò = #210;
211 Ó = #211;
212 Ô = #212;
213 Õ = #213;
214 Ö = #214;
215 × = #215;
216 Ø = #216;
217 Ù = #217;
218 Ú = #218;
219 Û = #219;
220 Ü = #220;
221 Ý = #221;
222 Þ = #222;
223 ß = #223;
224 à = #224;
225 á = #225;
226 â = #226;
227 ã = #227;
228 ä = #228;
229 å = #229;
230 æ = #230;
231 ç = #231;
232 è = #232;
233 é = #233;
234 ê = #234;
235 ë = #235;
236 ì = #236;
237 í = #237;
238 î = #238;
239 ï = #239;
240 ð = #240;
241 ñ = #241;
242 ò = #242;
243 ó = #243;
244 ô = #244;
245 õ = #245;
246 ö = #246;
247 ÷ = #247;
248 ø = #248;
249 ù = #249;
250 ú = #250;
251 û = #251;
252 ü = #252;
253 ý = #253;
254 þ = #254;
255 ÿ = #255;
')
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.09 seconds. Powered By: Snitz Forums 2000