Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
aiken
Aged Yak Warrior
525 Posts |
Posted - 2003-05-08 : 21:42:28
|
No, I haven't gone valley.I've currently got a user defined function that checks email subject lines to see if they're likely to be spam. It's very straightforward:create function f_spam (@vcSubject varchar(255))returns int ASif @vcSubject like '%your credit%' return 1if @vcSubject like '%viagra%' return 1if @vcSubject like '%hot teens%' return 1return 0 As you can imagine, that function has grown to be about 1000 lines, and it's time to do this right.What I'd like to do is have a table of spam phrases:create table spam_phrases ( phrase varchar(20) PRIMARY KEY CLUSTERED )insert into spam_phrases (phrase) VALUES ('your credit')insert into spam_phrases (phrase) VALUES ('viagra')insert into spam_phrases (phrase) VALUES ('hot teens')And then change my function, or the calling stored procedure, to see if any of the phrases are in the subject line. Is this possible without dynamic SQL? I worry that building a dynamic SQL WHERE clause using the phrases will exceed the 8000 character varchar limit, and I don't think you can execute a text variable.Can anyone see a way to do this? I need to match phrases of arbitrary word lengths, so splitting it into words and then doing word lookups doesn't work. I suppose I could test the whole subject, then every combination of phrases in the subject, but that's *really* ugly.Ideas?Thanks-b |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2003-05-08 : 21:47:00
|
| why not just:if exists (select 1 from spam_phrases where @vcSubject LIKE '%' + phrase + '%') return 1else return 0something like that ....- Jeff |
 |
|
|
aiken
Aged Yak Warrior
525 Posts |
Posted - 2003-05-09 : 19:47:23
|
| Er, yes, that would do it. Thanks! Didn't know you could do concatenation like that.Thanks-b |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2003-05-09 : 21:17:08
|
| BY the way --if you are testing a lot of different subject lines for a bunch of emails at once, it might be faster to do a join right between your two tables (using the LIKE as the operator) and do it all at once rather than using a UDF ... The SQL can probably optimize a join or a correlated subquery better than it can repeatedly calling a UDF.Something like:select email.from, email.to, email.subject, ... etc ... MAX(case when p.phrase is null then 0 else 1 end) as spamFROMemailLEFT OUTER JOINspam_phrases ponemail.subject LIKE '%' + p.phrase + '%'GROUP BY email.from, email.to, email.subject, ..etc ...or you could use the NOT EXISTS() in your a case expression or something like that ... this would be the way to go if you wish to tag or divert a bunch of email at once, instead of calling the UDF each time.but test out both ways.- JeffEdited by - jsmith8858 on 05/09/2003 21:17:31 |
 |
|
|
|
|
|
|
|