Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Jeff Smith's Custom Aggregate functions revisited

Author  Topic 

jbshirk
Starting Member

1 Post

Posted - 2008-06-12 : 11:28:55
I read with great interest Jeff's outstanding article on non-deterministic UDF's for custom aggregate functions:

[url]http://weblogs.sqlteam.com/jeffs/articles/1490.aspx[/url]

This gave me hope that there is a way to do some major data cleanup using a custom aggregate function for finding the "intersection" of a group of strings. following Jeff's example:


declare @a int;
set @a = dbo.MakeObj()

select reordernum, MIN(dbo.StringIntersection(@a, replace(Description,ColorDescrip,''))), dbo.LastResult(@a)
from Item
group by reordernum


Item table sample data:
Lookup Vendor ReorderNum ColorCd ItemDescription ColorDescrip
083JD038LAYBLK 083 JD038LAY BLK W LAYLA DRESS BLACK BLACK
083JD038LAYOCN 083 JD038LAY OCN W LAYLA DRESS BLUE OCN
083JD03BKELTVB 083 JD03BKEL TVB W KELSEY HOODIE BLUE INDIGO
083JD03BMILFLM 083 JD03BMIL FLM W MILLA DRESS PINK RED
083JD03BMILPEG 083 JD03BMIL PEG W MILLA DRESS GREEN GREEN
083JD046CASINH 083 JD046CAS INH W CASSIA DRESS INDIGO INDIGO
083JD04ALILWHT 083 JD04ALIL WHT W LILLY DRESS WHT WHITE
083JD058ROSBLK 083 JD058ROS BLK W ROSIE DRESS BLACK BLACK
083JD05CLOTNTT 083 JD05CLOT NTT W LOTUS DRSS BLUE BLUE
083JD06BWILIND 083 JD06BWIL IND W WILMINGTON DRESS BLUE INDIGO
083JD06BWILTVB 083 JD06BWIL TVB W WILMINGTON DRESS BLUE TEAL
083JD077LALBLK 083 JD077LAL BLK W LALLA DRESS BLACK BLACK
083JD077LALERT 083 JD077LAL ERT W LALLA DRESS EARTH EARTH
083JD077LALFIG 083 JD077LAL FIG W LALLA DRESS FIG PURPLE FIGI BLUE
083JD077LALOCN 083 JD077LAL OCN W LALLA DRESS OCEAN OCN
083JD077LALORC 083 JD077LAL ORC W LALLA DRESS PINK ORCHID


The task is to consolidate these 18 thousand+ items to a more generalized ItemDescription by ReorderNum. Essentially the Color needs to be removed, but using replace() alone won't do because many colors in the description don't match ColorCd or ColorDescrip.

Sample result:
JD038LAY W LAYLA DRESS
JD03BKEL W KELSEY HOODIE BLUE
JD03BMIL W MILLA DRESS
JD046CAS W CASSIA DRESS
JD04ALIL W LILLY DRESS
JD058ROS W ROSIE DRESS
JD05CLOT W LOTUS DRSS
JD06BWIL W WILMINGTON DRESS
JD077LAL W LALLA DRESS

The failure in removing the color from W KELSEY HOODIE BLUE because ColorDescrp doesn't match and JD03BKL is unique is acceptable

I see that after doing Jeff's MakeObj() the secret is setting up the RegExp Pattern in a function like Jeff's Concatenate() UDF.

However I'm not sure I grasp how the Pattern and the SET @result should be done to accomplish a string intersection.

Can some smart guru offer some advice on how to get started?
   

- Advertisement -