Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
coolerbob
Aged Yak Warrior
841 Posts |
Posted - 2007-04-30 : 08:57:23
|
| I wrote this function:(It takes a string and sums the Scrabble letter values)CREATE function dbo.fc_GetWordHashScore(@strWords text)returns intASbegindeclare @lngReturn As intdeclare @strLetter As varchar(8000)declare @lngValue As intdeclare @cnt int set @lngReturn = 0set @cnt=0 while @cnt<=datalength(@strWords) begin set @lngValue = 0 set @strLetter = substring(@strWords, @cnt, 1) if @strLetter='E' or @strLetter= 'A' or @strLetter= 'I'or @strLetter='O'or @strLetter= 'N'or @strLetter= 'R'or @strLetter= 'T'or @strLetter= 'L'or @strLetter= 'S'or @strLetter= 'U' set @lngValue = 1 if @strLetter='D' or @strLetter= 'G' set @lngValue = 2 if @strLetter='B' or @strLetter= 'C' or @strLetter='M' or @strLetter='P' set @lngValue = 3 if @strLetter='F' or @strLetter='H' or @strLetter='V' or @strLetter='W' or @strLetter='Y' set @lngValue = 4 if @strLetter='K' set @lngValue = 5 if @strLetter='J' or @strLetter='X' set @lngValue = 8 if @strLetter='Q' or @strLetter='Z' set @lngValue = 10 set @lngReturn = @lngReturn + @lngValue set @cnt=@cnt+1 end return @lngReturnendGOIt means I can scan hundreds of thousands of records and check a text column for being similar to a text file I am comparing it too. This can be done by checking for records that have a summed Scrabble score 100 less than or greater than the Scrabble score of my text file.I want to put it in a computed & (here's the thing) persisted column.Trouble is, it's obviously not deterministic.Any suggestions on how I can conquer the "deterministic" hurdle? |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-04-30 : 09:57:38
|
| What is this algorithm being used for?You really want:eeeeeeeeeeto be considered the same or similar to:z?? Both have a "scrabble score" of 10.But, anyway, this function is deterministic. (same input = same output).- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
|
coolerbob
Aged Yak Warrior
841 Posts |
Posted - 2007-04-30 : 10:03:22
|
| Yes it doesn't matter that 10 e's equal 1 z. It doesn't have to be perfectly accurate.I know it's logically deterministic. But in isn't deterministic from the SQL Engine's point of view:Server: Msg 4936, Level 16, State 1, Line 1Computed column 'WordHash' in table 'Doc' cannot be persisted because the column is non-deterministic.Am I breaking either of these requirements? (found in SQL 2000 documentation - I presume it's still a requirement in 2005):The function is schema-boundThe body of the function references no database objects outside the scope of the function |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-04-30 : 10:24:36
|
| Since you are using SQL 2005, just add WITH SCHEMA BINDING to your function definition. Then, you mark your computed column as PERSISTED and you should be good to go.- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
|
coolerbob
Aged Yak Warrior
841 Posts |
Posted - 2007-04-30 : 10:59:54
|
| Thanks Jeff. You're the man. [Sorry about my MS Access Rants. Was having a bad day. :-) ] |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-04-30 : 11:59:06
|
| [code]CREATE FUNCTION dbo.fnGetWordHashScore( @Word VARCHAR(8000))RETURNS INTASBEGIN DECLARE @Sum AS INT, @Char CHAR(1), @Index INT SELECT @Sum = 0, @Index = 0 WHILE @Index <= LEN(@Word) SELECT @Index = @Index + 1, @Char = SUBSTRING(@Word, @Index, 1), @Sum = CASE WHEN @Char IN ('E', 'A', 'I', 'O', 'N', 'R', 'T', 'L', 'S', 'U') THEN 1 WHEN @Char IN ('D', 'G' ) THEN 2 WHEN @Char IN ('B', 'C', 'M', 'P') THEN 3 WHEN @Char IN ('F', 'H', 'V', 'W', 'Y') THEN 4 WHEN @Char IN ('K') THEN 5 WHEN @Char IN ('J', 'X') THEN 8 WHEN @Char IN ('Q', 'Z') THEN 10 ELSE 0 END + @Sum RETURN @SumEND[/code]Peter LarssonHelsingborg, Sweden |
 |
|
|
coolerbob
Aged Yak Warrior
841 Posts |
Posted - 2007-05-01 : 03:48:13
|
| Very tidy Pete, thanks.Problem I have now is that my short script to add the column has been executing for the past 17 hours!:BEGIN TRANSACTIONSET QUOTED_IDENTIFIER ONSET ARITHABORT ONSET NUMERIC_ROUNDABORT OFFSET CONCAT_NULL_YIELDS_NULL ONSET ANSI_NULLS ONSET ANSI_PADDING ONSET ANSI_WARNINGS ONCOMMITBEGIN TRANSACTIONGOALTER TABLE dbo.[Doc] ADD WordHash AS dbo.fc_GetWordHashScore(DocText) PERSISTED GOCOMMITThere are 775 000 records with an average of two printed pages worth of text in the DocText column.Normally I would batch an operation this big. Can you batch the population of a persisted computed column?Could I maybe populate a standard column in bathces first and then turn that column into a persisted computed column? |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-05-01 : 08:50:29
|
| If that's the case, you may want to create a regular column, populate it in batches, but then use triggers to populate the column going forward, instead of a computed column.- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
|
coolerbob
Aged Yak Warrior
841 Posts |
Posted - 2007-05-01 : 09:26:46
|
| Well it completed after 20 hours, so that's a relief.On the production 64-bit server it should be faster.I think the computed column is more scaleable than the trigger approach. Am I right? Or is there no clear difference?Also, what happens if I need to change the logic in the function? Because it's schema-bound, will I have to go through the whole 20 hour population process again? |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2009-12-21 : 07:17:26
|
This calculates the Scrabble Score for the 60,387 words in dbo.WordList table in less than 2 seconds.CREATE TABLE #Lookup ( Letter CHAR(1) PRIMARY KEY CLUSTERED, Items TINYINT NOT NULL, Points SMALLINT NOT NULL )INSERT #Lookup ( Letter, Items, Points )VALUES ('A', 9, 1), ('B', 2, 3), ('C', 2, 3), ('D', 4, 2), ('E', 12, 1), ('F', 2, 4), ('G', 3, 2), ('H', 2, 4), ('I', 9, 1), ('J', 1, 8), ('K', 1, 5), ('L', 4, 1), ('M', 2, 3), ('N', 6, 1), ('O', 8, 1), ('P', 2, 3), ('Q', 1, 10), ('R', 6, 1), ('S', 4, 1), ('T', 6, 1), ('U', 4, 1), ('V', 2, 4), ('W', 2, 4), ('X', 1, 8), ('Y', 2, 4), ('Z', 1, 10)SELECT y.Word, SUM(l.Points * CASE WHEN y.Items <= l.Items THEN y.Items ELSE l.Items END) AS ScoreFROM ( SELECT w.Word, SUBSTRING(w.Word, v.Number, 1) AS Letter, COUNT(*) AS Items FROM dbo.WordList AS w INNER JOIN master..spt_values AS v ON v.Type = 'P' WHERE v.Number BETWEEN 1 AND LEN(w.Word) GROUP BY w.Word, SUBSTRING(w.Word, v.Number, 1) ) AS yINNER JOIN #Lookup AS l ON l.Letter = y.LetterGROUP BY y.WordHAVING SUM(CASE WHEN y.Items > l.Items THEN y.Items - l.Items ELSE 0 END) <= 2ORDER BY SUM(l.Points * CASE WHEN y.Items <= l.Items THEN y.Items ELSE l.Items END) DESC, y.WordDROP TABLE #Lookup N 56°04'39.26"E 12°55'05.63" |
 |
|
|
|
|
|
|
|