| Author |
Topic |
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 11:18:10
|
| Hi,I am working on a translation database and trying to find the occurences of row matches in a table.First of all I need to get the translatorID,date,score,srcLang,tgtLang and subject matches so i can compare later :-------------------------------------------------------------------SELECT TranslatorID,date,score,SrcLangCode,TgtLangCode,Subject FROM reportGROUP BY TranslatorID,date,score,SrcLangCode,TgtLangCode,Subject-------------------------------------------------------------------I get some results like2752 2009-04-17 100 enu tur mar2752 2009-06-01 92 enu abc std2752 2009-06-01 96 enu tur std2752 2009-06-17 100 enu tur marBut now I need to check these results and compare them. So, for example i need to count if there were 2 enu tur mar results that had a score of more than 95 in the last 12 months. There are a number of conditions i need to check.Then i need to checkenu abc stdenu tur stdenu tur maretc. and compare their dates and scores.What is the best approach for running conditions on this result set ? The result set is 60k rows. I was thinking of storing the results in a table variable and looping but it could be slow with so many results and conditions to check.Thank you so much for any tips. I hope i have been clear |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-11-17 : 11:21:37
|
| you need count multiple occurance of same SrcLangCode,TgtLangCode,Subject within dataset and return /exclude them? if yes, it can be done in set based logic, no need of using cursors or loop------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 11:30:17
|
| Thank you for the reply.Yes, I need to count how many translators have translated in say English-> German and Subject: Finance.Then, i need to check how many times each translator has done work in this language pair and subject and whether their score for the work was more than 90 for every time and each time happened in the last 12 months.I hope i am being clear ! |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-11-17 : 11:39:23
|
quote: Originally posted by grimmus Thank you for the reply.Yes, I need to count how many translators have translated in say English-> German and Subject: Finance.Then, i need to check how many times each translator has done work in this language pair and subject and whether their score for the work was more than 90 for every time and each time happened in the last 12 months.I hope i am being clear !
ok thats possiblecan you give which of columns hold these (not fully clear from earlier post)------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 11:39:31
|
| Would you mind providing a simple example of set based logic based on my original select statement above ?Thanks ! |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 11:43:10
|
OK, sure i can provide column namesTranslatorIDSrgLangTgtLangSubjectScoreDateSrcLang,TgtLang,Subject will be the matches. So, if there's x rows of English->German->Finance then i need to check if these x rows fall in the last 12 months and all have a score of 90 or more.I'm trying to be as clear as possible but my head is melted from trying to think the logic for this !Thanks |
 |
|
|
Jeffreys
Starting Member
45 Posts |
Posted - 2011-11-17 : 12:00:59
|
| If I understand you correctly, the best way that I've found to do such a thing is by using row_number() OVER (partition by {unique constraint} ORDER By {?})In your case it would be as follows:SELECT rowN = row_number() over (partition by TranslatorCode,LQIDate,LQX,SrcLangCode,TgtLangCode,Specialization order by TranslatorCode) ,TranslatorID ,date ,score ,SrcLangCode ,TgtLangCode ,Subject FROM report |
 |
|
|
Jeffreys
Starting Member
45 Posts |
Posted - 2011-11-17 : 12:04:21
|
| Then if you wanted to see only a specific number of records in the resultset such as the top 10 of each unique condition then make it a derived table and wrap it as follows:select *FROM ( ... query from my last post)dt1WHERE rowN <= 10 |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-11-17 : 12:13:03
|
quote: Originally posted by grimmus OK, sure i can provide column namesTranslatorIDSrgLangTgtLangSubjectScoreDateSrcLang,TgtLang,Subject will be the matches. So, if there's x rows of English->German->Finance then i need to check if these x rows fall in the last 12 months and all have a score of 90 or more.I'm trying to be as clear as possible but my head is melted from trying to think the logic for this !Thanks 
SELECT *FROM(SELECT COUNT(1) OVER (PARTITION BY SrgLang,TgtLang,Subject) AS Cnt1,SELECT COUNT(CASE WHEN Date BETWEEN DATEADD(mm,DATEDIFF(mm,0,GETDATE())-12,0) ANDDATEADD(mm,DATEDIFF(mm,0,GETDATE())-1,0) AND score >=90 THEN 1 ELSE NULL END) OVER (PARTITION BY SrgLang,TgtLang,Subject) AS Cnt2,*FROM Table)tWHERE Cnt1 = Cnt2 ------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 12:25:08
|
| Thanks, you guys rock !I seem to be getting an error near the 3rd select statement.Trying to figure it out but if you notice anything i'd appreciate letting me know |
 |
|
|
Jeffreys
Starting Member
45 Posts |
Posted - 2011-11-17 : 12:35:41
|
| ... You could use this as the basis for your condition comparisons. Either directly through inline views (derived tables) in a single statement or selecting this into a temp table and running multiple set based queries (for your conditions) against it.I agree with visakh16I'd go with the single statement using something like joining all the sub condition aggregates on the rowN so that you have all aggregates rolled up on the record level (depending on how many you have of course). Even if it was a dozen or so it wouldn't be sucn a chore with only 60K records in the base table.That is, if I understand you correctly that the initial unique condition you gave was the master unique condition and all of the other condition checks are subordinate and within this grouping.Hope I made sense.Jeff |
 |
|
|
Jeffreys
Starting Member
45 Posts |
Posted - 2011-11-17 : 12:45:59
|
| What's the error?Also, if all your sub condition aggregates are subordinate to your first unique constraint set, you could also consider using joins of the sub aggregates, joining on the Rown value. just a thought. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-11-17 : 12:46:46
|
quote: Originally posted by grimmus Thanks, you guys rock !I seem to be getting an error near the 3rd select statement.Trying to figure it out but if you notice anything i'd appreciate letting me know
ops that was a copy paste typoSELECT *FROM(SELECT COUNT(1) OVER (PARTITION BY SrgLang,TgtLang,Subject) AS Cnt1,COUNT(CASE WHEN Date BETWEEN DATEADD(mm,DATEDIFF(mm,0,GETDATE())-12,0) ANDDATEADD(mm,DATEDIFF(mm,0,GETDATE())-1,0) AND score >=90 THEN 1 ELSE NULL END) OVER (PARTITION BY SrgLang,TgtLang,Subject) AS Cnt2,*FROM Table)tWHERE Cnt1 = Cnt2 ------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 12:53:25
|
| This is the SQL i have from visakh16SELECT * FROM(SELECT COUNT(1) OVER (PARTITION BY SrgLangCode,TgtLangCode,Specialization) AS Cnt1,SELECT COUNT(CASE WHEN date BETWEEN DATEADD(mm,DATEDIFF(mm,0,GETDATE())-12,0) ANDDATEADD(mm,DATEDIFF(mm,0,GETDATE())-1,0) AND score >=90 THEN 1 ELSE NULL END) OVER (PARTITION BY SrgLangCode, TgtLangCode, Specialization) AS Cnt2,*FROM report)tWHERE Cnt1 = Cnt2I'm getting the following errorsMsg 156, Level 15, State 1, Line 5Incorrect syntax near the keyword 'SELECT'.Msg 102, Level 15, State 1, Line 10Incorrect syntax near ')'. |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 12:59:03
|
| Thank you visakh16, the new SQL you added is working better |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-11-17 : 13:05:20
|
| welcome------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 13:10:46
|
| okay, for this query i'm only getting 4 results returned :1 1 2011-06-02 93.000000 enu tgl mar1 1 2011-08-17 98.000000 ENU ARA XXX2 2 2011-06-29 95.000000 enu ara erp2 2 2011-06-28 100.000000 enu ara erpI'm a bit unsure what Cnt1 and Cnt2 are doing. Also, the last 2 rows have the same translatorID (not displayed above) and the same language pair/subject.Please excuse my knowledge of SQL ! This is advanced stuff for me. |
 |
|
|
Jeffreys
Starting Member
45 Posts |
Posted - 2011-11-17 : 13:12:37
|
| the other route would be something like...SELECT * FROM (SELECT rowN = row_number() over (partition by TranslatorCode,LQIDate,LQX,SrcLangCode,TgtLangCode,Specialization order by TranslatorCode),TranslatorID,date,score,SrcLangCode,TgtLangCode,Subject,grp1Cnt = grp1.cnt....,grpNcnt = grpn.cntFROM report rptINNER JOIN ( SELECT cnt = count(*) FROM report WHERE [Date] BETWEEN DATEADD(mm,DATEDIFF(mm,0,GETDATE())-12,0) AND DATEADD(mm,DATEDIFF(mm,0,GETDATE())-1,0) AND score >=90 GROUP BY SrgLang,TgtLang,[Subject])grp1 ON grp1.rown = rpt.rown INNER JOIN {condition 2 aggregate}.....INNER JOIN {condition n aggregate})dt1 WHERE rown = 1 |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 13:15:00
|
| Thank you Jeffreys, ill give that a shot now. |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 13:27:45
|
| I'm trying your technique Jeffreys but i getMsg 207, Level 16, State 1, Line 21Invalid column name 'rowN'.Msg 207, Level 16, State 1, Line 21Invalid column name 'rowN'.I guess it's with this line ON grp1.rowN = rpt.rowNis there a rowN associated with grp1 ?Thanks |
 |
|
|
grimmus
Yak Posting Veteran
53 Posts |
Posted - 2011-11-17 : 13:31:24
|
| visakh16, I havent given up on your suggestion !! I'm just trying to work out why the resultset is so small |
 |
|
|
Next Page
|