| Author |
Topic |
|
phenreid
Starting Member
29 Posts |
Posted - 2010-11-18 : 21:38:58
|
| [code]select t.tranid, t.LName, t.L4ssn from TestAdds t,(select Lname, L4SSN from udf_AM_vw('Passed Value')) A where t.lname=a.lname and t.L4ssn=a.l4ssn and LEN(t.l4ssn)=4 --group by t.tranid, t.LName, t.L4ssn[/code]t.tranid = PK; index on all fields on both sides of the implied join.Returns 980 records.Runs in 1 second without the group by. With the group by runs forever... after a five minutes I aborted.I tried the equivalent of above with JOIN -- same result. If I do a make table query with temp table, then the group by works.I tried making the whole thing a DT and then selecting/ grouping on that -- same problem; runs forever.I want to understand not only the solution but what is going on here that would cause a group by to behave this way. I feel like there is a fundamental concept I'm missing.You might ask what the UDF is doing, but the question I meant to ask is why adding the "group by" would cause it to hang -- clearly the UDF isn't the problem since without the group by it runs instantly.Thanks! |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
phenreid
Starting Member
29 Posts |
Posted - 2010-11-19 : 13:06:54
|
| I tried both distinct and group by originally; no difference. (I thought I read they were equivalent and that it was best to avoid the "distinct" word when you could accomplish same with "group by".)udf_AM_vw creates a many-to-many relation. I solved my immediate problem by making a new UDF that gives me a one-to-one match, but I want to understand how I could have accomplished this. I want SQL to first come up with my 980 records THEN aggregate them or show me DISTINCT. If I do this in two steps with a temp table, it's instant. But somehow it insists on doing something more complex no matter how I group the SQL statement. What is the trick/syntax to making it work like two separate steps?I would be happy to post the execution plan but I can't post a graphic on this forum and you probably don't want me to paste the giant XML -- what is the etiquette/method here for posting the execution plan. I would love to learn how to read those properly. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
phenreid
Starting Member
29 Posts |
Posted - 2010-11-19 : 13:55:57
|
|
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
phenreid
Starting Member
29 Posts |
Posted - 2010-11-19 : 14:45:50
|
| Unfortunately, I can't post the UDF on a public forum. It is a select on a view that is PK-to-PK of two large tables and uses the DecryptByPassPhrase function on a number of fields. That function has not caused me any problems or issues up to now. As I mentioned -- without the DISTINCT keyword, select returns 980 records very quickly. Add the DISTINCT keyword and you get the exec plan above.So, it sounds like there is no "silver bullet" way to phrase SQL statements so they will always SELECT the 980 records first THEN apply a DISTINCT to them. I assume the optimizer does its own thing and a UDF table could throw the optimizer off -- so you just have to look at each situation independently. Or, if this is one-time SQL work -- if DISTINCT slow, just use temp tables because for non-experts it can take awhile to find the cause/solution. And, if it is something recurring, then you have to dig into it and find a faster way.Is that basically how the experts handle it?I'm just surprised I couldn't phrase the statement so that it returns the 980 records first (as a DT) then: Select Distinct field1, field2 from (DT returning 980 records). It doesn't seem to care about the phrasing as either way I end up with the clustered index scan above. Only with two completely separate passes with temp tables I don't. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
|