Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 query optimization having 10 million rows

Author  Topic 

nextaxtion
Yak Posting Veteran

54 Posts

Posted - 2015-02-27 : 13:47:35
i have a following table

table name : emp_master

empid efname emname elamane efathername emothername deptno edob edoj createdby updateby lastupdatedatetime lastactionperformed


empid is primarykey.

this table contains 20million of records and i want to fire following query on this to get employye all data where eployee is more than 10 year old


select empid ,efname, emname, elamane, efathername, emothername, deptno ,edob ,edoj ,createdby, updateby, lastupdatedatetime ,lastactionperformed
from emp_master
where year(doj)+10 > year(getadate())


this will return approx 10 million rows and taking 18 mins.

Please help me tune this query what approaches should i take to reduce the time of execution.





prithvi nath pandey

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2015-02-27 : 13:49:19
What are you going to do with 10 million rows returned? You can't expect a query to be fast when it returns that much data.

What is this: year(doj)? Is it a function? Are you using Microsoft SQL Server? I'm guessing no, since that syntax isn't valid and would need to include the schema.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

nextaxtion
Yak Posting Veteran

54 Posts

Posted - 2015-02-27 : 14:32:43
hi tara ,

ur guess is wrong.
year is system function and return year of provided date, here i am trying to get year of edoj column that contain date of joining of employee.

I am using sql server 2012 and the function year is all valid in that please check yourself.
The query will take more time but i need to tune it so that i will take less time as compare to 18 mins.

Please check.

prithvi nath pandey
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2015-02-27 : 15:34:31
D'oh! I know year is a system function, but for some reason I read it as a different word. I'm used to seeing system functions in upper case in code and just read your code wrong (and then I continued it with the copy/paste).

To answer your question, make sure doj is indexed and then use this instead: WHERE doj <= DATEADD(yy,-10,GETDATE()). The key is isolating doj so that it doesn't have functions that way an index can be used. If doj is not indexed, you need to add one.


Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

nextaxtion
Yak Posting Veteran

54 Posts

Posted - 2015-02-28 : 14:33:58
thanks for reply.

this was asked in some interview and i replied that there should be index on doj column but interview told me that there is still some scope to improve the performance.

Request you to please provide me ur outlook regarding this.

Also please suggest me from where should i start to learn optimization and tuning techniques.request you to please provide me some resource regarding this.



prithvi nath pandey
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2015-02-28 : 15:00:59
Tara already told you the answer. Free the DOJ column from all calculations so that the index can be used.


Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2015-02-28 : 17:03:07
Even with the index and proper WHERE condition, the amount of data returned (10 million) vs. the amount in the table (20 million) will probably push the query optimizer to perform a table or clustered index scan. As Tara stated, you cannot expect such a query to return all results quickly. By my estimation, each row will be a minimum of 150 bytes, x 10 million rows = 1.5 GB of data, and will probably be more. That's way too much data to return to a client efficiently. The only scenario where this makes sense is exporting data to another system, via an ETL process or to a file for similar processing.
Go to Top of Page
   

- Advertisement -