Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
hummer123
Starting Member
5 Posts |
Posted - 2013-04-02 : 02:17:17
|
Hi,I am trying to understand a issue we have with a large database and a query that we need to perform. I am not a database expert as such but know my way around a little ;-)Server. 24 core CPU with 64GB memory and SAS drives.The database has 2 tables.Table 1 – domains_com contains a list of domains with approx. 160 million rowsColumns:DomainidDomainameLockedIgnoredThis table already has an index and a query on any column is instant.Table 2This table contains multiple results of nameserver and whois data for the domains.Columns[domainid][registrar][whois_server][registrar_homepage][date_created][date_expiring][date_updated][nameservers][domain_status][domain_ip][temp_whois_result][last_checked][updated][app_name][crawl_ip][info_registrant][info_admin][info_tech][info_billing]The problem I am having is with a queries where I need to retrieve the temp_whois_result column like below. This column can be quite large up 3k characters.I have tried creating a non clustered index on results_com as follows and already have a index on domains_com.Index Key Columns: domainid,domainameIncluded columns: Last_Checked,App_Name,info_registrant, whois_serverdeclare @hours int set @hours = 3 SELECT TOP 200 domainame,Last_Checked,App_Name,info_registrant, whois_server,temp_whois_result FROM domains_com_joined_results_com WHERE last_checked >= dateadd(hour, -1*@hours, getdate()) order by last_checked descWhen I use “Display estimated execution plan” for this query it tells me I should include the column temp_whois_result . The problem if I include this column is the index grows to over 80gb which I think is far too large.If I don’t include the temp_whois_result in the index, then the query is instant.The thing I don’t understand is that the where condition is on the “last_checked” column, so I thought that the query would find the records in the index with a pointer to the records in the table where the temp_whois_result column would be retrieved.I have read quite a few articles on indexes, clustered v non clustered and am still not sure where I am going wrong.Any help appreciated.ThanksScott |
|
chadmat
The Chadinator
1974 Posts |
Posted - 2013-04-02 : 03:27:31
|
If you don't include everything in the select list, then a seek on a NC index has to do a Bookmark Lookup to the clustered index (Which can be expensive if there are many results). This is why you are getting a suggestion to include the temp_whois_result column.-Chad |
|
|
hummer123
Starting Member
5 Posts |
|
MIK_2008
Master Smack Fu Yak Hacker
1054 Posts |
Posted - 2013-04-02 : 05:40:21
|
I don't see a reason for index implementation. What you said is that "The problem I am having is with a queries where I need to retrieve the temp_whois_result column like below. This column can be quite large up 3k characters." Is it that the column being quite large upto 3k? I dont think its an issue, data can be in MBs and GBs .. Is this query performing bad? I mean takes longer time to execute and bring back the desired data? If so .. is there any index implemented on last_checked column?CheersMIK |
|
|
hummer123
Starting Member
5 Posts |
Posted - 2013-04-02 : 06:51:38
|
quote: Originally posted by MIK_2008
It maybe my lack of understanding of the SQL indexing. I will never perform a select query on the temp_whois_result, I just need to view the data. I was presuming that as the select is on the last_checked column then that is all I needed.The actual size of the index is not really an issue, the problem is that the results table gets 250k inserts per day so the indexes get fragmented very quickly so when I do an index rebuild it takes much longer.The index I have is below:Index Key Columns: domainid,domainameIncluded columns: Last_Checked,App_Name,info_registrant, whois_serverThanksScott |
|
|
MIK_2008
Master Smack Fu Yak Hacker
1054 Posts |
Posted - 2013-04-02 : 08:11:25
|
Well my original question is still unanswered Secondly, I don't think the specified index would be utlized by the query SELECT TOP 200 domainame,Last_Checked,App_Name,info_registrant, whois_server,temp_whois_result FROM domains_com_joined_results_com WHERE last_checked >= dateadd(hour, -1*@hours, getdate()) order by last_checked descEven if you include temp_whois_result column in the index. And yes if you add/include more columns in an index it would have direct impact on the DML transactions.Lastly, check for indexes that are not in-use, remove them at all in order to avoid unnecessary maintanance ... http://www.sqlservercentral.com/blogs/basits-sql-server-tips/2012/06/15/find-unused-indexes-using-sysdm_db_index_usage_stats/CheersMIK |
|
|
|
|
|
|
|