| Author |
Topic  |
|
|
vikas71082
Starting Member
India
2 Posts |
Posted - 05/14/2012 : 04:34:13
|
Hi
I am trying to tune the below query. In execution plan, max cost is of clustered index scan (48%). Query is taking 25 mins & selecting matching rows from millions of records from table. is there anyway to rewrite the query & tune it to perform faster.
SELECT a.market, a.advertiser_id, io.cuid, a.sc_id, a.io_id, a.package_id, a.publisher_id, a.inventory_source_id, a.line_item_id, a.ad_id, a.ad_layout_id, a.platform, CAST(a.impression_date AS date) AS impression_date, sum(a.impressions) as impressions , sum(a.clicks) as clicks, sum(a.view_based_actions)as view_based_actions , sum(a.click_based_actions) as click_based_actions, sum(a.contract_actions) as contract_actions , sum(a.platform_fee) as platform_fee, sum(a.data_fee) as data_fee, sum(a.total_fees) as total_fees, sum(a.mac) as mac , sum(a.cadreon_fee_calc) as cadreon_fee_calc, a.reporting_end_date , a.ad_click_url , sum(a.ad_exchange_fee) as ad_exchange_fee, sum(a.fixed_fee_adj) as fixed_fee_adj , sum(a.platform_fee_shortfall) as platform_fee_shortfall, CAST(io.io_start_date AS date) AS IO_Start_Date , CAST(io.io_end_date AS date) AS IO_End_Date , sum(DATEDIFF([day] , io.io_start_date, io.io_end_date) + 1) AS 'NO OF Days' , CASE WHEN DATEDIFF([day], io.io_start_date, a.impression_date) < 0 THEN 0 ELSE sum(DATEDIFF([day], io.io_start_date, a.impression_date) + 1) END AS no_of_days_imp , sum(1) AS row_cnt , sum(io_imp.tot_io_imp) as tot_io_imp , sum(io_imp_dt.tot_io_imp_dt) as tot_io_imp_dt, sum(io_imp.tot_io_actions) as tot_io_actions , sum(io_imp.tot_io_clicks) as tot_io_clicks, MONTH(a.reporting_end_date) AS Month, YEAR(a.reporting_end_date) AS Year, a.ad_format, sum(a.ad_completions_25) as ad_completions_25 , sum(a.ad_completions_50) as ad_completions_50, sum(a.ad_completions_75) as ad_completions_75 , sum(a.ad_completions_100) as ad_completions_100, sum(a.adjustment_amount) as adjustment_amount , a.hdr_id FROM dbo.ref_sub_campaign AS sc with (nolock) INNER JOIN dbo.ref_insertion_order AS io with (nolock) ON sc.cuid = io.cuid INNER JOIN dbo.site_performance_data AS a with (nolock)ON sc.sc_id = a.sc_id AND sc.dsp_name = a.platform INNER JOIN (SELECT rsc.cuid, s.impression_date, SUM(s.impressions) AS tot_io_imp_dt FROM iv_performance_test AS s with (nolock) INNER JOIN dbo.ref_sub_campaign AS rsc ON s.sc_id = rsc.sc_id AND s.platform = rsc.dsp_name GROUP BY rsc.cuid, s.impression_date) AS io_imp_dt ON io_imp_dt.cuid = io.cuid AND io_imp_dt.impression_date = a.impression_date INNER JOIN (SELECT rsc.cuid, SUM(s.impressions) AS tot_io_imp, SUM(s.clicks) AS tot_io_clicks, SUM(s.contract_actions) AS tot_io_actions FROM iv_performance_test AS s with (nolock) INNER JOIN dbo.ref_sub_campaign AS rsc ON s.sc_id = rsc.sc_id AND s.platform = rsc.dsp_name GROUP BY rsc.cuid) AS io_imp ON io_imp.cuid = io.cuid group by a.market,a.advertiser_id, io.cuid, a.sc_id, a.io_id, a.package_id, a.publisher_id , a.inventory_source_id, a.line_item_id, a.ad_id, a.ad_layout_id, a.platform, a.reporting_end_date,a.ad_click_url,io.io_start_date,io.io_end_date,a.impression_date ,a.ad_format,a.hdr_id
Really appreciates your help. Thanks in advance |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29138 Posts |
Posted - 05/14/2012 : 05:21:30
|
Check the execution plan and look for any kind of LOOKUPs (RID, BOOKMARK). Also check if implicit conversion is happening somewhere.
Check to see which indexes are used or not.
N 56°04'39.26" E 12°55'05.63" |
 |
|
|
Sachin.Nand
Flowing Fount of Yak Knowledge
2937 Posts |
Posted - 05/14/2012 : 15:38:31
|
Never heard implicit conversions being a problem.
After Monday and Tuesday even the calendar says W T F .... |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29138 Posts |
Posted - 05/14/2012 : 16:55:04
|
You must be kidding me? The answer is data type presedence.-- Create a table with two columns with equal content but different data types
CREATE TABLE dbo.ImplicitConversion
(
RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
String VARCHAR(11) NOT NULL,
Number INT NOT NULL
)
-- Populate the table with a preset value
INSERT dbo.ImplicitConversion
(
String,
Number
)
VALUES ('123', 123)
-- Populate the table with 6M+ random values
INSERT dbo.ImplicitConversion
(
String,
Number
)
SELECT Random AS String,
Random AS Number
FROM (
SELECT CHECKSUM(NEWID()) AS Random
FROM master.dbo.spt_values AS v
CROSS JOIN master.dbo.spt_values AS w
) AS d
-- Create proper indexes
CREATE NONCLUSTERED INDEX IX_String ON dbo.ImplicitConversion (String)
CREATE NONCLUSTERED INDEX IX_Number ON dbo.ImplicitConversion (Number)
-- Get row with proper data type (SEEK)
SELECT Number
FROM dbo.ImplicitConversion
WHERE Number = 123
-- Get row with improper data type (SEEK)
SELECT Number
FROM dbo.ImplicitConversion
WHERE Number = '123'
-- Get row with proper data type (SEEK)
SELECT String
FROM dbo.ImplicitConversion
WHERE String = '123'
-- Get row with improper data type (SCAN)
SELECT String
FROM dbo.ImplicitConversion
WHERE String = 123
-- Clean up
DROP TABLE dbo.ImplicitConversion
N 56°04'39.26" E 12°55'05.63" |
Edited by - SwePeso on 05/14/2012 16:56:53 |
 |
|
|
Sachin.Nand
Flowing Fount of Yak Knowledge
2937 Posts |
Posted - 05/14/2012 : 17:15:33
|
Is that what is called as implicit conversion ? Now you are kidding me..
After Monday and Tuesday even the calendar says W T F .... |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29138 Posts |
Posted - 05/14/2012 : 17:24:58
|
The implicit conversion occurs at the fourth select statements, due to datatype presedence. The WHERE clause decides that the column is a string and the argument is a number, and number has higher presedence so all rows for this column gets converted into number too. And as a number, the index cannot be used because the index has strings stored.
Look at the execution plans (last 4) and check the PREDICATE. For two of them you get IMPLICIT_CONVERSION but only one of these two turns into a scan instead of the expected SEEK.
N 56°04'39.26" E 12°55'05.63" |
Edited by - SwePeso on 05/14/2012 17:27:12 |
 |
|
|
Sachin.Nand
Flowing Fount of Yak Knowledge
2937 Posts |
Posted - 05/14/2012 : 17:31:40
|
quote: Originally posted by SwePeso
The implicit conversion occurs at the fourth select statements, due to datatype presedence. The WHERE clause decides that the column is a string and the argument is a number, and number has higher presedence so all rows for this column gets converted into number too.
Look at the execution plans (last 4) and check the PREDICATE. For two of them you get IMPLICIT_CONVERSION but only one of these two turns into a scan instead of the expected SEEK.
N 56°04'39.26" E 12°55'05.63"
So basically it is because of data type precedence and not true for all types of implicit conversions.
Never heard or saw when an implicit conversion from smallint to int causing any performance problem.
After Monday and Tuesday even the calendar says W T F .... |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29138 Posts |
Posted - 05/14/2012 : 18:24:32
|
No, not between TINYINT/SMALLINT/INT/BIGINT. However, between FLOAT and integers.
The reason I write implicit conversion in the first place, is that we don't know the data types for all columns used in the joins.
N 56°04'39.26" E 12°55'05.63" |
 |
|
|
vikas71082
Starting Member
India
2 Posts |
Posted - 05/14/2012 : 22:57:02
|
| Checked the execution plan and look for any kind of LOOKUPs (RID, BOOKMARK. There are lookup's but the cost is 0%. How to tune this query & which last 4 select statement you are talking about. there are only 3 nested select statement used in inner joins. Any quick help is really appreciated |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29138 Posts |
Posted - 05/15/2012 : 01:06:08
|
Can you post the actual execution plan?
N 56°04'39.26" E 12°55'05.63" |
 |
|
|
Transact Charlie
Flowing Fount of Yak Knowledge
United Kingdom
3425 Posts |
Posted - 05/15/2012 : 09:08:05
|
I notice that you are doing 2 very similar derived tables:
INNER JOIN (
SELECT
rsc.cuid
, s.impression_date
, SUM(s.impressions) AS tot_io_imp_dt
FROM
iv_performance_test AS s with (nolock)
INNER JOIN dbo.ref_sub_campaign AS rsc ON s.sc_id = rsc.sc_id AND s.platform = rsc.dsp_name
GROUP BY
rsc.cuid
, s.impression_date
)
AS io_imp_dt ON
io_imp_dt.cuid = io.cuid
AND io_imp_dt.impression_date = a.impression_date
INNER JOIN (
SELECT
rsc.cuid
, SUM(s.impressions) AS tot_io_imp
, SUM(s.clicks) AS tot_io_clicks
, SUM(s.contract_actions) AS tot_io_actions
FROM
iv_performance_test AS s with (nolock)
INNER JOIN dbo.ref_sub_campaign AS rsc ON s.sc_id = rsc.sc_id AND s.platform = rsc.dsp_name
GROUP BY
rsc.cuid
)
AS io_imp ON io_imp.cuid = io.cuid
(I put is some formatting)
The second derived table can be derived entirely from the first derived table. ( all it does is roll up the other information disregarding the date level. )
You could probably rewrite that bad boy. Maybe using GROUPING SETS or a similar technique. Or by just calculating the values for
tot_io_imp
tot_io_clicks
tot_io_actions
in whatever is parsing the output to provide the rollup for you.
Assuming it's an expensive data set to generate that should give a pretty good boost.
Apart from that -- as Peso says -- please provide the execution plan.
Charlie =============================================================== Msg 3903, Level 16, State 1, Line 1736 The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
| |
Topic  |
|
|
|