Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 IS this the most efficient?

Author  Topic 

mrtweaver
Yak Posting Veteran

67 Posts

Posted - 2009-08-12 : 07:55:16
I have a very large table currently around 900K records. I am in the process of building a query that will only retrieve the first record for each grouping where the value in a particular column has changed. Currently I compare the table to itself with the compared table being current record +1. Here is the where clause:

WHERE t1.mechpg != t2.mechpg (t1 is the name of the database and t2 is the name +1 record)

Here is the data that is in that table:

43 12345 99887 0 0 0 0 8/10/2009 6:00
43 12345 99887 0 0 0 0 8/10/2009 6:05
43 12345 99887 0 0 0 1 8/10/2009 6:10
43 12345 99887 0 0 0 1 8/10/2009 6:15
43 12345 99887 0 0 0 1 8/10/2009 6:20
43 12345 99887 0 0 0 0 8/10/2009 6:25
43 12345 99887 0 0 0 0 8/10/2009 6:30
43 12345 99887 0 0 0 1 8/10/2009 6:35
43 12345 99887 0 0 0 0 8/10/2009 6:40

Now for what I would like on the output:

43 12345 99887 0 0 0 1 8/10/2009 6:10
43 12345 99887 0 0 0 0 8/10/2009 6:25
43 12345 99887 0 0 0 1 8/10/2009 6:35
43 12345 99887 0 0 0 0 8/10/2009 6:40

So you see every where where the value changes from a 0 to 1 or from a 1 to 0 I would like to retrive that record. The query I have seems to work ok but it takes an awfully long time to process.

I am definately thinking of indexes but not sure of the following question:

How Many indexes can you have on one table? Currently this table that the data is comming from has 3 indexes on it. Of course the mechpg column is not one of the items that is indexed and since it only changes from a 1 to 0 or from a 0 to a 1 can it be indexed?

What would be really helpful is if either someone on this board lived locally to central PA that I could get together with for a cup of coffee and discuss this or something along the lines of goto meeting. That way the person could see exactly what I have. But oh well.

Anyway as you migh have guessed I am definately new to this programming and dont fully comprehend all the details and specifications which is why I am asking the guru community for assistance.

What is the best way to accomplish this task so that the query runs at the fastest possible speed and returns the proper data requested.

Please be as specific as possible so that a SQL virgin can wrap their tiny mind around the solution. Thank you.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-08-12 : 08:57:24
I blogged about this yestereday.
You can use a "Running streak" algorithm here
http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Another-running-streaks-algorithm.aspx



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

mrtweaver
Yak Posting Veteran

67 Posts

Posted - 2009-08-12 : 13:13:26
I did not try the method that you blogged about because of lack of knowledge about the way you have it laid out and the 100000 size tally file you talk about. As I said in my OP the current size of my table is 900000+ so does that mean that I would need a tally file of 900000+ or should it be good to go with only 100000?

I did however try the other method, the one in which your blog points to. That one did not work for me. If I changed the fields and data source to match my circumstance and use the where clause: WHERE G.Result <> GR.Result AND G.Date <= GR.Date). I get no results because of SERVER TIMEOUT ERRORS. If I take off the greater than or equal to for the date portion, I get results but it takes an extremely long time. Dont know why.


quote:
Originally posted by Peso

I blogged about this yestereday.
You can use a "Running streak" algorithm here
http://weblogs.sqlteam.com/peterl/archive/2009/08/12/Another-running-streaks-algorithm.aspx



N 56°04'39.26"
E 12°55'05.63"


Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-08-12 : 13:20:04
This should work fast!
DECLARE	@Sample TABLE
(
Col1 INT,
Col2 INT,
Col3 INT,
Col4 INT,
Col5 INT,
Col6 INT,
Col7 INT,
Col8 DATETIME
)

INSERT @Sample
SELECT 43, 12345, 99887, 0, 0, 0, 0, '8/10/2009 6:00' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 0, '8/10/2009 6:05' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 1, '8/10/2009 6:10' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 1, '8/10/2009 6:15' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 1, '8/10/2009 6:20' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 0, '8/10/2009 6:25' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 0, '8/10/2009 6:30' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 1, '8/10/2009 6:35' UNION ALL
SELECT 43, 12345, 99887, 0, 0, 0, 0, '8/10/2009 6:40'

SELECT w.Col7,
MIN(w.Col8)
FROM (
SELECT Col7,
Col8,
ROW_NUMBER() OVER (ORDER BY Col8) AS recID
FROM @Sample
) AS w
INNER JOIN (
SELECT Col7,
Col8,
ROW_NUMBER() OVER (PARTITION BY Col7 ORDER BY Col8) AS recID
FROM @Sample
) AS x ON x.Col7 = w.Col7
AND x.Col8 = w.Col8
WHERE w.recID > x.recID
GROUP BY w.Col7,
w.recID - x.recID
ORDER BY MIN(w.Col8)


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

mrtweaver
Yak Posting Veteran

67 Posts

Posted - 2009-08-13 : 08:57:25
Peso, I was wondering, I dont really know that much about stored proceedures but if I give you the query and the particulars could you assist me in writing a stored proceedure that will do the trick?

Here is a query that is currently a view but I want to make it a stored proceedure so that I can send over via variables.
-----------------------------------------------------------
SELECT MACHINE, JOBTASK, SHIFT, START, STOP, MECHPG, MECH1, MECH2, Prod_date, CASE WHEN t1.mech1 = 0 AND
t1.mechpg = 1 THEN 'LiteOn' WHEN t1.mech1 > 0 AND t1.mechpg = 1 THEN 'MechResp' WHEN t1.mech1 > 0 AND t1.mechpg = 0 AND
status = 'byps' THEN 'Bypass' ELSE t1.status END AS verb1
FROM dbo.DJT_Prodtrack AS t1
WHERE (MECHPG = 1) AND (Prod_date > CONVERT(DATETIME, '2009-08-01 06:00:00', 102)) OR
(MECHPG = 0) AND (MECH1 <> 0) AND (Prod_date > CONVERT(DATETIME, '2009-08-01 06:00:00', 102))

---------------------------------------------------------------
The two underlined areas are datetime fields in case you did not see that already and these would be the fields that would be sent over via variables.

Now I am sure you also need to know the layout of the table. So here that is:

Machine = INT
Shift = INT
JOBTASK = Big INT
Start = datetime
Stop = datetime
mechpg = int
mech1 = int
mech2 = int
prod_date = datetime

-----------------------------------------------------------

I think that is everything but if you have further questions please ask.

BTW - If someone else wishes to answer this as well they are more than welcome to. I only picked Peso because he has been the one that has been responding already. Thanks all and have a great day.

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-08-13 : 09:19:51
Post your sample data again, with correct column headers.
See my example posted 08/12/2009 : 13:20:04


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-08-13 : 09:22:30
[code]SELECT MACHINE,
JOBTASK,
SHIFT,
START,
STOP,
MECHPG,
MECH1,
MECH2,
Prod_date,
CASE
WHEN mech1 = 0 AND mechpg = 1 THEN 'LiteOn'
WHEN mech1 > 0 AND mechpg = 1 THEN 'MechResp'
WHEN mech1 > 0 AND mechpg = 0 AND status = 'byps' THEN 'Bypass'
ELSE status
END AS verb1
FROM dbo.DJT_Prodtrack
WHERE Prod_date > '2009-08-01 06:00'
AND (MECHPG = 1 OR MECHPG = 0 AND MECH1 <> 0)[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

mrtweaver
Yak Posting Veteran

67 Posts

Posted - 2009-08-13 : 10:01:52
This select statement looks exactly like the one I posted with the exception of a few items changed around.

WHat I am looking for is a stored proceedure where I can send over some variables to limit the results of this query. I am thinking something along the lines of what you posted on 8/12/2009. The only thing is I dont know exactly how to lay out the stored proceedure and declare the variables so that the query given below will be limited to the dates sent over from the application.

I thought by giving you the query that I have that creates a view, and giving you the headers and data types that that would be enough to write a stored proceedure that I can adapt to my application. As I have said I am really new to this and not as well educated as some on this site. I know what I want to do but I dont know how to get there from here. I know there are @ signs that I use and some things such as ANSI OFF and such but dont know the correct layout. THis is why I am asking for assistance.

This query given here retrieves data from a table that is 2108905 records long, it has data in it from all the way back in Jan 2009. When I run this query from my application I want to be able to send over via variables the start and end dates to limit the results. Does this make it more clear?

quote:
Originally posted by Peso

SELECT	MACHINE,
JOBTASK,
SHIFT,
START,
STOP,
MECHPG,
MECH1,
MECH2,
Prod_date,
CASE
WHEN mech1 = 0 AND mechpg = 1 THEN 'LiteOn'
WHEN mech1 > 0 AND mechpg = 1 THEN 'MechResp'
WHEN mech1 > 0 AND mechpg = 0 AND status = 'byps' THEN 'Bypass'
ELSE status
END AS verb1
FROM dbo.DJT_Prodtrack
WHERE Prod_date > '2009-08-01 06:00'
AND (MECHPG = 1 OR MECHPG = 0 AND MECH1 <> 0)



N 56°04'39.26"
E 12°55'05.63"


Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-08-13 : 10:15:43
Can you PLEASE provide some sample data in the format I posted 08/12/2009 : 13:20:04?
I have NO clue which column name belongs to which column value!



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page
   

- Advertisement -