BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-07 : 15:27:21
Hello, newbie here... I have a huge dataset in MS SQL Server: over 11 million records of machine data taken every four seconds. I really don't need samples at this interval; I'd like to run a query that retrieves the data at every 30 minute interval. I know enough SQL and DTS to SELECT what fields I want, and how to direct the output to a CSV file, but I'm not sure how to manipulate the TimeStamp field in the query to pull only every 450th record. TIA
dataguru1971
Master Smack Fu Yak Hacker
1464 Posts |
Posted - 2007-12-07 : 15:39:51
This isn't really clear; you will have to provide more information. You want every 450th successive record for a specific timestamp, i.e. one every thirty minutes?

Poor planning on your part does not constitute an emergency on my part.
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-08 : 10:22:11
your system grows by 11m rows every 4 seconds? if so you are not a newbie. you are a god.

elsasoft.org
dataguru1971
Master Smack Fu Yak Hacker
1464 Posts |
Posted - 2007-12-08 : 10:30:10
quote: Originally posted by jezemine
your system grows by 11m rows every 4 seconds? if so you are not a newbie. you are a god.

Poor planning on your part does not constitute an emergency on my part.
georgev
Posting Yak Master
122 Posts |
Posted - 2007-12-08 : 16:55:56
If you only want data for every 30 minute interval...
[CODE]
SELECT dateStampField, data
FROM myTable
WHERE DatePart(hh, dateStampField) IN (0, 30)
[/CODE]
George
<3 Engaged!
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-12-08 : 23:50:48
here's a generic solution to grab every 450th row from any table:
[CODE]
select *
from (
    select ROW_NUMBER() OVER (ORDER BY MyColumn) num, MyColumn, OtherColumn, AndSoOn
    from MyTable
) a
where a.num % 450 = 0 -- just change 450 to something else if you want a different interval
[/CODE]
elsasoft.org
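jezemine's every-450th-row trick can be sketched outside SQL Server. Here is a minimal simulation using Python's stdlib sqlite3 module (requires SQLite 3.25+ for window functions); the table and column names are invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (MyColumn INTEGER, OtherColumn TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)",
                 [(i, "reading %d" % i) for i in range(1, 1001)])

# Number the rows in MyColumn order, then keep only every 450th one.
kept = conn.execute("""
    SELECT MyColumn FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY MyColumn) AS num, MyColumn
        FROM MyTable
    ) a
    WHERE a.num % 450 = 0
""").fetchall()
print([r[0] for r in kept])   # -> [450, 900]
```

Note that ROW_NUMBER() only exists in SQL Server 2005 and later, so this approach depends on the server version.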
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-10 : 07:27:30
quote: Originally posted by jezemine
your system grows by 11m rows every 4 seconds? if so you are not a newbie. you are a god.

Sorry I wasn't clear. It's not "my" system, but one that I have to pull data from. It's a power plant monitoring system (SCADA) and it writes sensor values every four seconds. There are 125+ sensors on each engine, and there are nine engines running, so there's a lot of information being stored. My eleven million row dataset is only four days' worth. We need to evaluate the sensor data, but not that precisely. One data point every 30 minutes would be fine. So since the system writes data 15 times per minute, and I only need every 30 minutes, I figure about every 450th (30 * 15) record would work.
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-12-10 : 07:32:25
maybe you should average the data in a 30 minute interval and poll that? or create a standard deviation on the data... or something like that. i know that a single 4-second reading every 30 minutes isn't a very reliable indicator for anything.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com
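spirit1's averaging idea, one aggregated value per 30-minute bucket instead of one raw sample, can be sketched with Python's stdlib sqlite3 module. This is a sketch with made-up table names and data; SQLite's strftime stands in for T-SQL's DATEPART:

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts TEXT, value REAL)")
start = datetime(2007, 12, 10, 10, 0, 2)       # readings rarely start on :00 exactly
rows = [((start + timedelta(seconds=4 * i)).isoformat(sep=" "), float(i))
        for i in range(2700)]                  # three hours of 4-second readings
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# One averaged value per 30-minute bucket: group on epoch-seconds / 1800.
avgs = conn.execute("""
    SELECT CAST(strftime('%s', ts) AS INTEGER) / 1800 AS bucket, AVG(value)
    FROM readings
    GROUP BY bucket
    ORDER BY bucket
""").fetchall()
print(len(avgs))   # 6 buckets for the 3-hour span
```

The same shape should be expressible on SQL Server 2000 by grouping on integer date arithmetic (e.g. DATEDIFF-based bucketing) instead of strftime.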
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-10 : 07:43:12
86,400 seconds per day. Sensor triggers every fourth second. That is 21,600 measures per day.
125 sensors per engine and nine engines gives 1,125 sensor values every four seconds, not 11 million.
21,600 x 1,125 = 24,300,000 records per day.
4 days of records is 97,200,000 records...

E 12°55'05.25" N 56°04'39.16"
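Peso's arithmetic can be checked in a couple of lines:

```python
# Sanity check of the row-count arithmetic above.
measures_per_day = 86_400 // 4        # one reading every 4 seconds, per sensor
sensors = 125 * 9                     # 125 sensors on each of nine engines
rows_per_day = measures_per_day * sensors
print(measures_per_day, sensors, rows_per_day, 4 * rows_per_day)
# 21600 1125 24300000 97200000
```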
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-10 : 09:51:13
quote: Originally posted by Peso
86,400 seconds per day. Sensor triggers every fourth second. That is 21,600 measures per day. 125 sensors per engine and nine engines gives 1,125 sensor values every four seconds, not 11 million. 21,600 x 1,125 = 24,300,000 records per day. 4 days of records is 97,200,000 records...

Well, yes, but they're not running 24x7... it's a peak shaving application where we only run when the windmills slow down.
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-10 : 10:02:31
quote: Originally posted by spirit1
maybe you should average the data in a 30 minute interval and poll that? or create a standard deviation on the data... or something like that.

Well, the system has real-time alarms for when the sensor data goes out of spec. We're trying to measure trends, and correlations between sensor trends. Our analyst has suggested the 30 minute window (this all goes to another app for analysis). The data is too massive to run against all data points. I have over 100 DVDs of data on my desk, and my SQL Server installation is choking on just one DVD's worth... I loaded 4 days of data to get my 11 million records in one table (these engines did not run so much during this time).
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-10 : 10:14:25
quote: Originally posted by dataguru1971
This isn't really clear; you will have to provide more information.

Hopefully this is more clear: there is one table in my database with all the info I need. I need all the fields in the table, hence the '*'. Here is the SQL I plug into DTS to create my CSV file:
[CODE]
SELECT *
FROM [BC_Backup1].[Bizwareuser].[RSDataPointSample]
WHERE lRSDataPointId = '1001223'
[/CODE]
This will give me about 19,000 records for a single sensor (lRSDataPointId = 1001223). What I really need is about 40 records for this sensor (19,000 / 450), and 40 records for each of the other 89 sensors. The lRSDataPointId's run from 1001949 to 1002088. I would hope that a really good SQL guru (not me, natch) could write one statement to loop through the entire record set and do this in one pass.
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2007-12-10 : 19:05:14
Does George's solution not work (changing hh (hour) to mi (minute))?
[CODE]
SELECT *
FROM [BC_Backup1].[Bizwareuser].[RSDataPointSample]
WHERE DATEPART(mi, DateField) IN (0, 30)
[/CODE]
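The minute-part filter can be simulated with Python's stdlib sqlite3 module, where CAST(strftime('%M', ts) AS INTEGER) plays the role of DATEPART(mi, ts). The timestamps and sensor id here are fabricated, and the BETWEEN clause anticipates the id-range filter discussed below:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RSDataPointSample (tTimeStamp TEXT, lRSDataPointId INTEGER)")
conn.executemany("INSERT INTO RSDataPointSample VALUES (?, ?)",
                 [("2007-12-10 10:%02d:00" % m, 1001949) for m in range(60)])

# Keep only rows whose minute part is 0 or 30 (the DATEPART(mi, ...) equivalent),
# restricted to the wanted sensor-id range.
hits = conn.execute("""
    SELECT tTimeStamp FROM RSDataPointSample
    WHERE CAST(strftime('%M', tTimeStamp) AS INTEGER) IN (0, 30)
      AND lRSDataPointId BETWEEN 1001949 AND 1002088
    ORDER BY tTimeStamp
""").fetchall()
print([r[0] for r in hits])   # -> ['2007-12-10 10:00:00', '2007-12-10 10:30:00']
```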
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-11 : 07:53:47
quote: Originally posted by Lamprey
Does George's solution not work (changing hh (hour) to mi (minute))? SELECT * FROM [BC_Backup1].[Bizwareuser].[RSDataPointSample] WHERE DATEPART(mi, DateField) IN (0, 30)

Yes, it worked very nicely. Can I also add an 'AND' condition to the WHERE clause to retrieve only lRSDataPointId's between 1001949 and 1002088?
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-11 : 07:58:55
Try.

E 12°55'05.25" N 56°04'39.16"
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-11 : 08:44:45
George's solution is exactly what I asked for: data points every 30 minutes. However, there are 15 occurrences during that one minute period (data points every 4 seconds, so 15 readings each minute). Is there a parameter to the DATEPART function that would retrieve only one reading (of the 15), then move on to the next 30 minute interval? Sorry for asking so many newbie questions. I'm supposed to have a SQL expert on my team, but so far he's been letting me down on his assignments.
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-11 : 08:49:41
quote: Originally posted by Peso
Try.

Not too shabby. I added the extra <= and >= conditions to the WHERE clause and reduced my data output nicely. Thanks!
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-11 : 08:58:22
quote: Originally posted by BirdWatcher
George's solution is exactly what I asked for, data points every 30 minutes. However, there are 15 occurrences during that one minute period... Is there a parameter to the DATEPART function that would retrieve only one reading (of the 15), then move on to the next 30 minute interval?

So I'm gathering that a tweak of the DATEPART function would give me what I want. If the "IN" parameter that gives me every 30 minutes was "(0, 30)", should I change it to "DATEPART(ss, DateFld) IN (0, 1800)" to get one second's reading every 30 minutes?
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-12 : 07:38:07
OK, I've almost got it. Here's the statement that returns 41,000 records:
[CODE]
SELECT *
FROM [BC_Backup1].[Bizwareuser].[RSDataPointSample]
WHERE (DATEPART(mi, tTimeStamp) IN (0, 30)
   AND (lRSDataPointId >= '1001949' AND lRSDataPointId <= '1002088'))
[/CODE]
What I get is all the data on the hour or half-hour, fifteen records each. So for example, at 10:00am I have records for 10:00:02am, 10:00:06am, 10:00:10am, etc.... every four seconds. I would write another DATEPART statement using the 'ss' parameter, but the seconds are not set in stone; they change depending on when the engine started. So for my example above it was at 10:00:02, but the next dataset might be at 10:00:05. Can I use an aggregate function (MAX?) to retrieve just ONE of the 15 records at 10:00am? I tried it, but rec'd an error. Any other suggestions?
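One pattern that avoids window functions entirely (and so should fit SQL Server 2000) is a derived table holding MIN(tTimeStamp) per 30-minute bucket, joined back to the full table to get exactly one row per half hour regardless of the seconds offset. A sketch of that shape using Python's stdlib sqlite3 module, with all names and data invented:

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RSDataPointSample (tTimeStamp TEXT, lRSDataPointId INTEGER, value REAL)")
start = datetime(2007, 12, 10, 10, 0, 2)       # seconds offset depends on engine start
rows = [((start + timedelta(seconds=4 * i)).isoformat(sep=" "), 1001949, float(i))
        for i in range(1800)]                  # two hours of 4-second readings
conn.executemany("INSERT INTO RSDataPointSample VALUES (?, ?, ?)", rows)

# Earliest timestamp inside each 30-minute bucket, joined back for the full row.
one_per_half_hour = conn.execute("""
    SELECT s.tTimeStamp, s.value
    FROM RSDataPointSample s
    JOIN (SELECT MIN(tTimeStamp) AS first_ts
          FROM RSDataPointSample
          GROUP BY CAST(strftime('%s', tTimeStamp) AS INTEGER) / 1800) b
      ON s.tTimeStamp = b.first_ts
    ORDER BY s.tTimeStamp
""").fetchall()
print([r[0] for r in one_per_half_hour])
# -> one row each at 10:00:02, 10:30:02, 11:00:02, 11:30:02
```

The join-back step is what MAX/MIN alone cannot do: the aggregate picks the representative timestamp per bucket, and the join retrieves that row's remaining columns.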
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-12-12 : 07:43:01
Are you using SQL Server 2000 or SQL Server 2005?

E 12°55'05.25" N 56°04'39.16"
BirdWatcher
Starting Member
17 Posts |
Posted - 2007-12-12 : 08:42:33
quote: Originally posted by Peso
Are you using SQL Server 2000 or SQL Server 2005?

2000, I believe. SQL Server Enterprise Manager v8.0. The copyright notice says 2000.