 Insert: only data that hasn't been inserted prior!


mitin
Yak Posting Veteran

81 Posts

Posted - 2013-05-30 : 06:32:04
Hi,

I have a query that takes data from one table and inserts it into another (the table that the data is being selected from grows constantly). The query is part of a job that I have set up to run every minute. However, as it stands, the query re-inserts the same data plus any new data each time it runs.

How can I only insert the NEW data, since the last time the job ran?

Help greatly appreciated, my query is below:


use database1

insert into sampletable
(PMHost,
PMInstance,


SELECT

PMHost,
PMInstance,


FROM

[Warehouse].[sa_uaf_user].[K99_GEN]
PIVOT(MAX(PMValue) FOR PM_Counter in


(
[Disk Write Bytes/Sec],
[% Free Space]
))P

where PMObject = 'disk' and
([Disk Write Bytes/Sec] is not null or
[% Free Space]is not null)

group by PMHost, pminstance

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-05-30 : 07:03:38
[code]
use database1

insert into sampletable
(PMHost,
PMInstance)

SELECT

PMHost,
PMInstance

FROM

[Warehouse].[sa_uaf_user].[K99_GEN]
PIVOT(MAX(PMValue) FOR PM_Counter in
(
[Disk Write Bytes/Sec],
[% Free Space]
))P

where PMObject = 'disk' and
([Disk Write Bytes/Sec] is not null or
[% Free Space] is not null)
-- skip any PMHost/PMInstance pair that is already present in the target table
and NOT EXISTS (SELECT 1 FROM sampletable s WHERE s.PMHost = P.PMHost AND s.PMInstance = P.PMInstance)
group by PMHost, PMInstance
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-05-30 : 08:25:13
There are a couple of syntax errors in the query you posted, which I assume are typos; I have fixed them below.

Is the combination of PMHost and PMInstance unique in the sampletable? If it is, then you can use the NOT EXISTS clause.

But I suspect not, because you are pivoting the source table, and the PMValue could increase or decrease between two successive inserts and so return to a previously inserted value. If that is indeed true, and you still want to insert the rows that have been added since the last run, you will need to use some other criterion to weed out the data that you don't want to insert.

Is there anything in the [Warehouse].[sa_uaf_user].[K99_GEN] table that can be used to determine what was inserted and what was not?
-- correcting syntax errors
INSERT INTO sampletable
( PMHost ,
PMInstance
)
SELECT PMHost ,
PMInstance
FROM [Warehouse].[sa_uaf_user].[K99_GEN] PIVOT( MAX(PMValue) FOR PM_Counter IN ( [Disk Write Bytes/Sec],
[% Free Space] ) ) P
WHERE PMObject = 'disk'
AND ( [Disk Write Bytes/Sec] IS NOT NULL
OR [% Free Space] IS NOT NULL
)
GROUP BY PMHost ,
pminstance
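
One way to answer the uniqueness question is a duplicate-count query against the target table. This is only a sketch using the table and column names from the query above; an empty result shows the pair is unique in the data so far, not that it always will be.

```sql
-- Lists every PMHost/PMInstance pair that occurs more than once in sampletable.
-- No rows returned means the pair is unique in the data currently stored.
SELECT  PMHost,
        PMInstance,
        COUNT(*) AS RowsPerPair
FROM    sampletable
GROUP BY PMHost,
        PMInstance
HAVING  COUNT(*) > 1;
```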

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-05-31 : 06:01:46
James K, the combination of PMHost and PMInstance is not unique in the sample table, so I think you are correct and I need a different way of doing this...

The only way I can think of is to alter the table the data is coming from so that a new column is added, which can be populated when a row's data is inserted into the new table. How would I go about doing this?

What is the most straightforward way of doing what I need to do here?

Many thanks for the replies guys :)

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-05-31 : 10:11:57
You will need some column in the [Warehouse].[sa_uaf_user].[K99_GEN] table that lets you identify rows that have been previously processed. For example, if that table has an identity column, you can use that as the marker. A timestamp column would also work, although that is less reliable.

If you do have such a column, for example a timestamp column, you will need to add another column to the sampletable and insert the current timestamp into that column along with PMHost and PMInstance. Then your select query should be modified to take into account only entries later than the last timestamp in sampletable.

I don't completely understand your business rules, so I am speaking in general terms; so this may not be exactly what you need.
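
As a rough sketch of the marker-column idea, assuming the source table has a datetime column to copy from: SourceTimestamp below is a hypothetical column name that is not part of the original schema, and the datetime type is likewise an assumption.

```sql
-- Hypothetical column; choose a type that matches the source table's timestamp column.
ALTER TABLE sampletable ADD SourceTimestamp datetime NULL;
GO  -- batch separator (SSMS/sqlcmd), so the next statement compiles against the new column

-- Latest source timestamp copied so far; NULL until the column has been populated.
SELECT MAX(SourceTimestamp) AS LastLoaded
FROM   sampletable;
```

The insert job would then also fill SourceTimestamp, and select only source rows later than LastLoaded.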

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-03 : 06:59:51
I already have the timestamp column in both tables, so I just need to add something like:

WHERE timestamp > sampletable.timestamp

right?

I don't think the syntax is right here because the above doesn't work, but I'm sure the theory must be possible to implement in SQL.

It is as easy as I'm thinking here, isn't it? How would this be done?

Thanks

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-06-04 : 08:23:35
You would either need to join to the sample table or capture the maximum time beforehand. So it would be something like this:
INSERT  INTO sampletable
( PMHost ,
PMInstance
)
SELECT PMHost ,
PMInstance
FROM [Warehouse].[sa_uaf_user].[K99_GEN] PIVOT( MAX(PMValue) FOR PM_Counter IN ( [Disk Write Bytes/Sec],
[% Free Space] ) ) P
WHERE PMObject = 'disk'
AND ( [Disk Write Bytes/Sec] IS NOT NULL
OR [% Free Space] IS NOT NULL
)
AND TIMESTAMP > (SELECT MAX(TIMESTAMP) FROM sampletable)
GROUP BY PMHost ,
pminstance
Test with sample data in a test environment to see if that is the logic that you in fact want to use.
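
One caveat with the query above, shown as a sketch: while sampletable is still empty, MAX() returns NULL, the comparison is never true, and the first run inserts nothing. Wrapping the subquery in ISNULL with a floor date is one way around that (the '19000101' value is an arbitrary assumption). The sketch also assumes both tables have a datetime column named [Timestamp], and it adds that column to the insert so the target actually records it.

```sql
INSERT INTO sampletable ( PMHost, PMInstance, [Timestamp] )
SELECT P.PMHost,
       P.PMInstance,
       P.[Timestamp]
FROM   [Warehouse].[sa_uaf_user].[K99_GEN]
       PIVOT ( MAX(PMValue) FOR PM_Counter IN ( [Disk Write Bytes/Sec],
                                                [% Free Space] ) ) AS P
WHERE  P.PMObject = 'disk'
  AND  ( P.[Disk Write Bytes/Sec] IS NOT NULL
         OR P.[% Free Space] IS NOT NULL )
  -- ISNULL supplies a floor when sampletable is empty and MAX() returns NULL.
  AND  P.[Timestamp] > ISNULL(( SELECT MAX(s.[Timestamp]) FROM sampletable AS s ),
                              '19000101')
GROUP BY P.PMHost,
       P.PMInstance,
       P.[Timestamp];
```

Qualifying the inner column with the alias s means a wrong column name raises an "Invalid column name" error instead of silently binding to a column of the outer query.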

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-06-04 : 08:24:09
Assuming that K99_GEN.TimeStamp is recorded only when a new record is inserted into K99_GEN, and not when a record is updated, then yes, I think this will work for you. But you'll need to compare against the maximum SampleTable.TimeStamp value, e.g.:

declare @maxSampleTableDate datetime
SELECT @maxSampleTableDate =Max(timestamp) FROM SampleTable
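
A sketch of how that variable might then be used in the full statement, assuming both tables have a datetime column named [Timestamp]. The DECLARE, the assignment and the INSERT have to run together in one batch (no GO between them), because a local variable only exists in the batch that declares it, and the name has to be spelled the same everywhere it is used.

```sql
DECLARE @maxSampleTableDate datetime;

-- NULL if SampleTable is empty; in that case the comparison below matches no rows.
SELECT @maxSampleTableDate = MAX([Timestamp])
FROM   SampleTable;

INSERT INTO SampleTable ( PMHost, PMInstance, [Timestamp] )
SELECT P.PMHost,
       P.PMInstance,
       P.[Timestamp]
FROM   [Warehouse].[sa_uaf_user].[K99_GEN]
       PIVOT ( MAX(PMValue) FOR PM_Counter IN ( [Disk Write Bytes/Sec],
                                                [% Free Space] ) ) AS P
WHERE  P.PMObject = 'disk'
  AND  ( P.[Disk Write Bytes/Sec] IS NOT NULL
         OR P.[% Free Space] IS NOT NULL )
  AND  P.[Timestamp] > @maxSampleTableDate   -- same name as in the DECLARE
GROUP BY P.PMHost,
       P.PMInstance,
       P.[Timestamp];
```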

Cheers
MIK

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-05 : 05:00:12
quote:
Originally posted by jun0

Thanks guys,

but, JamesK, when I try your solution and add the code that you suggested in red, I get the error message:

An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.


and MIK_2008, when I try your solution, I get the error message:

Must declare the scalar variable @maxsampletable


Why do I get these messages, does this help any towards what the actual solution will be?


It seems you're not using it the way James suggested.

Can you post the code you used?


mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-05 : 06:07:13
OK, here's the full query with the amendment James suggested. Why do I get the error that I cannot use an aggregate function in the WHERE clause? What needs to be done to solve this?


use warehouse

insert into logicaldisksample
(PMHost,
PMInstance,
Timestamp01,
PMObject,
DiskWriteBytesSec,
PercFreeSpace,
FreeMegabytes,
SplitIOSec,
MDLReadsSec,
Threads,
InterruptsSec,
PacketsReceivedNonUnicastSec,
PercPinReadHits,
TransitionFaultsSec)

SELECT

PMHost,
PMInstance,
Timestamp,
PMObject,
max ([Disk Write Bytes/Sec]),
max ([% Free Space]),
max ([Free Megabytes]),
max ([Split IO/Sec]),
max ([MDL Reads/sec]),
max ([Threads]),
max ([Interrupts/sec]),
max ([Packets Received Non-Unicast/sec]),
max ([Pin Read Hits %]),
max ([Transition Faults/sec])


FROM

[Warehouse].[sa_itm_user].[K99_GENALARMPUSHALARM]
PIVOT(MAX(PMValue) FOR PM_Counter in


(
[Disk Write Bytes/Sec],
[% Free Space],
[Free Megabytes],
[Split IO/Sec],
[MDL Reads/sec],
[Threads],
[Interrupts/sec],
[Packets Received Non-Unicast/sec],
[Pin Read Hits %],
[Transition Faults/sec],
[Avg. Disk Queue Length],
[Connections Reset],
[Packets Received Discarded],
[Disk Reads/sec],
[Packets Outbound Discarded],
[Available Bytes],
[Disk Read Bytes/sec],
[Packets Sent/sec],
[Pool Nonpaged Bytes],
[Lazy Write Flushes/sec],
[Server Sessions],
[Connections Passive],
[Page Writes/sec],
[Current Bandwidth],
[Active Sessions],
[Pages/sec],
[% Usage Peak],
[Cache Faults/sec],
[% Idle Time],
[Page Faults/sec],
[% Disk Write Time],
[Data Flush Pages/sec],
[Packets Received Unknown],
[Disk Bytes/sec],
[connections Established],
[Copy Reads/sec],
[Processor Queue Length],
[Committed Bytes],
[Disk Writes/sec],
[% Interrupt Time],
[Avg. Disk Read Queue Length],
[System Cache Resident Bytes],
[% Processor Time],
[Packets Received Errors],
[Pool Nonpaged Failures],
[Pin Reads/sec],
[Packets Sent Non-Unicast/sec],
[Connections Active],
[Files Open],
[Fast Reads/sec],
[Bytes Transmitted/sec],
[Segments Retransmitted/sec],
[Read Aheads/sec],
[Data Map Hits %],
[System Calls/sec],
[Files Opened Total],
[Data Map Pins/sec],
[Cache Bytes Peak],
[Total Sessions],
[--Commit Limit],
[Copy Read Hits %],
[Pages Input/sec],
[% Committed Bytes In Use],
[Packets Received/sec],
[Data Maps/sec],
[Data Flushes/sec],
[Free System Page Table Entries],
[% DPC Time],
[Output Queue Length],
[Avg. Disk sec/Transfer],
[Pool Paged Bytes],
[Segments Sent/sec],
[Avg. Disk Write Queue Length],
[Logon/sec],
[Lazy Write Pages/sec],
[Cache Bytes],
[Pages Output/sec],
[Avg. Disk sec/Read],
[Processes],
[--MDL Read Hits %],
[File Directory Searches],
[Segments Received/sec],
[Page Reads/sec],
[Bytes Sent/sec],
[Work Item Shortages],
[% Registry Quota In Use],
[Bytes Received/sec],
[% Privileged Time],
[Bytes Total/sec],
[% Disk Read Time],
[Segments/sec],
[Packets Outbound Errors],
[Context Blocks Queued/sec],
[% Usage],
[Context Switches/sec],
[Avg. Disk sec/Write],
[Connection Failures],
[Fast Read Not Possibles/sec],
[Available MBytes]
))P

where PMObject = 'logicaldisk' and
([Disk Write Bytes/Sec] is not null or
[% Free Space]is not null or
[Free Megabytes] is not null or
[Split IO/Sec] is not null or
[MDL Reads/sec] is not null or
[Threads] is not null or
[Interrupts/sec] is not null or
[Packets Received Non-Unicast/sec] is not null or
[Pin Read Hits %] is not null or
[Transition Faults/sec] is not null)

and timestamp > (select max(timestamp) from logicaldisksample)

group by PMHost, pminstance, Timestamp, pmobject



visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-05 : 06:10:40
Is sampletable the name of your table?


mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-05 : 06:25:19
No, it is 'logicaldisksample'; I have edited my reply above to reflect this. Thanks

bandi
Master Smack Fu Yak Hacker

2242 Posts

Posted - 2013-06-05 : 06:36:02
Hi,

After PIVOT only allowable clause is ORDER BY.....
-- How that green marked code is allowed in below query....
SELECT
PMHost, ....
FROM [Warehouse].[sa_itm_user].[K99_GENALARMPUSHALARM]
PIVOT(MAX(PMValue) FOR PM_Counter in ()) P
WHERE ....
GROUP BY ...


EDIT: http://msdn.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
--That query should be
SELECT
PMHost, ....
FROM
(
SELECT
PMHost, ....
FROM [Warehouse].[sa_itm_user].[K99_GENALARMPUSHALARM]
PIVOT(MAX(PMValue) FOR PM_Counter in ()) P ) Temp
WHERE ....
GROUP BY ...



--
Chandu

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-05 : 06:46:25
Also, that MAX() doesn't make any sense in the select clause if you've already used PIVOT.


mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-05 : 09:20:57
Sorry, but I don't quite get which bits of code need editing based on your replies above. Could you repost the whole of my code with the exact locations for each edit?

I'm really sorry to ask for this, but I'm still new and this is a bit confusing; it could do with being clearer.

Many thanks for the help so far, greatly appreciated!

bandi
Master Smack Fu Yak Hacker

2242 Posts

Posted - 2013-06-05 : 09:38:06
--May be this?
SELECT
PMHost,
PMInstance,
Timestamp,
PMObject,
[Disk Write Bytes/Sec],
[% Free Space],
[Free Megabytes],
[Split IO/Sec],
[MDL Reads/sec],
[Threads],
[Interrupts/sec],
[Packets Received Non-Unicast/sec],
[Pin Read Hits %],
[Transition Faults/sec]
FROM ((SELECT PMHost, PMInstance, Timestamp, PMObject,PMValue, PM_COUNTER
FROM [Warehouse].[sa_itm_user].[K99_GENALARMPUSHALARM]
) T1
PIVOT(MAX(PMValue) FOR PM_Counter in
(
[Disk Write Bytes/Sec],
[% Free Space],
[Free Megabytes],
[Split IO/Sec],
[MDL Reads/sec],
[Threads],
[Interrupts/sec],
[Packets Received Non-Unicast/sec],
[Pin Read Hits %],
[Transition Faults/sec]))P
)Temp
where PMObject = 'logicaldisk' and
([Disk Write Bytes/Sec] is not null or
[% Free Space]is not null or
[Free Megabytes] is not null or
[Split IO/Sec] is not null or
[MDL Reads/sec] is not null or
[Threads] is not null or
[Interrupts/sec] is not null or
[Packets Received Non-Unicast/sec] is not null or
[Pin Read Hits %] is not null or
[Transition Faults/sec] is not null)
and timestamp > (select max(timestamp) from logicaldisksample)


--
Chandu

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-05 : 10:11:48
bandi,

When I use the code that you just posted I get an error:

incorrect syntax near ')'


It is talking about the ')' just before 'Temp'.

Why do I get this? By the way, 'Temp' doesn't highlight blue when I type it into the query window; not sure whether it should...

This is the error message I got prior to my last post, which is what prompted me to ask people to be clear on where to edit the code. I had already edited it as above but got this error...

Thanks

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-06 : 04:00:24
anyone?

bandi
Master Smack Fu Yak Hacker

2242 Posts

Posted - 2013-06-06 : 05:32:49
Check my previous post....

--
Chandu

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-06 : 08:57:46
I still don't see why there is this error?

mitin
Yak Posting Veteran

81 Posts

Posted - 2013-06-07 : 04:39:17
sorry but can anyone else tell me why I have this error?
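
A likely reading of the two errors reported in this thread, offered as a sketch under assumptions rather than a confirmed fix.

The aggregate error: the insert column list suggests that logicaldisksample stores the time in Timestamp01 and has no column called timestamp. If so, max(timestamp) inside the subquery binds to the outer query's timestamp column, which is the "outer reference" case the message describes. Using MAX(Timestamp01) with a table alias avoids that.

The syntax error near ')': an aliased derived table such as Temp has to start with SELECT, and the parenthesised "(...) T1 PIVOT (...) P" does not. The extra wrapper is not needed, because a WHERE clause can follow the PIVOT alias directly.

The query below assumes Timestamp01 is a datetime-compatible column and uses an arbitrary floor date of '19000101' for the first run against an empty target table.

```sql
USE Warehouse;

INSERT INTO logicaldisksample
       ( PMHost, PMInstance, Timestamp01, PMObject,
         DiskWriteBytesSec, PercFreeSpace, FreeMegabytes, SplitIOSec,
         MDLReadsSec, Threads, InterruptsSec, PacketsReceivedNonUnicastSec,
         PercPinReadHits, TransitionFaultsSec )
SELECT P.PMHost, P.PMInstance, P.[Timestamp], P.PMObject,
       P.[Disk Write Bytes/Sec], P.[% Free Space], P.[Free Megabytes],
       P.[Split IO/Sec], P.[MDL Reads/sec], P.[Threads], P.[Interrupts/sec],
       P.[Packets Received Non-Unicast/sec], P.[Pin Read Hits %],
       P.[Transition Faults/sec]
FROM   ( -- Only the columns the pivot needs, so no other column splits the groups.
         SELECT PMHost, PMInstance, [Timestamp], PMObject, PMValue, PM_Counter
         FROM   [Warehouse].[sa_itm_user].[K99_GENALARMPUSHALARM]
         WHERE  PMObject = 'logicaldisk' ) AS T1
       PIVOT ( MAX(PMValue) FOR PM_Counter IN
               ( [Disk Write Bytes/Sec], [% Free Space], [Free Megabytes],
                 [Split IO/Sec], [MDL Reads/sec], [Threads], [Interrupts/sec],
                 [Packets Received Non-Unicast/sec], [Pin Read Hits %],
                 [Transition Faults/sec] ) ) AS P
WHERE  ( P.[Disk Write Bytes/Sec] IS NOT NULL
         OR P.[% Free Space] IS NOT NULL
         OR P.[Free Megabytes] IS NOT NULL
         OR P.[Split IO/Sec] IS NOT NULL
         OR P.[MDL Reads/sec] IS NOT NULL
         OR P.[Threads] IS NOT NULL
         OR P.[Interrupts/sec] IS NOT NULL
         OR P.[Packets Received Non-Unicast/sec] IS NOT NULL
         OR P.[Pin Read Hits %] IS NOT NULL
         OR P.[Transition Faults/sec] IS NOT NULL )
       -- Timestamp01 is the target table's column; ISNULL covers an empty target.
  AND  P.[Timestamp] > ISNULL(( SELECT MAX(s.Timestamp01)
                                FROM   logicaldisksample AS s ), '19000101');
```

Because the derived table T1 exposes only the grouping columns plus PMValue and PM_Counter, the pivot already returns one row per PMHost, PMInstance, Timestamp and PMObject, so the outer MAX() and GROUP BY are not required.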