Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Remove Data from Large Recordset

Author  Topic 

drman
Starting Member

38 Posts

Posted - 2007-09-17 : 15:45:13
I am having some trouble trying to remove a large number of records from table1 when they are found in table2...

Create Procedure dbo.usp_RemoveItems
(
@AccountID int,
@PONumber int
)
As
Set NOCOUNT ON

Delete inventory
from inventory inner join Purchase
on inventory.UnitID = Purchase.UnitID
where inventory.AccountID = @AccountID and
(Purchase.PONumber = @PONumber and Purchase.Action = 'D')

Set NOCOUNT OFF

GO

The Purchase table could have 1000 items that are being joined to an inventory table that has 500000 items... thus a huge recordset...

I improved the code from what it was, but still has issues with this much data...

HELP PLEASE!!!

Yak

Zoroaster
Aged Yak Warrior

702 Posts

Posted - 2007-09-17 : 15:48:26
quote:
Originally posted by drman

I am having some trouble trying to remove a large number of records from table1 when they are found in table2...

Create Procedure dbo.usp_RemoveItems
(
@AccountID int,
@PONumber int
)
As
Set NOCOUNT ON

Delete inventory
from inventory inner join Purchase
on inventory.UnitID = Purchase.UnitID
where inventory.AccountID = @AccountID and
(Purchase.PONumber = @PONumber and Purchase.Action = 'D')

Set NOCOUNT OFF

GO

The Purchase table could have 1000 items that are being joined to an inventory table that has 500000 items... thus a huge recordset...

I improved the code from what it was, but still has issues with this much data...

HELP PLEASE!!!

Yak



What issue are you having specifically?



Future guru in the making.
Go to Top of Page

drman
Starting Member

38 Posts

Posted - 2007-09-17 : 16:01:30
It goes to 50 processor utilization and then it timesout when the preset timeout is reached (tested it with 10 minutes).

If I break the purchase into four groups, it processes, but slowly. The issues I think is having an inner join that has 500000 records.

Thanks

Yak
Go to Top of Page

Zoroaster
Aged Yak Warrior

702 Posts

Posted - 2007-09-17 : 16:48:41
quote:
Originally posted by drman

It goes to 50 processor utilization and then it timesout when the preset timeout is reached (tested it with 10 minutes).

If I break the purchase into four groups, it processes, but slowly. The issues I think is having an inner join that has 500000 records.

Thanks

Yak



In the past whenever we had to do large deletes we would have to get creative with it ( in production ). We either used cursors to delete in intervals and commit ever x number of rows processed or in extreme cases we used DTS to export everything that needed to stay, truncate the table after hours and the reimport. This was a while back before I knew better and before I knew about this site =) I hope one of the experts here can answer the "right" way to do this.



Future guru in the making.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-17 : 16:52:51
1,000 vs 500,000 is nothing.
We are operating on table with 1,900,000 vs 5,400,000 records and a similar delete takes about 4-5 seconds.

Are you having a lot if indexes on the Inventory table?
Also try table alias, as this
CREATE PROCEDURE dbo.usp_RemoveItems
(
@AccountID INT,
@PONumber INT
)
AS

SET NOCOUNT ON

DELETE i
FROM Inventory AS i
INNER JOIN Purchase AS p ON p.UnitID = i.UnitID
WHERE i.AccountID = @AccountID
AND p.PONumber = @PONumber
AND p.Action = 'D'



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

drman
Starting Member

38 Posts

Posted - 2007-09-18 : 08:36:20
Thanks for the response.

The Inventory table has 5 indexes - most of which are built on 1 field.

I can try the alias on the tables, but will they improve the performance or do they do more for the readability of the stored proc?

DRMAN
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-18 : 09:17:00
Try adding indexes covering:

inventory: AccountID, UnitID
Purchase: UnitID, PONumber, Action

See also: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=55210&SearchTerms=Bulk+Delete+of+Records

Kristen
Go to Top of Page

drman
Starting Member

38 Posts

Posted - 2007-10-22 : 09:15:37
Well I am back after a few weeks, the error (timeout) stopped for a couple of weeks and came back on friday??

Why it has been ok for a couple of weeks and now it went bad is beyond me, but it did force me to revisit it again and this is what I came up with...

The original SQL statement which would timeout on occasion:

Delete inventory
from inventory inner join Purchase
on inventory.UnitID = Purchase.UnitID
where inventory.AccountID = @AccountID and
(Purchase.PONumber = @PONumber and Purchase.Action = 'D')


The modified that takes less than a half a second.

Delete inventory
from inventory inner join Purchase
on inventory.UnitID = Purchase.UnitID
where inventory.AccountID = @AccountID and
Purchase.PONumber = @PONumber and Purchase.Action = 'D'

Apparently, the parentheses are causing some type of extra processing. I am not really sure why I put them there is the first place. Is there any reason not to have them?

Any thoughts?

Thanks again for all of the input and just being out there to bounce ideas off of.

DRMan
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-22 : 09:26:32
If you run the two delete queries after each other, all records to be deleted
is removed by first DELETE statement.

Then why are you so suprised second DELETE statement (same criterias) take ony a few seconds?

Hint: The records are already deleted...



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-22 : 09:29:36
If you are using transactions to rollback the deleted records between calls,
there is a great likelyhood that the query is cached if second DELETE is much faster.

Try to run them in opposite order.




E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

drman
Starting Member

38 Posts

Posted - 2007-10-22 : 09:45:49
Peso,

I am not calling both of the statements. The first one is how it was originally written and the second one is a modified copy of the first.

If I have the data in the tables and I run the first one, the statement will timeout (some of the time)

If I have the data in the tables and run the second one, the data is then deleted from the tables.

The ( and ) in the first are doiing something to how sql process the statement and I was wondering if anyone knew why?

DRMan
Go to Top of Page
   

- Advertisement -