
 Data Purging Takes More Time....


Prabu
Starting Member

6 Posts

Posted - 2005-11-17 : 06:40:30
Hi all,

We have a requirement from our client to delete all of the information related to an existing sub-client.

There are about 59-60 tables that contain the related information, so we have to delete the matching records from all 59 tables.

The approach we have decided on goes like this: first we fetch the relevant customer IDs from the primary table (a total of 400,000 records) into a temporary Parent table that we created. Then we fetch just 100 records at a time from the Parent table and insert them into a temporary Child table.

We have then constructed a loop that takes those 100 records from the Child table and deletes the matching rows from each of the 59 tables.

This process takes around 2-3 minutes per loop iteration (since the table count is high, i.e. 59, and the record count in each table exceeds 60,000). Across the full 400,000 records it adds up to several days, which is not acceptable to our client.

Is there any other approach we can use in this kind of scenario to make the purge process faster and more efficient?

I have a lot of confidence in this group, and I have experienced its help in the past. It would be great if I got any help on this.

My logic looks like the below:
WHILE (number of records in Parent table > 0)
BEGIN
    SET ROWCOUNT 100
    INSERT the next 100 records into the Child table from the Parent table

    DELETE TB
    FROM childtable MP
    JOIN Table1 TB ON MP.clientID = TB.clientID

    DELETE TB
    FROM childtable MP
    JOIN Table2 TB ON MP.clientID = TB.clientID

    -- ...repeated for each of the 59 tables...
END
Kindly let me know if any more information is needed.

Thanks and regards,
Prabu.P

Kristen
Test

22859 Posts

Posted - 2005-11-17 : 07:05:59
I wouldn't bother with putting the "work to do" in the Parent table in lots of 100, I would just get on with the deleting, but regulate the size of the DELETE batches.

We use a self-adjusting loop for this type of "purging"; it has a Low and a High threshold. When it runs too quickly, the number of deletions per loop is increased a bit (e.g. by 25%); when it runs too slowly, the number is heavily decreased (e.g. halved). It pauses for a few seconds between batches to allow other processes to resume. Thus if the machine is busy it works less hard, and if the server is idle it works harder! It looks something like this:

DECLARE @DEL_ROWCOUNT int,
        @intLowLimit int,
        @intHighLimit int,
        @intRowCount int,
        @intErrNo int,
        @dtStart datetime

SELECT  @DEL_ROWCOUNT = 200000, -- Initial "safe" value
        @intLowLimit = 10,      -- If delete runs for less, increase batch size 25%
        @intHighLimit = 20,     -- If delete runs for more, halve batch size
        @intRowCount = 1,       -- Force first iteration
        @intErrNo = 0

WHILE @intRowCount > 0 AND @intErrNo = 0
BEGIN
    SELECT @dtStart = GetDate()
    SET ROWCOUNT @DEL_ROWCOUNT
    DELETE D
    FROM MyTable AS D
    JOIN MyOtherTable AS O
        ON O.MyPK = D.MyPK
    WHERE O.clientID = 'FredBlogs'
    SELECT @intErrNo = @@ERROR, @intRowCount = @@ROWCOUNT
    -- Adjust ROWCOUNT according to whether the batch was too slow or too fast
    SELECT @DEL_ROWCOUNT =
        CASE WHEN DATEDIFF(Second, @dtStart, GetDate()) < @intLowLimit
             THEN @DEL_ROWCOUNT * 1.25 -- Increase 25%
             WHEN DATEDIFF(Second, @dtStart, GetDate()) > @intHighLimit
             THEN @DEL_ROWCOUNT / 2    -- Halve it
             ELSE @DEL_ROWCOUNT        -- No adjustment
        END
    WAITFOR DELAY '000:00:05' -- Time for other processes to "resume"
END
SET ROWCOUNT 0 -- Turn off batch size limits
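
Applied to the original question, the same batched DELETE could simply be repeated per table inside one loop. This is only a sketch: Table1, Table2, childtable and clientID are placeholders for the real schema, and the row counts are accumulated so the loop only ends when no table has anything left to delete:

DECLARE @DEL_ROWCOUNT int, @intRowCount int, @intErrNo int
SELECT  @DEL_ROWCOUNT = 10000, -- Batch size; could be made adaptive as above
        @intRowCount = 1       -- Force first iteration

WHILE @intRowCount > 0
BEGIN
    SET ROWCOUNT @DEL_ROWCOUNT
    SELECT @intRowCount = 0, @intErrNo = 0

    DELETE D
    FROM Table1 AS D
    JOIN childtable AS MP ON MP.clientID = D.clientID
    SELECT @intErrNo = @intErrNo + @@ERROR,
           @intRowCount = @intRowCount + @@ROWCOUNT

    DELETE D
    FROM Table2 AS D
    JOIN childtable AS MP ON MP.clientID = D.clientID
    SELECT @intErrNo = @intErrNo + @@ERROR,
           @intRowCount = @intRowCount + @@ROWCOUNT

    -- ...repeat the DELETE/accumulate pair for each of the 59 tables...

    IF @intErrNo <> 0 BREAK -- Stop on any error
END
SET ROWCOUNT 0 -- Turn off batch size limits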

Kristen

Prabu
Starting Member

6 Posts

Posted - 2005-11-17 : 07:25:03
Thanks, Kristen,
The approach you have suggested is really good.
But in my case, please assume that the purging will run only when the server is idle.
Moreover, the client requires consistency across the tables while deleting.
In this case, kindly suggest whether a bulk delete on all the tables will really improve performance.

Once again, thanks for your help.




quote:
Originally posted by Kristen

I wouldn't bother with putting the "work to do" in the Parent table in lots of 100, I would just get on with the deleting, but regulate the size of the DELETE batches.

[snip]

Kristen


Thanks & Regards,
Prabu.P

Kristen
Test

22859 Posts

Posted - 2005-11-17 : 07:55:08
In that case you can take the WAITFOR out, so it runs continuously. But you should make sure that it COMMITs each delete batch. I think it will run more quickly in smaller batches because there is less to log and "adjust" per batch. It will also keep the transaction log size under control (assuming you have frequent enough TLog backups).
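
One way to commit each batch separately is to wrap the DELETE step of the loop in an explicit transaction. A sketch only, reusing the variable and placeholder table names from the loop above:

    -- Inside the WHILE loop, replacing the bare DELETE:
    BEGIN TRANSACTION
    DELETE D
    FROM MyTable AS D
    JOIN MyOtherTable AS O
        ON O.MyPK = D.MyPK
    WHERE O.clientID = 'FredBlogs'
    SELECT @intErrNo = @@ERROR, @intRowCount = @@ROWCOUNT
    IF @intErrNo = 0
        COMMIT TRANSACTION   -- Release locks; TLog space freed at next log backup
    ELSE
        ROLLBACK TRANSACTION -- Abandon this batch; the loop condition then stops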

We delete between 2,000,000 and 5,000,000 rows each night from our databases, and that takes between 20 minutes (5,000,000 rows on a "decent" server) and 60 minutes (2,000,000 on a lousy server). That includes a 5-second WAITFOR every 200,000 rows, BCP'ing all the data out before it starts, and then rebuilding the indexes and updating the statistics.
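
The index and statistics maintenance mentioned above might look like this on SQL Server 2000/2005 ('MyTable' is a placeholder for each purged table):

-- After the purge completes: rebuild indexes and refresh statistics
DBCC DBREINDEX ('MyTable')              -- Rebuild all indexes on the table
UPDATE STATISTICS MyTable WITH FULLSCAN -- Recompute statistics over all rows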

So depending on your hardware, I don't think this should take you very long.

Kristen

Prabu
Starting Member

6 Posts

Posted - 2005-11-17 : 08:51:09
Once again, thanks for your extended help.
I will try implementing your suggestion, and I will surely get back to you.
Thanks,
Prabu.P


   
