Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Delete matching rows from one table to another?

Author  Topic 

aiken
Aged Yak Warrior

525 Posts

Posted - 2006-12-02 : 20:30:23
I've got a very large table (a billion rows or so), where I need to delete some of them (about 90 million). Unfortunately, there is no index matching the criteria for the deletion, and I can't create one.

What I have done is created a second table with the rows that need to be deleted... but now I'm stumped on how to use that. There is no surrogate key on the large table, or this would be easy (I also can't add a surrogate key to the big table).

Here's the rough idea:
create table verybig(a int,b int,c smallint)

create table deleterows(a int,b int)

insert into verybig(a,b,c)
select 1,2,500
union all
select 1,3,1500
union all
select 2,3,500
union all
select 2,4,1500

insert into deleterows(a,b)
select a,b
from verybig
where c>1000


That should put the pairs (1,3) and (2,4) into the deleterows table. Now, how in the world do I delete from "verybig" only those rows where both a and b match a row from "deleterows"?

I'm sure this is conceptually easy and I'll smack myself when someone posts the obvious, but at this point the only approach I can see is a cursor... and 90 million individual deletes sounds painful.

Ideas?

-b

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2006-12-02 : 23:24:36
Why can't you join verybig and deleterows tables on column a and b and delete only joined records?

Harsh Athalye
India.
"Nothing is Impossible"
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2006-12-03 : 02:44:17
"delete 90 million rows from a table of a billion rows or so"

Needs to be right first time then ...

"where I need to delete some of them (about 90 million)"

You may need to put that delete in a loop and restrict the number of rows deleted on each iteration. Otherwise its a Big Boy to run, and a Big Boy to rollback if you get impatient!

But I expect you know all that already ...

"how in the world do I delete from "verybig" only those rows where both a and b match a row from "deleterows"?"

Are you just trying to delete the specific rows where C>1000, or any row [based on a & b] where one such row has C>1000?

If it is the specific row I would store the PK columns for the row in [deleterows] and just do a join.

If multiple-matching rows then you need an index on a & b, and probably a second pass to get the PK of all such matching rows into YetAnotherDeleteRowsTable, and THEN join that to [verybig] for the delete operation.

Kristen
Go to Top of Page
   

- Advertisement -