Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 SQL bulk insert from mySQL filter out duplicates

Author  Topic 

2revup
Posting Yak Master

112 Posts

Posted - 2013-06-11 : 19:38:30
Guys I am doing a bulk insert from mysql, but I wondered if I could do a bulk insert and put a where around it. I want to do this so I don't insert duplicates.

Here is an example of the bulk insert

-- Create table first (Shows the structure in MSSQL)
CREATE TABLE [dbo].[UserRatings_all](
[CaseID] [int] NULL,
[Correspondence_ID] [bigint] NULL,
[UserID] [varchar](max) NULL,
[Rating] [int] NULL,
[Creation_date] [datetime] NULL
)

-- Ok lets insert data
insert into [UserRatings_all]
select * from openquery([support_cases],'select cr.CASE_ID, cr.CORRESPONDENCE_ID, CORRESPONDENCE_AGENT_ID, RATING, UPDATE_DATE FROM case_correspondence_rating cr
Left join d_case_details cd on cr.CASE_Id=cd.CASE_ID
WHERE UPDATE_DATE >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
limit 6000000;')

As you can see I am doing it from a date range, but there is room for overlap here, and there appears to be some, I need to ensure that the duplicates are killed, and I am only inserting fresh records.

I could do this on 'CORRESPONDENCE_ID' as this is a unique number. any ways that I could do this without bulk inserting into a temp table and then doing a insert from that temp table?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-12 : 00:51:00
what according to you are duplicates? can you illustrate with an example?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

2revup
Posting Yak Master

112 Posts

Posted - 2013-06-12 : 01:35:57
sure

Lets say I run the insert as above I would get the below from the date range

CaseID Correspondence_ID UserID Rating Creation_date
12911199333 5570802313 User1 4 6/10/2013
12911163573 5570245363 User2 5 6/10/2013
12911163573 5570515953 User3 5 6/10/2013
12898391011 25016198145 User4 2 6/10/2013
12910760153 5564211803 User1 5 6/10/2013
12898959621 25014914845 User9 2 6/10/2013
12899340751 25015936585 User1 5 6/10/2013
12911177063 5570523243 User6 5 6/10/2013

But my data I might have: (NOTE I have added the 'match' text to the right of a field that will match to the top, essentially a duplicate, I need these matches filtered out before inserting)

12911199333 5570802313 User1 4 6/10/2013 Match
12911229593 5571174303 User6 5 6/10/2013
12911228413 5571214293 asais 5 6/10/2013
12911163573 5570245363 User2 5 6/10/2013 Match
12911163573 5570515953 User3 5 6/10/2013 Match
12999894701 25016321555 User3 5 6/10/2013
12911213413 5571028223 User9 3 6/10/2013
12911213413 5571058303 User8 3 6/10/2013



Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-12 : 01:45:33
[code]
insert into [UserRatings_all]
select * from openquery([support_cases],'select cr.CASE_ID, cr.CORRESPONDENCE_ID, CORRESPONDENCE_AGENT_ID, RATING, UPDATE_DATE FROM case_correspondence_rating cr
Left join d_case_details cd on cr.CASE_Id=cd.CASE_ID
WHERE UPDATE_DATE >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
limit 6000000;')t
where not exists (select 1
from [UserRatings_all]
where CaseID = t.CaseID
and Correspondence_ID = t. Correspondence_ID
)

[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

2revup
Posting Yak Master

112 Posts

Posted - 2013-06-12 : 01:51:20
When this is run I am getting 'Invalid column name 'CaseID''
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-12 : 01:55:41
[code]
insert into [UserRatings_all]
select * from openquery([support_cases],'select cr.CASE_ID, cr.CORRESPONDENCE_ID, CORRESPONDENCE_AGENT_ID, RATING, UPDATE_DATE FROM case_correspondence_rating cr
Left join d_case_details cd on cr.CASE_Id=cd.CASE_ID
WHERE UPDATE_DATE >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
limit 6000000;')t
where not exists (
select 1
from [UserRatings_all]
where CaseID = t.Case_ID
and Correspondence_ID = t. Correspondence_ID
)

[/code]
------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

2revup
Posting Yak Master

112 Posts

Posted - 2013-06-12 : 01:58:21
The table I am pulling the data from
= cr.CASE_ID, cr.CORRESPONDENCE_ID

the table I am dumping the data to
= Case_ID, CORRESPONDENCE_ID

So it looks like based on what you have its correct.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-12 : 02:00:46
quote:
Originally posted by 2revup

The table I am pulling the data from
= cr.CASE_ID, cr.CORRESPONDENCE_ID

the table I am dumping the data to
= Case_ID, CORRESPONDENCE_ID

So it looks like based on what you have its correct.


then it should be


insert into [UserRatings_all]
select * from openquery([support_cases],'select cr.CASE_ID, cr.CORRESPONDENCE_ID, CORRESPONDENCE_AGENT_ID, RATING, UPDATE_DATE FROM case_correspondence_rating cr
Left join d_case_details cd on cr.CASE_Id=cd.CASE_ID
WHERE UPDATE_DATE >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
limit 6000000;')t
where not exists (
select 1
from [UserRatings_all]
where Case_ID = t.Case_ID
and Correspondence_ID = t. Correspondence_ID
)



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

2revup
Posting Yak Master

112 Posts

Posted - 2013-06-12 : 02:01:30
Dang your right I also missed that completely, and it worked like a charm... I didnt think I could do this with MySQL thank you, thank you , thank you!!!!!
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-06-12 : 02:03:19
welcome...
so far as its ANSI based it should work!

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page
   

- Advertisement -