Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Extremely weird sql problem...

Author  Topic 

LordZod
Starting Member

5 Posts

Posted - 2003-12-04 : 17:26:57
Figure this...

INSERT INTO Recipient (RC_D_ID, RC_C_ID, RC_DATE)
SELECT DISTINCT 20000, subscriber.id_email, GETDATE()
FROM subscriber, Listmember
WHERE Listmember.LM_C_ID = subscriber.id_email
AND ((subscriber.banned = 0
AND subscriber.id_email < 5000000)
AND (Listmember.LM_L_ID = 253))

says ...

648842 rows affected

however...

SELECT DISTINCT 20000, subscriber.id_email, GETDATE()
FROM subscriber, Listmember
WHERE Listmember.LM_C_ID = subscriber.id_email
AND ((subscriber.banned = 0
AND subscriber.id_email < 5000000)
AND (Listmember.LM_L_ID = 253))

only returns 267564
in the Query Analyzer window.

And to top it all off... about 30 minutes after I
noticed it and investigated... it stopped happening.
Now both queries above affect the same number of rows.

This is the 4th time this has happened.
I haven't found any patters in log files or memory
settings or data types of fields.. .etc.

It's a brand new SQL Server 2000 Enterprise Edition.
6gb ram. Dual 3.06 ghz 1mb cache, Perc4/Dc Raid 10,
on 5x145gb scsi drives.

This problem wasn't happening on standard edition
that we migrated from about a month ago.

VyasKN
SQL Server MVP &amp; SQLTeam MVY

313 Posts

Posted - 2003-12-04 : 18:29:50
Is that an Accucast database for managing mailing lists or something? :-) I remember working with it a while ago, not that I liked it ;-)

No idea why you are seeing what you are seeing. But the number 648842 signify anything? Will you get that number, if you don't use DISTINCT in your query?

Are there any triggers on the table you are inserting into?

Have you compared the execution plan of your INSERT...SELECT and SELECT when this was happening?

--
HTH,
Vyas
http://vyaskn.tripod.com
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-12-05 : 07:28:28
Most likely normal explanation would be a trigger as Vyas suggested.

Other things to look at

Do you have at least sp3 installed.
Try an integrity check.

Are these base tables? i.e. not views with anything odd like functions in them (a function can cause duplicates to be missed by the server if it does something unexpected).
Try looking at the execution plan.
Try with maxdop = 1.


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

LordZod
Starting Member

5 Posts

Posted - 2003-12-05 : 10:17:17
Thanks for you help guys.

There are no triggers, views, or functions involved.

SP3 is installed.

Couldn't do the execution plan because it only happened for a short time. I cannot duplicate the problem now. But it has happened in the past 3 other times, and I'm sure it will happen again.

the number 648842 is the number of records the SELECT would return
if the line AND (subscriber.id_email < 5000000) was not there at all.

there are 648842 people on list 253, and 267564
on list 253 where there id_email is less than 5000000

The weird thing is... it is intermitent. it only happens sometimes. I've checked all the event view logs for weird things, all the sql logs for weird things happening right before or right after ... nothing. This is freaky stuff.

This server is Windows Server 2003 Enterprise Edition and the boot.ini says this:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /PAE

I know I need the /3GB switch as well, I just haven't had a good lull in production to reboot the server, and I hate staying late. Performance seems to be ok. Much faster than the old machine.

Could it be the /3GB memory thing? I doubt it. but that is why I'm here. I don't know the anwser.

I am not familiar with maxdop = 1. What is that?

Thanks guys...


Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-12-06 : 03:19:31
maxdop = 1 is a hint you can put on a query to force it to use only one processor. There have been a few bugs involved with that but I very much doubt if this is one.

You can get rid of the brackets - again shouldn't male a difference.

INSERT INTO Recipient (RC_D_ID, RC_C_ID, RC_DATE)
SELECT DISTINCT 20000, subscriber.id_email, GETDATE()
FROM subscriber, Listmember
WHERE Listmember.LM_C_ID = subscriber.id_email
AND subscriber.banned = 0
AND subscriber.id_email < 5000000
AND Listmember.LM_L_ID = 253

Is this query hard coded withyin a stored procedure? i.e. no dynamic or embedded sql and no optional paths?

You could try "with recompile" on the SP to make sure it doesn't keep the query plan.

To get round the problem you can put it in a transaction and check the number of rows inserted and rollback and send an error if it's incorrect.


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-12-06 : 10:06:17
does an ANSI join help? (shot in the dark, i know, just throwing it out there)

- Jeff
Go to Top of Page

LordZod
Starting Member

5 Posts

Posted - 2003-12-10 : 12:12:45
It started happening again. It lasted longer this time. Here is the testing I did while the problem was happening.

INSERT INTO Recipient (RC_D_ID, RC_C_ID, RC_DATE)
SELECT DISTINCT 20000, subscriber.id_email, GETDATE()
FROM subscriber, Listmember
WHERE Listmember.LM_C_ID = subscriber.id_email
AND ((subscriber.banned = 0
AND subscriber.id_email < 5000000)
AND (Listmember.LM_L_ID = 253))

still produced all records where id_email was > or < 5000000 . around 600K
but if I do the this query

SELECT DISTINCT 20000, subscriber.id_email, GETDATE()
FROM subscriber, Listmember
WHERE Listmember.LM_C_ID = subscriber.id_email
AND ((subscriber.banned = 0
AND subscriber.id_email < 5000000)
AND (Listmember.LM_L_ID = 253))

it is only the records where id_email < 5000000 . I made two identical new tables called recipient_test
and subscriber_test and filled them with the same data as the others. Ran the above queries and they both worked correctly. If I use the original Recipient but the new subscriber_test table, it works, if I use Subscriber and Recipient_test, it works. I did not create indexes or Foreign key contstraints on the test tables. So that could be a possibility.

Another clue is I could reconfigure the server to Max degree of Parallelism = 1 and both worked during, then I could reconfigure it back to MAXDOP = 0 (or 2 or 3, I tried both) and it would go back to happening again.

So, it has something to do with MAXDOP, and maybe the reason it happens sometimes and not others is because of the cost of threshold for parallelism setting being 5. When it thinks it needs parallelism it happens, and when the server thinks it doesn't need parallelism, it doesn't.

It's a Dell 2650 Dual Intel 3.06mb 1mb cache Xeon , 6 gb ECC memory, 5 Maxtor 145 Gb 10K scsi drives configure as RAID 10 with Dell Perc4/DC (spanned raid 1) with 1 hot spare. Windows Server 2003 all patched up. SQL Server Enterprise Edition SP3a. Everything seems to be fine but this. I can't figure it out.

Any ideas?

Go to Top of Page

LordZod
Starting Member

5 Posts

Posted - 2003-12-10 : 12:26:32
PS. It is embedded in a Java program call Accucast. The Accucast software hasn't changed for two years. This only started happening when we migrated from SQL standard on W2K server to SQL Enterprise and Windows Server 2003. the database we recreated. New tables, new indexes, FK's etc. The two tables are in different filegroups. All indexes except the pk clustered indexes on each are in different filegroups as well.

Recipient table in recipient Filegroup
pk on recipient on rc_d_id, rc_c_id in recipient filegroup
other indexes on recipient are in recipient_ix filegroup

same set up on subscriber
id_email is pk

both pk are clustered.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-12-10 : 13:30:57
I would raise it with microsoft.
In the meantime you can include option maxdop 1 on the query.

I'll try getting someone to look at it.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

LordZod
Starting Member

5 Posts

Posted - 2003-12-10 : 15:02:34
Thanks NR,
I think it is a bug in SQL Server 2000 Enterprise Edtion on W2k3 because of this link someone in another forum showed me.

http://support.microsoft.com/default.aspx?scid=kb;en-us;274765

It looks like they have had a similar bug before in SQL 7 in an old service pack. More than likely, it is still there specific to my unique setup.

I hope there is a better solution, cuz I can't change the app that the embedded query is in to add the maxdop = 1 hint.

Is there a way to trigger it or something?
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-12-10 : 16:20:20
In that article
>> Microsoft has confirmed this to be a problem in SQL Server 7.0. This problem has been corrected in U.S. Service Pack 3 for Microsoft SQL Server 7.0. For information about how to download and install the latest SQL Server Service Pack, see the following Microsoft Web site:

So it will have been fixed for v2000 (hopefully).

Can you leave sql server using a single processor?

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2003-12-10 : 16:30:16
Do you have hyper-threading turned on?

MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2003-12-10 : 16:46:43
Hyper-threading can be found in the BIOS. I believe it is on by default on the hardware that you mentioned though, but it wouldn't hurt to check it except that you'll need to reboot to find out. One way to know is if you are seeing more CPUs in task manager than you actually have physically in the box. Windows will see twice the amount of CPUs when hyper-threading is turned on.

Tara
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-12-11 : 18:06:38
Did you raise an incident with Microsoft?


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

derrickleggett
Pointy Haired Yak DBA

4184 Posts

Posted - 2003-12-11 : 21:13:21
I would try two things. One is recreate the table on the same filegroup and try it. The second one is turning off hyper-threading. We had all kinds of problems on SQL Server Enterprise Edition with W2K3 and went back to W2K. It's just a bad idea to go onto a new operating system before the first service pack, regardless of what MS says about how stable and perfect it is.

Our environment was:

HP 8-way 2.4ghz Xeon
10gig RAM in Redundant array (8 gig)
EMC SAN 1tb total storage capacity
SQL Server 2000 Ent
W2K3

I would definitely turn off the hyperthreading though and try it though. Another question is do you have any special features enabled? I'm assuming you have AWE enabled. Are you using fibers, or anything out of the ordinary?

Lastly, have you tried flushing your proc cache and running it after you turn the MAX DOP up?

MeanOldDBA
derrickleggett@hotmail.com

When life gives you a lemon, fire the DBA.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2004-01-08 : 12:53:05
Did you raise an incident with Microsoft?

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page
   

- Advertisement -