SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Before going to interview with Microsoft
 New Topic  Reply to Topic
 Printer Friendly
Previous Page
Author Previous Topic Topic Next Topic
Page: of 3

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/03/2005 :  22:57:49  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:
Originally posted by byrmol
So given that my GUID will be effecting the split-pattern, is there anything I can do to ease the pain?



Ah - I see - I was assuming something generating realtime DATETIME values.

Apart from what eyechart suggested, if your business logic allows it you could insert in two stages. Stage 1 inserts into a side-table until the StartTime changes, then stage 2 sorts the contents of the side table and inserts into the real table. Bit hacky though.

In SQL Server 2005 your problems are solved - you can have 'included' columns in an index that allow you to cover queries but are not part of the index key. Having the GUID column as an included column will have exactly the semantics you desire.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 08/03/2005 :  23:31:48  Show Profile  Reply with Quote
quote:

In SQL Server 2005 your problems are solved - you can have 'included' columns in an index that allow you to cover queries but are not part of the index key. Having the GUID column as an included column will have exactly the semantics you desire.



Please tell me that applies to ALL versions of 2005 and not just the Expen^H^H^H^H^H Enterprise edition.

Unitl then, I'll suck up the pain, ensure the db maintenance routines are rock solid and place my faith in Moore's Law...

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11751 Posts

Posted - 08/04/2005 :  06:08:07  Show Profile  Visit spirit1's Homepage  Reply with Quote
thanx for the reply Paul. interesting reading...
i wonder how many people here could acctually answer those questions?
and what does a developer using SQL databases acctuly do? he develops what?
because i'm a .net developer using sql databases and these interview questions blew me away....

but i do have a question about guids...
we have a db in which all of the id's are guids. so they get joined on a lot...
we have NC indexes on them...
is that bad design because you say that guid isn't a good candidate for any index????
i always thought otherwise...

Go with the flow & have fun! Else fight the flow
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  10:24:27  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:

Please tell me that applies to ALL versions of 2005 and not just the Expen^H^H^H^H^H Enterprise edition.



All SKUs have it.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  10:35:58  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:

we have a db in which all of the id's are guids. so they get joined on a lot...
we have NC indexes on them...
is that bad design because you say that guid isn't a good candidate for any index????
i always thought otherwise...



You pay a high price in page splits when doing NC index maintenance - that's the biggest downside. That in turn leads to excessive logical fragmentation and poor scan performance.

However - GUIDs are by their very nature highly selective so if you're just doing singleton lookups then fragmentation doesn't matter.

Whether its a bad design depends on how exactly you're using the index and how much IO is wasted during inserts into the index - NC index maintenance can cripple a system if the NC indexes aren't designed carefully.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11751 Posts

Posted - 08/04/2005 :  10:50:09  Show Profile  Visit spirit1's Homepage  Reply with Quote
thanx paul...

however i'm having trouble with understanding a tech term in english (it's not my primary language ):
what are "singleton lookups"??


Go with the flow & have fun! Else fight the flow
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 08/04/2005 :  10:51:07  Show Profile  Reply with Quote
Am I being really THICK here ... doesn't any Natural [Primary] Key exhibit the same behaviour?

But have the same characteristics as a GUID: Very random, Highly selective, Rather long ...

Kristen
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  12:50:00  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:
Originally posted by spirit1
however i'm having trouble with understanding a tech term in english (it's not my primary language ):
what are "singleton lookups"??



A singleton lookup is where you expect to select a single record:

e.g. select customerId from orders where orderId = @orderNumber

In a query plan you'll see it called a seek.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  12:52:51  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:
Originally posted by Kristen

Am I being really THICK here ... doesn't any Natural [Primary] Key exhibit the same behaviour?

But have the same characteristics as a GUID: Very random, Highly selective, Rather long ...

Kristen



No - it entirely depends on what you choose as the natural key. If the natural key is a monotonically increasing orderId, its not the same as a GUID at all, but still has the high selectivity (in fact uniqueness) without the randomness.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11751 Posts

Posted - 08/04/2005 :  12:54:14  Show Profile  Visit spirit1's Homepage  Reply with Quote
ahhhh.... ok. thanx again
so basicaly using indexed guids for one to many relationships sucks...

Go with the flow & have fun! Else fight the flow
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 08/04/2005 :  14:21:06  Show Profile  Reply with Quote
"If the natural key is a monotonically increasing orderId .. "

Blimey ...

I wrote a reply, but now realise I need to express it from a different perspective.

Paul, if this is a straight forward answer then I'd appreciate your opinion. If you've got a tough day or whatever then just tell me sorry and move on, changing our App is not going to happen soon, so knowing the answer From the Horse Mouth is not urgent, and I'm sure you've got plenty of things on your desk to tackle. After all I do NOT have a data corruption problem!

If the answer is quick, or you've got the time, I'd appreciate the intellectual input.
So, having said that, I find that your "Supplier position" is different to my "Customer Expectation" (Have you been in that situation before?)

I was thinking about the varchar(30) "natural keys" that we give to the client. The client can put what they like in there.

Am I right that that will index as well/badly as a GUID?

(Sure, if they put "1", "2", "32, ... in there it will index nicely, but "Last name plus Tie-break number" won't be much good, but is there something special about GUID that makes them index "worse"?)

It seems that the debate waxes lyrical over Natural / Surrogate keys. I've never entered that debate, it was too Text Book Theoretical for me.

However, we seem to have moved to Real World Practical now and I'm wide awake all of a sudden!

I'll settle on a single scenario, hopefully representative:

I need to provide a Product Code/SKU column of, say, varchar(30) and it needs to be unique. Customers will put goodness knows what in there. Some will be increasing numbers, some will use the surname of the supplier's sales person, and a date (You think I'm joking?)

So .. I need to provide a "Find by product code" - that's an index on Product Code. And I'll make it unique to stop Duplicates.

But I also need to do a stack of joins from Products to Order Items, Stock Levels, Latest Price and so on.

Two ways to go:

1) Use Product Code as the PK, and as [part of] the FK to all the JOINED tables. In which case I'll use a CLUSTERED PK

2) Use an IDENTITY for the PK, and the FKss and also put a Unique Index on the Product Code to allow it to be searched reasonably quickly.

I'd never given this a second thought before. Using (1) allows me to "see" the Product Code in the Stock Level Record etc., and that makes debugging etc. easier, so that's what we did. I didn't think that performance might suffer.

Presumably Singleton lookup is going to be OK, either way, but I expect it isn't quite as simple as that is it?

What happens when I do

SELECT *
FROM OrderDetails OD
     JOIN Products P
         ON P.ProductCode = OD.ProductCode
WHERE OD.OrderNumber = '1234'

Is this still a singleton lookup, or needing a "tight index" in support of the JOIN?

I'm sure this is a Beginners Question, if so I'm sorry for burdening you with it.

However, I am really interested in what makes an efficient database - I couldn't give a monkey's for what the text book says, so if I can create some of my keys & indexes in a way to improve performance I'm keen to do so

Thanks

Kristen
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  18:29:29  Show Profile  Visit paulrandal's Homepage  Reply with Quote
These kind of things are a nice break from the usual day-to-day, and as I said in another thread, the s-n-r here is very high so I'm happy to spend time here.

quote:
Originally posted by Kristen

"If the natural key is a monotonically increasing orderId .. "

Blimey ...

I wrote a reply, but now realise I need to express it from a different perspective.

Paul, if this is a straight forward answer then I'd appreciate your opinion. If you've got a tough day or whatever then just tell me sorry and move on, changing our App is not going to happen soon, so knowing the answer From the Horse Mouth is not urgent, and I'm sure you've got plenty of things on your desk to tackle. After all I do NOT have a data corruption problem!

If the answer is quick, or you've got the time, I'd appreciate the intellectual input.
So, having said that, I find that your "Supplier position" is different to my "Customer Expectation" (Have you been in that situation before?)


A few times admittedly

quote:

I was thinking about the varchar(30) "natural keys" that we give to the client. The client can put what they like in there.

Am I right that that will index as well/badly as a GUID?

(Sure, if they put "1", "2", "32, ... in there it will index nicely, but "Last name plus Tie-break number" won't be much good, but is there something special about GUID that makes them index "worse"?)


Nope - your example surrogate key will be just as random as a GUID (unless there's some special distributuion of last name) and so you'll suffer the same problems.
quote:

It seems that the debate waxes lyrical over Natural / Surrogate keys. I've never entered that debate, it was too Text Book Theoretical for me.


I had a feeling that this thread may turn into such a debate
quote:

However, we seem to have moved to Real World Practical now and I'm wide awake all of a sudden!

I'll settle on a single scenario, hopefully representative:

I need to provide a Product Code/SKU column of, say, varchar(30) and it needs to be unique. Customers will put goodness knows what in there. Some will be increasing numbers, some will use the surname of the supplier's sales person, and a date (You think I'm joking?)

So .. I need to provide a "Find by product code" - that's an index on Product Code. And I'll make it unique to stop Duplicates.

But I also need to do a stack of joins from Products to Order Items, Stock Levels, Latest Price and so on.

Two ways to go:

1) Use Product Code as the PK, and as [part of] the FK to all the JOINED tables. In which case I'll use a CLUSTERED PK

2) Use an IDENTITY for the PK, and the FKss and also put a Unique Index on the Product Code to allow it to be searched reasonably quickly.

I'd never given this a second thought before. Using (1) allows me to "see" the Product Code in the Stock Level Record etc., and that makes debugging etc. easier, so that's what we did. I didn't think that performance might suffer.

Presumably Singleton lookup is going to be OK, either way, but I expect it isn't quite as simple as that is it?

What happens when I do

SELECT *
FROM OrderDetails OD
     JOIN Products P
         ON P.ProductCode = OD.ProductCode
WHERE OD.OrderNumber = '1234'

Is this still a singleton lookup, or needing a "tight index" in support of the JOIN?


As long as the index on ProductCode is clustered, it doesn't matter what you use as the key - the IO pattern will be the same - it'll be a singleton lookup.

The downside of what you're doing compared with an IDENTITYT is that you need to include this 30 byte key as a FK in all tables you want to join with, thus wasting a lot of space.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 08/04/2005 :  18:35:10  Show Profile  Reply with Quote
quote:
All SKUs have it.


That is fantastic. Cheers!

Kristen,

I hope I have assimilated Paul's message properly and this is how I see it...

I think the issue is not SELECT performance but INSERT/UPDATE/DELETE actions and the index maintenance needed to maintain index key order.

In your case, if the Product table has a lot of INSERT/UPDATE/DELETE actions then the issue arises, otherwise I would say not.

The OrderDetails has the "ugly" key as the secondary column in the index (OrderNumber ASC, ProductCode ASC) and therefore any orderdetail with more than 1 product will undergo the split-pattern problem.

Exactly like my issue.. that is solved in 2005!

quote:

I couldn't give a monkey's for what the text book says, so if I can create some of my keys & indexes in a way to improve performance I'm keen to do so


I am exactly the opposite.. follow the model and force/beg SQL Server to make a good job of it! Got to give Paul some challenges...Now... about that 900 byte index limit....

EDIT: .. nice Paul..

DavidM

A front-end is something that tries to violate a back-end.

Edited by - byrmol on 08/04/2005 18:51:41
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

USA
899 Posts

Posted - 08/04/2005 :  18:41:42  Show Profile  Visit paulrandal's Homepage  Reply with Quote
quote:

Now... about that 900 byte index limit....



You know about INCLUDEd columns in non-clustered indexes in SQL Server 2005 so you can cover more queries withouth bloating the key size? And also about the ability to store rows > 8k in length by automatically pushing variable-length columns off-row (in heaps, clustered, and non-clustered indexes)?

The key length will never be increased as it makes the index too deep by limiting fan-out severerly.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 08/04/2005 :  19:02:20  Show Profile  Reply with Quote
So what your telling me Paul, is that the SQL Server storage team have been working there guts out on the backend with very little press, while the CLR and XML teams get all the marketing and hype? Perhaps you could put a B-tree in a bikini.. that would make good copy... :-)

I wasn't in that much of a hurry for 2005, but this thread has changed that...

Once again thanks Paul.

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 08/05/2005 :  00:13:07  Show Profile  Reply with Quote
Paul and David thanks very much. You've set me thinking in a big way ... and we are right at the start of a 6 month development cycle, so just "right time" for some open heart surgery to the DB!

Sounds like our clients can expect a performance boost with SQL 2005 [I'm sure we have plenty of index scenarios exhibiting split-pattern], and that will speed up their upgrade policies, which in turn will allow us to start developing for, and using the new features in, SQL 2005 sooner (once client upgrades have reached critical mass we can arm-twist the remainder!)

Kristen
Go to Top of Page

sanjnep
Posting Yak Master

USA
191 Posts

Posted - 11/20/2006 :  15:34:29  Show Profile  Visit sanjnep's Homepage  Reply with Quote
One of my friend from my university got permanent position at Microsoft this year. Congratulation..
His dream comes true.
Go to Top of Page
Page: of 3 Previous Topic Topic Next Topic  
Previous Page
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.19 seconds. Powered By: Snitz Forums 2000