Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Loving GUID's

Author  Topic 

henrikop
Constraint Violating Yak Guru

280 Posts

Posted - 2008-04-29 : 06:02:12
I'm reluctant to admit I'm in love with uniqueidentifiers AKA guid's.

However, this love comes with a cost because of the size and thus SQL Cache and thus IO reads (and fragmented index pages and so on)

My love has to do with generic programming and keeping multiple database updateable (adding core functionality with the flexibility to let users add and change data in tables that have core data in it as well). Also merging databases is easy or relate different entities by storing a metadata link table.

So, what I'm trying to do is making the disadvantages as small as possible.

So I'm using NEWSEQUENTIAL Id's and retrieve new made records with OUTPUT as TABLE.

So I changed the default CLUSTERED INDEX as NON-CLUSTERED.

I'm not fond on usings numerics for some primary keys and uniqueidentifiers for others.

What else can I do to keep using GUID's as primary keys with as less disadvantages as possible?

Henri
~~~~
There's no place like 127.0.0.1

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-29 : 06:11:23
GUID are 16 bytes wide.
That means you can store about 500 of them (single column index) in a page.
Using INT you can store about 2000 of them (single column index) in a page.

Which means

1) Index is only 1/4th of size
2) Searches theoretically is 4 times as fast.

I can't see the using GUIDs are better than identity columns.
I have run into some problem in the past due to GUID.

GUID values are only guaranteed to be unique within same server.
If you have a distributed solution you might run into problem. In this case make a smallint column where you store "source" for identity value (such as store or similar). Then let each [store ]database have it's own identity column.

Later when you gather all data in your datawarehouse or datamart, you will have 6 bytes identifying the unique record

A composite index of smallint column and int column is 6 bytes wide.
That means you can store about 1333 of them in a page.

Still using only 37.5% of the space needed for GUID and index size (search speed) is about 2.5 times faster (smaller).



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2008-04-29 : 12:07:12
quote:
Originally posted by Peso

2) Searches theoretically is 4 times as fast.
....
GUID values are only guaranteed to be unique within same server.

I believe it is guaranteed across the domain. By searches you mean table & range scans rather than singleton seeks right?
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-04-29 : 12:56:26
I too am a fan of GUIDs. What slight performance hits they incur are only significant on extremely large databases, and are often compensated by the ability to use more efficient coding and reduce network traffic.

e4 d5 xd5 Nf6
Go to Top of Page

henrikop
Constraint Violating Yak Guru

280 Posts

Posted - 2008-04-29 : 13:56:37
Inserting records can be a drag. And cache needs lots of RAM.

Imagine this scenario : a table with ChoiceLists and a table with Choices. This is very generic. Someone needs a Choicelist... he can make one. But sometimes some choices of some "core" choicelists are used in hard coded classes. With numerics this would be hard, you'll need to add an extra column to make a distinction for core choices but a GUID will always work.

Peso: They are unique.. at least when your server has a network card.

If had some computers generate GUID's and insert them in one table where guid was primary key. After a few days and 1.000.000.000's records further I was convinced.

And even if the change exist that it's not perfectly unique... I can solve that instance. Changes are that a coder or a user is making a worse mistake is a lot bigger than a duplicate GUID..

However, I'm convinced that GUID's have bad sides, but I'll make it a mission to make them as small as possible.

So please keep posting ideas!



Henri
~~~~
There's no place like 127.0.0.1
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2008-04-29 : 14:01:47
According to Microsoft, GUIDs "are not cryptographically random".

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-04-29 : 14:05:20
> Peso: They are unique.. at least when your server has a network card

acctually no. guids are calculated regardless of the network card.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com
Go to Top of Page

henrikop
Constraint Violating Yak Guru

280 Posts

Posted - 2008-04-29 : 14:51:30
Mladen: Can you back that up with a source? As far as I learned they do something with time and MAC address.
Tara: I don't care if they're random as long as they are unique.

I've never heard of SQL Server UNIQUEIDENTIFIER being a unpredicted duplicate. So I trust them (and it's my experience as well) to be unique. It's not a big deal. If changes are that a duplicate GUID could exists are less than say one on a billion, it's good enough for me. I can deal with that if there's one exception in a life time.

If a system is "broken" because of a predictable GUID then there's something else wrong .

Let's discuss about using a GUID in the most optimal way. I don't think natural keys is the way to go and GUIDS have some great advantages over NUMERIC keys so let's get rid of most appealing disadvantages of the GUID! Please

Henri
~~~~
There's no place like 127.0.0.1
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-04-29 : 15:03:24
http://en.wikipedia.org/wiki/Universally_Unique_Identifier
http://en.wikipedia.org/wiki/Globally_Unique_Identifier
the v1 version had used the MAC address.
but not since then.
i think the V4 version of guid generator has been in windows sine win2k sp3. but that's just from my memory and it isn't necessarily correct.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com
Go to Top of Page

henrikop
Constraint Violating Yak Guru

280 Posts

Posted - 2008-04-29 : 15:19:34
Thx for the links.

I'm serious about the GUID business. What other undiscussed disadvantages of using a GUID?

Henri
~~~~
There's no place like 127.0.0.1
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2008-04-29 : 15:28:00
The performance disadvantage can be pretty big depending on your queries and table sizes. It would be impossible for us to use as performance is very important.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-04-29 : 15:31:21
read this topic
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=53085

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com
Go to Top of Page

henrikop
Constraint Violating Yak Guru

280 Posts

Posted - 2008-04-29 : 16:00:20
Thx for that link!

--------------------------------------------------------------------------------
10) what datatypes are not recommended as index keys? Why not?
--------------------------------------------------------------------------------
The worst type for an index key is GUID, because of its randomness.

Yup - you should only have an index on a random GUID if you have a very good reason to, otherwise it causes nasty IO problems due to the random nature of additions to the b-tree.

____

I can understand that.

That nasty problem I have solved with using a NEWSEQUENTIALID()

So that is tackled.

What's next? If size is the only problem, I can live with that.

Henri
~~~~
There's no place like 127.0.0.1
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-04-29 : 20:35:11
Henrikop, the pros and cons of guids have been discussed to death on this forum, as well as every other forum that had ever made any mention of databases. There is really no need to hash out here, again, what is readily and copiously available via google.

e4 d5 xd5 Nf6
Go to Top of Page
   

- Advertisement -