Combined Columns in table to be unique

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Combined Columns in table to be unique

Author

Topic

FrOzeN89
Starting Member

5 Posts

Posted - 2013-02-09 : 23:14:50

Hi there,

I have a databse I'm building similar to:

CREATE TABLE Products (
  Products_Id   integer PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
  ItemNumber    integer NOT NULL UNIQUE,
  Description   nvarchar(255) NOT NULL
);

CREATE TABLE Sales (
  Sales_Id      integer PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
  ItemNumber    integer NOT NULL,
  TheDate       datetime NOT NULL,
  Volume        real NOT NULL,
  /* Foreign key */
  FOREIGN KEY (ItemNumber)
    REFERENCES Products(ItemNumber)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
);

In the Sales table I would like to create a constraint to limit (ItemNumber + TheDate) to be UNIQUE when combined, but can be duplicated providing either value is different.

Also, more importantly, is a constraint like this going to slow data entry down? I intend of having up to 2-3 million rows within this table. Approximately 5,000 entries per day adds almost 2 million rows per year. Would it be better to have more thorough code to prevent entering duplicates?

jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2013-02-10 : 02:11:34

If the uniqueness is part of the logical design ,a unique contstraint is more effective in documenting the logical design.
Adding indexes do slow INSERTS down , but must be weighed up against whether they are assisting in overall query performance.

Jack Vamvas
--------------------
http://www.sqlserver-dba.com

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-02-10 : 06:25:37

I would strongly second Jack's opinion. The index would slow down inserts a little bit, but in almost all cases, the price would be well worth paying. Using a uniqueness constraint guarantees the uniqueness of the data. Trying to ensure the uniqueness at the entry point via code can break for a number of reasons: there may be multiple entry points, someone may add new entry points in the future (and they may not be aware of the uniqueness requirement), there may be multiple clients inserting the data (in which case they will need to coordinate the insertion to avoid duplicate data) etc.

FrOzeN89
Starting Member

5 Posts

Posted - 2013-02-10 : 06:54:39

Thanks, I have added the constraint using the CREATE UNIQUE INDEX method.

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2013-02-10 : 23:11:50

If you're going to go the index route (rather than constraint route) you might as well give consideration to the order of your columns and whether to add any covering data. Depending on your queries, the order of columns can make your new index more or less useful to other queries and the covering data can cut down access back to the table.
It also seems rather like you're simply making a surrogate key for the true key (date and item number). Why do you need the surrogate? Just use those as your PK.
Alternatively/additionally, look at your data model. The distiction between Item # and sale ID "smells" wrong
a) where is your sales # coming from?
b) what is it used for and why?
c) is it adding any value over having just the sales ID and a date of the sale?
d) if you really do need it, can you calculate it as you need it?

FrOzeN89
Starting Member

5 Posts

Posted - 2013-02-11 : 01:14:54

My knowledge of SQL at the moment is very limited as I've mostly been relying on google. So I'm not really sure the difference between using an INDEX opposed to a CONSTRAINT. I just wanted something to ensure some data integrity.

I have the data in PDF's (only way I can get it) which I'm extracted and putting into a relational database so I can better query it. There are about 40,000 items with sales for about 5,000 each day. So I have one table for all the item's descriptions, and one table for the Volume of each Item sold on that Date. In my example I have truncated a bunch of fields after the Volume as well.

ItemNumber is basically a variant of a barcode. In my database the field is called something else, I just changed it for this example. As for Sales_Id, that's just a surrogate key I've been using with all my tables, effectively tablename_Id on AUTOINCREMENT (using SQLite).

jackv
Master Smack Fu Yak Hacker

2179 Posts

Posted - 2013-02-11 : 01:51:43

A unique constraint and a unique index are very similar. Both create an index to maintain the uniqueness. From an optimizer - execution plan perspective they create the same cost plan.
The main difference is the UNIQUE CONSTRAINT is part of the logical design - whereas the unique index is part of the physical design

Jack Vamvas
--------------------
http://www.sqlserver-dba.com

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2013-02-11 : 02:42:17

quote:
Originally posted by FrOzeN89

My knowledge of SQL at the moment is very limited as I've mostly been relying on google. So I'm not really sure the difference between using an INDEX opposed to a CONSTRAINT. I just wanted something to ensure some data integrity.

I have the data in PDF's (only way I can get it) which I'm extracted and putting into a relational database so I can better query it. There are about 40,000 items with sales for about 5,000 each day. So I have one table for all the item's descriptions, and one table for the Volume of each Item sold on that Date. In my example I have truncated a bunch of fields after the Volume as well.

ItemNumber is basically a variant of a barcode. In my database the field is called something else, I just changed it for this example. As for Sales_Id, that's just a surrogate key I've been using with all my tables, effectively tablename_Id on AUTOINCREMENT (using SQLite).

Putting aside the difference between a constraint & index, your surrogate is making things unnecessarily complicated. I'd get rid of it from both tables.
Product's PK is item number
Sales PK is ItemNumber+TheDate.
Sales FK to Products is as you have it.
Job done. (Though your table might be better named "daily sales" or something and you could consider making it a [materialised] view if it's derived from other data).

FrOzeN89
Starting Member

5 Posts

Posted - 2013-02-11 : 02:51:50

Ok, I'll drop the surrogate key from the Products table and use the ItemNumber. But even between now and yesterday I overlooked that ItemNumber+TheDate are not unique enough and also need to include StoreID. Should the Sales table have (ItemNumber + TheDate + StoreID) as a primary key? I defaulted to surrogate as a safer option initially, then just tried applying some unique filters for data integrity..

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2013-02-11 : 18:10:13

Yes, I would have all 3 columns as a PK if that key identifies what the rest of the data represents/refers to. It starts to get a bit uncomfortable when you get into many columns but in this case you're most likely ok. I can't think why you'd want a FK to this data so you'll probably not run into any dilema as to whether it's too long or not.
Don't forget to FK to your storeID.

FrOzeN89
Starting Member

5 Posts

Posted - 2013-02-12 : 03:20:56

So I just removed all the surrogate keys and replaced the unique values as primary keys, as well as adding the foreign keys for ItemNumber and StoreID.

The database would be full of unnecessary repeated data if I didn't FK these values. In Sales, if I have sales for an ItemNumber every day, that's 365 copies of it's ItemNumber within the database. No point having 365 copies of the description of that Item. I can use the Products table to look up the description if needed. Likewise with StoreID.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources