Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Proper way of linking tables?

Author  Topic 

madmenyo
Starting Member

1 Post

Posted - 2015-05-02 : 10:44:24
Hi,

I am no stranger to Databases, I worked a lot with MySQL but never really cared about proper DB design as long as it worked. Now I am playing with SQL in a ASP.NET project and want to get things done the right way.

Let's say I have a Movies database. My movies can have multiple genres so I set my tables up like this:

[Movies]
MovieID
MovieName
MovieRelease

[Genre] (I just fill this with known genres)
GenreID
GenreName

[GenreLink]
GenreLinkID
MovieID (FK)
GenreID (FK)

Is this the proper way of doing things? The problem with this is when I want to enter a record manually I have to know the ID of the movie and the ID of the Genres of the movie.

And what about naming conventions? By default the identifier is always Id, from my MySQL experience I liked naming it like the table, same goes with other columns. This is my T-SQL code for above tables in VS-2013.

CREATE TABLE [dbo].[Movies] (
[MovieID] INT IDENTITY (1, 1) NOT NULL,
[MovieName] VARCHAR (50) NOT NULL,
[MovieRelease] NUMERIC (18) NOT NULL,
CONSTRAINT [PK_Movies] PRIMARY KEY CLUSTERED ([MovieID] ASC)
);

CREATE TABLE [dbo].[Genre] (
[GenreID] INT IDENTITY (1, 1) NOT NULL,
[GenreName] VARCHAR (50) NULL,
CONSTRAINT [PK_GenreID] PRIMARY KEY CLUSTERED ([GenreID] ASC)
);

CREATE TABLE [dbo].[GenreLink]
(
[GenreLinkID] INT NOT NULL IDENTITY(1,1),
[GenreID] INT NOT NULL,
[MovieID] INT NOT NULL,
CONSTRAINT [PK_GenreLinkID] PRIMARY KEY CLUSTERED([GenreLinkID] ASC),
CONSTRAINT [FK_GenreID] FOREIGN KEY ([GenreID]) REFERENCES Genre([GenreID]),
CONSTRAINT [FK_MOVIEID] FOREIGN KEY ([MovieID]) REFERENCES Movies([MovieID])
)


Doh!!!

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2015-05-02 : 11:07:22
That looks reasonable. I would exclude [GenreLinkID] column. Let the PK of GenreLink be (GenreID, MovieID).
I agree with your naming convention. Object names should not be generic like (id, code, group, description, etc). They should be meaningful.

Be One with the Optimizer
TG
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-05-02 : 14:27:56
quote:
Originally posted by madmenyo

The problem with this is when I want to enter a record manually I have to know the ID of the movie and the ID of the Genres of the movie.


Why is that a problem? I expect that typically the Application would have a picklist (or something similar - e.g. a "searchable list") for MovieID and GenreID

quote:
I liked naming it like the table


We do that, but more rigidly than you have done. You have GenreID in the [Genre] table and also in the [GenreLink] table. We prefix all columns by the "table nickname".

To save typing really long names we use a nickname prefix. Ours is actually three parts - Sub System, Module and Table.

Let's say that we have CRM [Nickanme = CRM], Accounts [ACC], Stock Control [STK] etc. sub systems, and Members [MEM] are part of CRM. Members have Registration [REG]and Address [ADR] records. So our tables would be:

CRM_MEM_REG_Registration
CRM_MEM_ADR_Address

CRM_MEM_REG_Registration has columns (we drop the Sub System prefix on Column Names)

mem_reg_ID
mem_reg_Title
mem_reg_FirstName
mem_reg_LastName
...

CRM_MEM_ADR_Address table has columns:

mem_adr_ID
mem_adr_mem_reg_ID -- FK to CRM_MEM_REG_Registration
mem_adr_Address1
...
mem_adr_City
mem_adr_State
mem_adr_Postcode
etc.

Note that [mem_adr_mem_reg_ID] indicates that it contains a Member Registration ID, but is stored in the Member Address table.

On a JOIN you then have

FROM CRM_MEM_ADR_Address
JOIN CRM_MEM_REG_Registration
ON mem_reg_ID = mem_adr_mem_reg_ID

I can see at a glance that the column(s) on the left of the "ON" statement are from MEM_REG and the ones on the right are from MEM_ADR. Less chance of me joining illogical columns by accident

This harps back to the days when Hungarian Notation was popular ... a principle of Hungarian Notation was that each variable had a prefix indicating its Data Type. Lets say "int" for Integer and "dt" for date and mny for Money. Thus if, in my code, I put

mnyTotalBalance = intEmployeeNumber * dtBirth

quite apart from the fact that the variable / column names are clearly different types of objects! the prefixes describe "Money = Integer * Date" which alerts me to the fact that it is probably wrong as I type it, or as I review the code.

Even if there was an implicit data conversion between those types, taking a principle of Hungarian Notation, I would CAST each one to make it clear that it was deliberately intentional :

mnyTotalBalance = CONVERT(Money, intEmployeeNumber * CONVERT(int, dtBirth))

(Ignore the fact that its a daft example, its just the principle of the naming convention "saving you from yourself" I'm referring to - what we, in my company, call "defensive programming" - its aim is to reduce the chance of typing something wrongly leading to a potentially hard-to-find bug)

Another point, for us, is that by using a prefix for the TABLE then every column has a unique name. If we need to rename a column, or change some characteristic of the column (longer / shorter, different Datatype) we can reliable do a Global Find on the name and be sure we have found all occurrences and reviewed the code to see where it need updating etc.

I can search for "mem_reg_ID" and find all occurences (including [mem_adr_mem_reg_ID]), or I can search for "mem_adr_mem_reg_ID" for the specific column. I can search for "mem_reg_" to find all code that references columns relating to the Member Registration table

Also, we only need table aliases when we include the same table twice in a query. I have read people saying that all tables and all column references should be aliased in all cases. That's fine, but its more typing etc. You example of "GenreID" will require that you alias both tables when you have a JOIN.

My only other point would be where you have three tables, all containing a GenreID column (name). Is there a greater risk (for your system, compared to mine) that you reference X.GenreID when you should have aliased it as Y.GenreID?

For me I would have mov_gen_ID and aaa_bbb_mov_gen_ID and xxx_yyy_mov_gen_ID so when I use each of those in my query it is clear which table they are referring to, so I think that using my "defensive programming" I am less likely to use a column wrongly because the naming convention alerts me when I do something Stooooopid!
Go to Top of Page
   

- Advertisement -