General question about relationships and joins

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

Old Forums

CLOSED - General SQL Server

General question about relationships and joins

Author

Topic

drewsalem
Constraint Violating Yak Guru

304 Posts

Posted - 2006-11-07 : 07:08:48

Scenario:
If I am importing data from database A into indentical tables in database B.

Database A has sound data integrity with all its relationships setup and put in place.

Database B has no relationships between the tables as it will only be used to select from.

Is there any benefit in replicating the realtionships currently between tables in database A to tables in database B before doing my Selects and Joins in database B? Or is this pointless as this was all done in database A before the data was brought across.

Thanks

Drew

Kristen
Test

22859 Posts

Posted - 2006-11-07 : 07:11:18

The Query Optimiser might use the knowledge that there is a FK to choose a better query plan.

Anyways, I've always worked on the belief that I hoped that it would, but I might have been deluded of course!

I still think its a good idea - folk unfamiliar with the DB will be able to use visual-tools to see connections between tables etc. when they make their queries - that may not be relevant in your situation of course.

Kristen

drewsalem
Constraint Violating Yak Guru

304 Posts

Posted - 2006-11-07 : 07:17:51

Thanks, Kristen.

Drew

Kristen
Test

22859 Posts

Posted - 2006-11-07 : 07:43:43

Importing / freshening the data is more of a pain though, of course ... (even if you have a DROP ... CREATE set of FK scripts you still have to keep them in step with any changes in the DDL of the parent database ...)

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-11-07 : 08:28:17

If you're only using database B for reporting purposes I would consider dropping the constraints and foreign keys and everything, except indexes of course. And I have been under the impression for some time that the query optimizer only uses statistics and indexes to give the best query performance, but I could be off on this.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"

pootle_flump

1064 Posts

Posted - 2006-11-07 : 08:36:37

I am under the same impression.

Most relevent passage I can find in BoL:

quote:
Extracted from BoL by PootsForeign key columns are often used in join criteria when the data from related tables is combined in queries by matching the column(s) in the FOREIGN KEY constraint of one table with the primary or unique key column(s) in the other table. An index allows Microsoft® SQL Server™ 2000 to find related data in the foreign key table quickly. However, creating this index is not a requirement. Data from two related tables can be combined even if no PRIMARY KEY or FOREIGN KEY constraints are defined between the tables, but a foreign key relationship between two tables indicates that the two tables have been optimized to be combined in a query that uses the keys as its criteria. For more information about using FOREIGN KEY constraints with joins, see Join Fundamentals.

That suggests to me that the relationship itself is moot to query optimisation.

EDIT - formatted the cut & paste job

Kristen
Test

22859 Posts

Posted - 2006-11-07 : 09:06:02

"a foreign key relationship between two tables indicates that the two tables have been optimized to be combined in a query that uses the keys as its criteria"

Hahahaha ....

Or maybe not actually ...

I was thinking of the SP4 issue where performance drops like a stone when you JOIN on columns of different types, but I suppose SQL Server would NOT let one create an FK between differing datatypes in the first place.

"EDIT - formatted the cut & paste job"

Its a bit wide Pootle! [quote] might be a better animal?

I did a quick Google - not sure any of this is relevant, but:

http://msdn2.microsoft.com/en-US/library/ms177416.aspx

quote:

When the DATE_CORRELATION_OPTIMIZATION database option is set to ON, SQL Server maintains correlation statistics between any two tables in the database that have datetime columns and are linked by a one-column foreign key constraint. By default, this option is set to OFF.

SQL Server uses these correlation statistics together with the date restriction specified in the query predicate to infer that additional restrictions can be added to the query without changing the result set. The query optimizer uses these inferred conditions when it chooses a query plan. A faster query plan may result, because the added restrictions let SQL Server read less data when it is processing the query. Performance is also improved when both tables have clustered indexes defined on them, and their datetime columns for which correlation statistics are maintained are the first or only key of the clustered index.

http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

quote:

Inferred date correlation constraints: By enabling the DATE_CORRELATION_OPTIMIZATION database setting, you can cause SQL Server to maintain information about the correlation between datetime fields across a pair of tables linked by a foreign key. This information is used to allow implied predicates to be determined for some queries. The information is not used directly for selectivity estimation or costing by the optimizer, so it is not statistics in the strictest sense, but it is closely related to statistics because it is auxiliary information used to help obtain a better query plan.

http://www.microsoft.com/technet/abouttn/flash/tips/tips_122104.mspx

quote:

if you run a graphical showplan from the Query menu in Query Analyzer and examine the results (not shown in this article), you see that the query never checked t1. SQL Server did only a table scan of t1.

So how can SQL Server return a correct existence check? Simple: the foreign key constraint tells SQL Server that a t1 row must exist to match each t2 row. The example data set is small, but you can imagine the substantial processing savings in similar queries on larger data sets if SQL Server can eliminate a table from the join.

(and interestingly goes on to say: " recently learned that using NOCHECK when you create foreign key constraints can wreck this performance trick by returning incorrect results. Listing 3*** shows code that creates two tables and populates t1 with a value of 1 and t2 with a value of 2. The code includes a foreign key constraint with NOCHECK for t2. Run Listing 3, and then run Listing 2 again; you'd expect the query to return no matching rows because a value of 1 doesn't equal a value of 2. Bad news: SQL Server says that the t2 row matches the t1 row.")

http://www.sql-server-performance.com/tuning_joins.asp

quote:

Whether an index on a foreign key has either high or low selectivity, an index on a foreign key can be used by the Query Optimizer to perform a merge join on the tables in question.

Kristen

pootle_flump

1064 Posts

Posted - 2006-11-07 : 09:17:31

quote:
Originally posted by Kristen
"EDIT - formatted the cut & paste job"

Its a bit wide Pootle! [quote] might be a better animal?

Oops - I forgot my laptop is of the widescreen variety

I will revisit.

pootle_flump

1064 Posts

Posted - 2006-11-07 : 09:28:03

Coo - interesting. Link three is potentially the most interesting but does the conclusion extend to queries that are actually any use I wonder....

Kristen
Test

22859 Posts

Posted - 2006-11-07 : 10:02:33

"does the conclusion extend to queries that are actually any use"

Snap!

Likewise single-date-column-foreign-key - yeah, never used one of those in my life, nor likely to!

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources