Return to Comparing Tables
Comparing Tables
Written by Bill Graziano on 07 January 2002
We're in the process of rolling out a new version of a SQL Server-based software package at work. I already have numerous scripts that load tables and I was trying to find out which ones would break. I needed an easy way to compare tables. There are some packages out there that do this but budgets are kind of tight where I work right now. And since I thought it would make a good article I decided to write my own.
The INFORMATION_SCHEMA Views
I started with the Information_Schema views. We answered a question that briefly described how to use these views. Garth also wrote an article that used them to create ADO parameters.
For this article we're mainly concerned with the COLUMNS table. Running the following SELECT statement gives a rough idea of how information is stored in this table.
Select TABLE_CATALOG = Left(TABLE_CATALOG, 10),
TABLE_SCHEMA = Left(TABLE_SCHEMA, 10),
TABLE_NAME = Left(TABLE_NAME, 10),
COLUMN_NAME = Left(COLUMN_NAME, 20)
From pubs.Information_Schema.Columns
Where Table_Name = 'authors'
-- Results
TABLE_CATALOG TABLE_SCHEMA TABLE_NAME COLUMN_NAME
------------- ------------ ---------- --------------------
pubs dbo authors au_id
pubs dbo authors au_lname
pubs dbo authors au_fname
pubs dbo authors phone
pubs dbo authors address
pubs dbo authors city
pubs dbo authors state
pubs dbo authors zip
pubs dbo authors contract
In the actual tables these fields are each 128 characters long and it's a pain to scroll through the results. It's fairly easy to see which columns are the database name, object owner, table name and column name. We're going to work with the following subset of data:
Select COLUMN_NAME = Left(COLUMN_NAME, 20),
DATA_TYPE = Left(DATA_TYPE, 10),
CHARACTER_MAXIMUM_LENGTH as CHAR_MAX_LEN,
NUMERIC_PRECISION AS NUM_PRECISION,
NUMERIC_SCALE AS NUM_SCALE
From pubs.Information_Schema.Columns
Where Table_Name = 'authors'
--Results
COLUMN_NAME DATA_TYPE CHAR_MAX_LEN NUM_PRECISION NUM_SCALE
-------------------- ---------- ------------ ------------- -----------
au_id varchar 11 NULL NULL
au_lname varchar 40 NULL NULL
au_fname varchar 20 NULL NULL
phone char 12 NULL NULL
address varchar 40 NULL NULL
city varchar 20 NULL NULL
state char 2 NULL NULL
zip char 5 NULL NULL
contract bit NULL 1 0
These columns define the data type of a column. I'd encourage you to run a SELECT * on the table and see all the columns listed.
To test this script I created copy of pubs and called it pubs_mod. Then I made a few small changes to the authors table so I would have a table that had changed. If you look at the SELECT statements above you'll notice that these INFORMATION_SCHEMA tables are inside each database. If I want to compare objects in different databases I'll need to do cross-database joins. A simple query to find the new columns in my authors table looks like this:
Select COLUMN_NAME
From pubs_mod.INFORMATION_SCHEMA.Columns
Where TABLE_NAME = 'authors'
AND COLUMN_NAME NOT IN (
SELECT COLUMN_NAME
FROM pubs.INFORMATION_SCHEMA.Columns
Where TABLE_NAME = 'authors' )
Since I can't put a fully qualified table name in a variable, I can't pass in database name as a parameter to a stored procedure -- unless I write dynamic SQL. Unfortunately lazy DBA's rarely write dynamic SQL and I'm definitely a lazy DBA! So I created this view in master:
Create view vw_schema_columns
as
Select *
from pubs.information_schema.columns
UNION ALL
Select *
from pubs_mod.information_schema.columns
Since the TABLE_CATALOG column identifies the source database this is a handy way to have all the information I need in one place. As an aside, you should be able to modify the script to use a linked server and compare tables on different servers. I could have created it anywhere but I knew master wasn't going away and that's also where I'm going to put my stored procedure when I'm done with it.
Parsing Names
I want to pass this procedure the names of two tables and have it compare them. They can be in different databases and have different owners. You'd think that I'd have to pass in three parameters per table -- database, owner and table name. There's a handy function is SQL Server called PARSENAME that will really help us out here. PARSENAME takes a name in the form Server.Database.Owner.Table (or Database..Table, etc.) and breaks out the various components. It doesn't check if they actually exist, it just returns the parts. Now I can call my procedure like this:
sp_compare_tables 'authors', 'pubs_mod..authors'
Either table can be as qualified as I need it to be. In my code the default database is the current database and the default owner is dbo. You can easily change the script to change those defaults.
Datatypes
If you look at the SELECT statement above that returned the datatypes you can see what datatypes the various columns are but it certainly doesn't look very nice. I wrote a user defined function to clean them up a little bit. The main part is this SELECT statement:
Select @v1 = CASE RTrim(@DataType)
WHEN 'char' THEN 'char(' + convert(varchar, @CharLength) + ')'
WHEN 'nchar' THEN 'char(' + convert(varchar, @CharLength) + ')'
WHEN 'varchar' THEN 'char(' + convert(varchar, @CharLength) + ')'
WHEN 'nvarchar' THEN 'char(' + convert(varchar, @CharLength) + ')'
WHEN 'decimal' THEN 'decimal(' + convert(varchar, @Precision) + ',' +
convert(varchar, @Scale) + ')'
WHEN 'numeric' THEN 'numeric(' + convert(varchar, @Precision) + ',' +
convert(varchar, @Scale) + ')'
ELSE RTRIM(@DataType)
END
The only datatypes I really need to handle are char, varchar, decimal and numeric. You can read more about CASE in The Case for CASE. Updating my SELECT statement above with this yields:
Select COLUMN_NAME = Left(COLUMN_NAME, 20),
CONVERT(VARCHAR(64), master.dbo.fn_display_datatype(DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION,
NUMERIC_SCALE) ) AS DATATYPE_Description
From Information_Schema.Columns
Where Table_Name = 'authors'
-- Results
COLUMN_NAME DATATYPE_Description
-------------------- ------------------------------
au_id char(11)
au_lname char(40)
au_fname char(20)
phone char(12)
address char(40)
city char(20)
state char(2)
zip char(5)
contract bit
I think that looks much cleaner.
Putting it all Together
The final script is a little long to put in the article. You can download it here. You run it like this:
use pubs
go
sp_compare_tables 'authors', 'pubs_mod..authors'
go
-- Results
Change COLUMN_NAME DATATYPE_Description
------- -------------------------------- ---------------------------------
New marketingcode char(3)
New c_int int
New c_smalldatetime smalldatetime
New c_float float
New c_nvarchar char(32)
New c_numeric numeric(7,2)
New c_decimal decimal(11,9)
New c_bit bit
Removed phone char(12)
Changed zip char(5) -> char(9)
There are other attributes of columns available in INFORMATION_SCHEMA.Columns that I didn't use. They include COLUMN_DEFAULT, IS_NULLABLE, CHARACTER_SET_NAME, COLLATION_NAME and quite a few others. You should be able to modify this script to handle those if you need them. There is also quite a bit of additional information available in other INFORMATION_SCHEMA tables you can use.
Since I put this in master and prefixed it with "sp_" I can call it from any database. I just need to make sure that all my databases are in my view.
|