Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Creating a unique number based on mixed text

Author  Topic 

jordanam
Yak Posting Veteran

62 Posts

Posted - 2007-12-06 : 11:11:15
I have two tables that are nearly identical. They store a UserID, and Address information. The only difference is that one is a rolling history, and the other is one record per user. If a user changes his address, the First table just gets updated. The Second table doesn't get a new record, until I compare records between the first and second tables, after which it theoretically gets an additional record for that user.

I cannot modify the physical structure of Table 1 at all. No triggers, no adding a timestamp field. I have to resort to polling, to check against Table 2 and see if Table 2 has the most current records.

The thought of creating a series of nested Whiles or cursors is incredibly nasty to me. The performance of that wouldn't be very good, and I don't need to be that granular. If any column in a row of Table 1 changes, Table 2 gets a new record, regardless of which column changed. Unfortunately, I also can't use character counts, in the event that only a zip code or the apartment number of an address changed.

I tried concatenating the strings, and then just doing a compare on the concatenated strings of table 1 to the most current concatenated strings of table 2, but I got complaints that query performance would suffer as the table grew, because comparing long strings can be slow.

I do think that creating a hash of sorts is truly the key to making this query work... unless it's not?

My latest experiment involves dumping each string into a function that converts each letter of the alphabet into a different number, and then returns a huge number (I handle spaces and other characters too). I would then simply add/subtract numbers created from each string, and that would yield a unique-enough comparison operator. The problem is that it is returning high numbers, and it just seems too brute-force to be the best solution, or even a good one.

Any thoughts?

jordanam
Yak Posting Veteran

62 Posts

Posted - 2007-12-06 : 12:05:07
This is easy, and I wasn't thinking about it (yay BOL!).

Basically, CASTing the string as a VARBINARY(size), and then re-CASTing that as a bigint, should generally work. I tried to push an arithmetic overflow by just using more characters, but at 714 characters, I still wasn't getting one. Also, since the initial conversion handles all ascii characters, spaces and whatnot won't have any impact.

So, you basically get this:



CAST (CAST(Address as varbinary(400)) as bigint) + CAST (CAST(City as varbinary(100)) as bigint) ...



Performing any mathematical operation on the integers should create a good enough unique number, which when paired with a UserID, should prove to be accurate.
Go to Top of Page
   

- Advertisement -