SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Full Name Data in one Colunm
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

lemondash
Posting Yak Master

United Kingdom
159 Posts

Posted - 01/14/2013 :  15:27:23  Show Profile  Click to see lemondash's MSN Messenger address  Reply with Quote
Hello.

I have a table where there is one column which unfortunately has the full name and in some cases has the titles or affiliations also included in the column. There are number of different permutations on how its stored in that one column.

Whats is the best way to tidy this data up and put each part in is correct field. The main issue i'm having is finding the positions of the spaces because i'm assuming that will help split the values up ? Any help or advice would be great.

Examples of data

Column
Dr FirstName Lastname
Miss FirstName MiddleName LastName
FirstName LastName

jimf
Flowing Fount of Yak Knowledge

USA
2869 Posts

Posted - 01/14/2013 :  17:04:30  Show Profile  Reply with Quote
There may be a best way, but there certainly isn't a good way. You'd have to pull it apart piece by piece, making a lot of assumptions. For example, if there's one space, then it's FirstName,LastName. If there's 2 spaces, then if the first part is in ('Dr','Miss', etc) then that's the title, else it's a first name. Even doing this agonizing process while give inconsistent results. For the name Oscar de la Hoya, how you can program that Oscar is the first name and de la Hoya is the last?

Jim

P.S. Shoot the person that designed this table, I won't tell anyone.

Everyday I learn something that somebody else already knew
Go to Top of Page

Jeff Moden
Aged Yak Warrior

USA
649 Posts

Posted - 01/14/2013 :  22:46:52  Show Profile  Reply with Quote
quote:
Originally posted by jimfP.S. Shoot the person that designed this table, I won't tell anyone.


I'll tell everyone. "SEE??? This is what happens to you when you design stupid stuff!"

--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row".

First step towards the paradigm shift of writing Set Based code:
"Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

When writing schedules, keep the following in mind:
"If you want it real bad, that's the way you'll likely get it."
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52249 Posts

Posted - 01/14/2013 :  23:42:20  Show Profile  Reply with Quote
quote:
Originally posted by lemondash

Hello.

I have a table where there is one column which unfortunately has the full name and in some cases has the titles or affiliations also included in the column. There are number of different permutations on how its stored in that one column.

Whats is the best way to tidy this data up and put each part in is correct field. The main issue i'm having is finding the positions of the spaces because i'm assuming that will help split the values up ? Any help or advice would be great.

Examples of data

Column
Dr FirstName Lastname
Miss FirstName MiddleName LastName
FirstName LastName


Ask the designer to read on normalisation and first normal form

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

lemondash
Posting Yak Master

United Kingdom
159 Posts

Posted - 01/15/2013 :  04:33:38  Show Profile  Click to see lemondash's MSN Messenger address  Reply with Quote
I agree with all your comments above but due to his poor DB design this person has been kept many miles away from this data. But it does leave me with the problem of tidying it up.

Is there away i could count the spaces with in the column and based on the result split the column up. Using this theory will clean most of the mess up.

Go to Top of Page

lemondash
Posting Yak Master

United Kingdom
159 Posts

Posted - 01/15/2013 :  05:15:34  Show Profile  Click to see lemondash's MSN Messenger address  Reply with Quote
This is what i have put together so far.

SELECT vContact ,
PARSENAME(REPLACE(vContact, ' ', '.'), 2) AS Firstname ,
PARSENAME(REPLACE(vContact, ' ', '.'), 1) AS LastName ,
CASE WHEN vContact LIKE 'Miss%' THEN 'Miss'
WHEN vContact LIKE 'Dr%' THEN 'Dr'
WHEN vContact LIKE 'Mrs%' THEN 'Mrs'
WHEN vContact LIKE 'Mr%' THEN 'Mr'
WHEN vContact LIKE 'Ms%' THEN 'Ms'
END AS [S]
FROM table1
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52249 Posts

Posted - 01/15/2013 :  05:39:50  Show Profile  Reply with Quote
quote:
Originally posted by lemondash

I agree with all your comments above but due to his poor DB design this person has been kept many miles away from this data. But it does leave me with the problem of tidying it up.

Is there away i could count the spaces with in the column and based on the result split the column up. Using this theory will clean most of the mess up.




see two methods here to split long string based on delimiter

http://visakhm.blogspot.in/2013/01/delimited-string-split-xml-parsing.html

http://visakhm.blogspot.in/2010/02/parsing-delimited-string.html

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

jimf
Flowing Fount of Yak Knowledge

USA
2869 Posts

Posted - 01/15/2013 :  07:02:31  Show Profile  Reply with Quote
SELECT LEN(vContact) - LEN(REPLACE(vContact,' ','')) will tell you how many spaces you have.

Jim

Everyday I learn something that somebody else already knew
Go to Top of Page

lemondash
Posting Yak Master

United Kingdom
159 Posts

Posted - 01/16/2013 :  04:13:33  Show Profile  Click to see lemondash's MSN Messenger address  Reply with Quote
Thank you for all your help.
Go to Top of Page

erikhaselhofer
Starting Member

30 Posts

Posted - 01/17/2013 :  15:19:22  Show Profile  Reply with Quote
I might be too late but I would propose, and see if it got anywhere, some other ideas.

1. Don't fix it. Instead, rename the column, to say lastname, and treat the output differently if the firstname field is NULL. Going forward the interface can handle new accounts correctly and be backwards compatible with the junk. It might cause a few minor problems with reporting, an interface tweak or two, and maybe some work with mailing list but it could be done. You probably won't win this one.

2. Add a flag to the user table to the effect of "forceUpdate", set it to yes, and require all users, on login, to update their account. Assuming the interface is being changed to reflect a new database design, you could simply have the users fix the problem. You could win this one.

3. Put it in an Excel spreadsheet and toss the problem down the line. Maybe there's someone with some free time on their hands who can just plow it over a week. You might win this one too.

I don't know the source of your data, it's purpose, or what you are trying to do going forward, but the odds are very good that a chunk of your data is junk. Even in rock-solid, tight businesses that can often be the case. And given your situation I'd bet you have a lot of junk, but, I could be wrong.

So, before I did anything, I'd run some queries to try and estimate how much work I actually had. Sometimes you can make a business case for not fixing the past.

Go to Top of Page

erikhaselhofer
Starting Member

30 Posts

Posted - 01/17/2013 :  15:39:25  Show Profile  Reply with Quote
Upon further thought, trying to tweak the users to update the data could be a bit of a loopy solution as having "Mr. John Jones" show up in a single column could cause people to scratch their heads a bit. You probably wouldn't want to do that in this case.
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.09 seconds. Powered By: Snitz Forums 2000