Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Full Name Data in one Colunm

Author  Topic 

lemondash
Posting Yak Master

159 Posts

Posted - 2013-01-14 : 15:27:23
Hello.

I have a table where there is one column which unfortunately has the full name and in some cases has the titles or affiliations also included in the column. There are number of different permutations on how its stored in that one column.

Whats is the best way to tidy this data up and put each part in is correct field. The main issue i'm having is finding the positions of the spaces because i'm assuming that will help split the values up ? Any help or advice would be great.

Examples of data

Column
Dr FirstName Lastname
Miss FirstName MiddleName LastName
FirstName LastName

jimf
Master Smack Fu Yak Hacker

2875 Posts

Posted - 2013-01-14 : 17:04:30
There may be a best way, but there certainly isn't a good way. You'd have to pull it apart piece by piece, making a lot of assumptions. For example, if there's one space, then it's FirstName,LastName. If there's 2 spaces, then if the first part is in ('Dr','Miss', etc) then that's the title, else it's a first name. Even doing this agonizing process while give inconsistent results. For the name Oscar de la Hoya, how you can program that Oscar is the first name and de la Hoya is the last?

Jim

P.S. Shoot the person that designed this table, I won't tell anyone.

Everyday I learn something that somebody else already knew
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2013-01-14 : 22:46:52
quote:
Originally posted by jimfP.S. Shoot the person that designed this table, I won't tell anyone.


I'll tell everyone. "SEE??? This is what happens to you when you design stupid stuff!"

--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row".

First step towards the paradigm shift of writing Set Based code:
"Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

When writing schedules, keep the following in mind:
"If you want it real bad, that's the way you'll likely get it."
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-01-14 : 23:42:20
quote:
Originally posted by lemondash

Hello.

I have a table where there is one column which unfortunately has the full name and in some cases has the titles or affiliations also included in the column. There are number of different permutations on how its stored in that one column.

Whats is the best way to tidy this data up and put each part in is correct field. The main issue i'm having is finding the positions of the spaces because i'm assuming that will help split the values up ? Any help or advice would be great.

Examples of data

Column
Dr FirstName Lastname
Miss FirstName MiddleName LastName
FirstName LastName


Ask the designer to read on normalisation and first normal form

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

lemondash
Posting Yak Master

159 Posts

Posted - 2013-01-15 : 04:33:38
I agree with all your comments above but due to his poor DB design this person has been kept many miles away from this data. But it does leave me with the problem of tidying it up.

Is there away i could count the spaces with in the column and based on the result split the column up. Using this theory will clean most of the mess up.

Go to Top of Page

lemondash
Posting Yak Master

159 Posts

Posted - 2013-01-15 : 05:15:34
This is what i have put together so far.

SELECT vContact ,
PARSENAME(REPLACE(vContact, ' ', '.'), 2) AS Firstname ,
PARSENAME(REPLACE(vContact, ' ', '.'), 1) AS LastName ,
CASE WHEN vContact LIKE 'Miss%' THEN 'Miss'
WHEN vContact LIKE 'Dr%' THEN 'Dr'
WHEN vContact LIKE 'Mrs%' THEN 'Mrs'
WHEN vContact LIKE 'Mr%' THEN 'Mr'
WHEN vContact LIKE 'Ms%' THEN 'Ms'
END AS [S]
FROM table1
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-01-15 : 05:39:50
quote:
Originally posted by lemondash

I agree with all your comments above but due to his poor DB design this person has been kept many miles away from this data. But it does leave me with the problem of tidying it up.

Is there away i could count the spaces with in the column and based on the result split the column up. Using this theory will clean most of the mess up.




see two methods here to split long string based on delimiter

http://visakhm.blogspot.in/2013/01/delimited-string-split-xml-parsing.html

http://visakhm.blogspot.in/2010/02/parsing-delimited-string.html

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

jimf
Master Smack Fu Yak Hacker

2875 Posts

Posted - 2013-01-15 : 07:02:31
SELECT LEN(vContact) - LEN(REPLACE(vContact,' ','')) will tell you how many spaces you have.

Jim

Everyday I learn something that somebody else already knew
Go to Top of Page

lemondash
Posting Yak Master

159 Posts

Posted - 2013-01-16 : 04:13:33
Thank you for all your help.
Go to Top of Page

erikhaselhofer
Starting Member

30 Posts

Posted - 2013-01-17 : 15:19:22
I might be too late but I would propose, and see if it got anywhere, some other ideas.

1. Don't fix it. Instead, rename the column, to say lastname, and treat the output differently if the firstname field is NULL. Going forward the interface can handle new accounts correctly and be backwards compatible with the junk. It might cause a few minor problems with reporting, an interface tweak or two, and maybe some work with mailing list but it could be done. You probably won't win this one.

2. Add a flag to the user table to the effect of "forceUpdate", set it to yes, and require all users, on login, to update their account. Assuming the interface is being changed to reflect a new database design, you could simply have the users fix the problem. You could win this one.

3. Put it in an Excel spreadsheet and toss the problem down the line. Maybe there's someone with some free time on their hands who can just plow it over a week. You might win this one too.

I don't know the source of your data, it's purpose, or what you are trying to do going forward, but the odds are very good that a chunk of your data is junk. Even in rock-solid, tight businesses that can often be the case. And given your situation I'd bet you have a lot of junk, but, I could be wrong.

So, before I did anything, I'd run some queries to try and estimate how much work I actually had. Sometimes you can make a business case for not fixing the past.

Go to Top of Page

erikhaselhofer
Starting Member

30 Posts

Posted - 2013-01-17 : 15:39:25
Upon further thought, trying to tweak the users to update the data could be a bit of a loopy solution as having "Mr. John Jones" show up in a single column could cause people to scratch their heads a bit. You probably wouldn't want to do that in this case.
Go to Top of Page
   

- Advertisement -