Author |
Topic |
rachew
Starting Member
4 Posts |
Posted - 2013-08-22 : 04:20:14
|
Morning,I was wondering if anyone can help? In our database we have a field for an age. Its a open field and anyone can type anything into it.So we get 2, 2 years, 2Yrs, 6, 6 months, 6mths, 1.5weeks, 3.5months old, 3 days, 15months etc etc etc.We tried to get it limited to put years and month in etc but will not be done by IT.Is there any code that you guys can help with that will convert all the above to years?Thanks for your help.Rich |
|
bandi
Master Smack Fu Yak Hacker
2242 Posts |
Posted - 2013-08-22 : 04:39:44
|
Why are you storing AGE... Its better to have Date of Birth column with DATE....any way provide us the sample data which you have presently....--Chandu |
|
|
rachew
Starting Member
4 Posts |
Posted - 2013-08-22 : 04:51:59
|
Hi Chandu,Its an emergency call centre, so sometimes the caller doesn't know the DOB.13 Years72Y5778 YUnknown89 Y99 Years17 Years42 Y45 Years31 Years42 Years32 Years9 MonthsUnknown Ra86 YearsUnknown RaUnknown Ra47 Years25 Years89 Years32 Years1 YearsNULL34 Years40 YearsUnknown Ra15 Years50 YearsUnknown Ra58 Years6 WNULLNULLNULL25 Years90 Years81 Years29 YUnknown96 Years95 Years87 Years56 Years69 Years20 Years37 Years62 YearsUnknown Ra81 Years25 Years38 Years77 Years73 Y21 Years28Y36 Years89 Years6 Y90 Years19 Years29 Y24 Years43 Years32 Y39 Years87 YearsUnknown Ra79 Y98 Y80 Y67 Years22 Y46 YUnknownUnknownUnknown Ra34 Years90 Y84 Y7483 Years48 Years4 YearsUnknown Ra80Y92 Y20 Years19 Years89 Y40 YearsUnknown12 Years76 Years79 Y54 YearsUnknown Ra86 Years60 Years78 Years25 Years90 Years64 Years37 Years48 Years55 YUnknown Ra57 Years62 Years26 Years99 Years85 Years49 Y67 Years34 Years55 Years44 Years50 Years72 Y82 Y48 Years57 Years84UnknownUnknown30 Years52 Y22 Years71 Y1.5W52 Years101 Years23 YearsUnknown Ra42 Years32 Years20 Years59 Years20 Y44 Y33 Years72 Years77 Years21 Years85 Years72YR29Y23 Years89 Y80 Years17 Years18 Years18 Years89 Years19 Years89 Y76 Years70 YUnknown Ra20 Years46 Y25 YearsUnknown74 YearsUnknown65 Years49 Years95 Y21 Years93 Y34YUnknown98 Y45 Years20 Years29 Years25 Years80 YearsUnknown Ra90 Years20 Years96 Years89 Y49 Years57 YearsUnknown78 Years50 Years88 Years29 Years57 Y89 Years48 Years96 Years80 Years27 Years25 Years21 Years62YNULL |
|
|
bandi
Master Smack Fu Yak Hacker
2242 Posts |
Posted - 2013-08-22 : 07:24:04
|
How you will get to know the years/months....if there is 20 years 10 months, 30 Y, 12 years, 400 weeks etc.....what should be the output?--Chandu |
|
|
rachew
Starting Member
4 Posts |
Posted - 2013-08-22 : 08:06:39
|
Hi Chandu, the output needs to be years.So anything which is days/weeks/months needs to be in the range of 0-12months as '0'years.Anything which is between month 12-24 then '1' |
|
|
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-08-22 : 08:24:41
|
There is no clean way to accomplish what you are trying to do. You have to ask, beg, plead, cajole, bribe, threaten or do something else to get your IT guys to restrict the entry into the field so it is always entered in a consistent format. Ideally, like Chandu suggested, have them enter a birthdate.If you want to clean up the existing data it is painful - you will need to do it in multiple steps. See the example below, which is only partial. You can copy and paste this to an SSMS window and run it to see what it does.CREATE TABLE #tmp(randomText VARCHAR(256));INSERT INTO #tmp VALUES('13 Years'),('72Y'),('57'),('78 Y'),('Unknown'),('89 Y'),('99 Years'),('17 Years'),('42 Y'),('45 Years'),('31 Years'),('42 Years'),('32 Years'),('9 Months'),('Unknown Ra'),('86 Years'),('Unknown Ra'),('Unknown Ra'),('47 Years'),('25 Years'),('89 Years'),(''),('32 Years'),('1 Years'),(NULL),('34 Years'),('40 Years'),('Unknown Ra'),('15 Years'),('50 Years'),('Unknown Ra'),('58 Years'),('6 W'),(NULL),(NULL),(NULL),('25 Years'),('90 Years'),('81 Years'),('29 Y'),('Unknown'),('96 Years'),('95 Years'),('87 Years'),('56 Years'),('69 Years'),('20 Years'),('37 Years'),('62 Years'),('Unknown Ra'),('81 Years'),('25 Years'),('38 Years'),('77 Years'),('73 Y'),('21 Years'),('28Y'),('36 Years'),('89 Years'),('6 Y'),('90 Years'),('19 Years'),('29 Y'),('24 Years'),('43 Years'),('32 Y'),('39 Years'),('87 Years'),('Unknown Ra'),('79 Y'),('98 Y'),('80 Y'),('67 Years'),('22 Y'),('46 Y'),('Unknown'),('Unknown'),('Unknown Ra'),('34 Years'),('90 Y'),('84 Y'),('74'),('83 Years'),('48 Years'),('4 Years'),('Unknown Ra'),('80Y'),('92 Y'),('20 Years'),('19 Years'),('89 Y'),('40 Years'),('Unknown'),('12 Years'),('76 Years'),('79 Y'),('54 Years'),('Unknown Ra'),('86 Years'),('60 Years'),('78 Years'),('25 Years'),('90 Years'),('64 Years'),('37 Years'),('48 Years'),('55 Y'),('Unknown Ra'),('57 Years'),('62 Years'),('26 Years'),('99 Years'),('85 Years'),('49 Y'),('67 Years'),('34 Years'),('55 Years'),('44 Years'),('50 Years'),('72 Y'),('82 Y'),('48 Years'),('57 Years'),('84'),('Unknown'),('Unknown'),('30 Years'),('52 Y'),('22 Years'),('71 Y'),('1.5W'),('52 Years'),('101 Years'),('23 Years'),('Unknown Ra'),('42 Years'),('32 Years'),('20 Years'),('59 Years'),('20 Y'),('44 Y'),('33 Years'),('72 Years'),('77 Years'),('21 Years'),('85 Years'),('72YR'),('29Y'),('23 Years'),('89 Y'),('80 Years'),('17 Years'),('18 Years'),('18 Years'),('89 Years'),('19 Years'),('89 Y'),('76 Years'),('70 Y'),('Unknown Ra'),('20 Years'),('46 Y'),('25 Years'),('Unknown'),('74 Years'),('Unknown'),('65 Years'),('49 Years'),('95 Y'),('21 Years'),('93 Y'),('34Y'),('Unknown'),('98 Y'),('45 Years'),('20 Years'),('29 Years'),('25 Years'),('80 Years'),('Unknown Ra'),('90 Years'),('20 Years'),('96 Years'),('89 Y'),('49 Years'),('57 Years'),('Unknown'),('78 Years'),('50 Years'),('88 Years'),('29 Years'),('57 Y'),('89 Years'),('48 Years'),('96 Years'),('80 Years'),('27 Years'),('25 Years'),('21 Years'),('62Y'),(NULL)-- YearsUPDATE #tmp SET RandomText = CASE WHEN ISNUMERIC(REPLACE(RandomText,'Years',''))=1 THEN REPLACE(RandomText,'Years','') ELSE randomText END-- YUPDATE #tmp SET RandomText = CASE WHEN ISNUMERIC(REPLACE(RandomText,'Y',''))=1 THEN REPLACE(RandomText,'Y','') ELSE randomText END-- what else is remaining?SELECT * FROM #tmp WHERE ISNUMERIC(RandomText) = 0 |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2013-08-22 : 19:41:21
|
You could create a table to hold the set of abbreviations you'll accept for time (e.g., ('Yrs', 'Years'), ('Y', 'Years'), ('Mnths', 'Months'), etc) and join it to your raw data based on some string matching logic. That way you could translate the raw data into an age ('24Mnths' -> 2 Years). I can only imagine that you'll need to iterate through this process until you find every variation/permutation/mutation of strings being input by the end user. You have my sympathies...=================================================The cure for anything is salt water -- sweat, tears, or the sea. -Isak Dinesen |
|
|
rachew
Starting Member
4 Posts |
Posted - 2013-08-23 : 03:19:03
|
Thanks very much for your help.I'll try the suggestions and see which works best! |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2013-08-23 : 14:25:03
|
Is it easy? Very clearly NO. But is it THAT difficult? Also NO.For example, the code below pulls out the values for the days and/or weeks and/or months and/or years. To me, CROSS APPLY makes the code easier to write and to understand, and thus to maintain.The last code needed would be to check the values for numeric and add them to get (approximate) years. I need some more details on what you need in order to write that code:? How to treat non-numeric/invalid values ?? Do you want a final decimal #years, such as 3.2 or 0.2? If not, how/when to round? If yes, how many decimal places and how to round ?etc.. --2 years, 2Yrs, 6, 6 months, 6mths, 1.5weeks, 3.5months old, 3 days, 15months etc etc etc.DROP TABLE #agesCREATE TABLE #ages (age_text varchar(100)); INSERT INTO #agesSELECT '13 Years' UNION ALLSELECT '72Y' UNION ALLSELECT '57' UNION ALLSELECT '78 Y' UNION ALLSELECT '1.5W' UNION ALLSELECT '2Yrs' UNION ALLSELECT '3Yrs 6mos 3wks 10days' UNION ALLSELECT '3Yrs6mos3wks10days' UNION ALLSELECT '10days 3years' UNION ALLSELECT '6 6months' UNION ALLSELECT '6mths' UNION ALLSELECT '1.5weeks' UNION ALlSELECT '3.5months old' UNION ALLSELECT '15months' --y yr year year--d day [dy?]--w wk week--m mth month month mnthSELECT age_text, LTRIM(RTRIM(SUBSTRING(age_text, day_literal - day_length, day_length))) AS age_days, LTRIM(RTRIM(SUBSTRING(age_text, week_literal - week_length, week_length))) AS age_weeks, LTRIM(RTRIM(SUBSTRING(age_text, month_literal - month_length, month_length))) AS age_months, LTRIM(RTRIM(SUBSTRING(age_text, year_literal - year_length, year_length))) AS age_years --, *FROM #agesCROSS APPLY ( SELECT CHARINDEX('d', age_text) AS day_literal) AS ca_dayCROSS APPLY ( SELECT CHARINDEX('w', age_text) AS week_literal) AS ca_weekCROSS APPLY ( SELECT CHARINDEX('m', age_text) AS month_literal) AS ca_monthCROSS APPLY ( SELECT CASE WHEN day_literal = 0 OR CHARINDEX('y', age_text) < day_literal THEN CHARINDEX('y', age_text) ELSE CHARINDEX('y', age_text, day_literal + 3) END AS year_literal) AS ca_yearCROSS APPLY ( SELECT CASE WHEN day_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, day_literal - 1)) + ',') - 1 END AS day_length, CASE WHEN week_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, week_literal - 1)) + ',') - 1 END AS week_length, CASE WHEN month_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, month_literal - 1)) + ',') - 1 END AS month_length, CASE WHEN year_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, year_literal - 1)) + ',') - 1 END AS year_length) AS ca_lengths |
|
|
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-08-23 : 14:55:14
|
quote: Originally posted by ScottPletcher Is it easy? Very clearly NO. But is it THAT difficult? Also NO.For example, the code below pulls out the values for the days and/or weeks and/or months and/or years. To me, CROSS APPLY makes the code easier to write and to understand, and thus to maintain.The last code needed would be to check the values for numeric and add them to get (approximate) years. I need some more details on what you need in order to write that code:? How to treat non-numeric/invalid values ?? Do you want a final decimal #years, such as 3.2 or 0.2? If not, how/when to round? If yes, how many decimal places and how to round ?etc.. --2 years, 2Yrs, 6, 6 months, 6mths, 1.5weeks, 3.5months old, 3 days, 15months etc etc etc.DROP TABLE #agesCREATE TABLE #ages (age_text varchar(100)); INSERT INTO #agesSELECT '13 Years' UNION ALLSELECT '72Y' UNION ALLSELECT '57' UNION ALLSELECT '78 Y' UNION ALLSELECT '1.5W' UNION ALLSELECT '2Yrs' UNION ALLSELECT '3Yrs 6mos 3wks 10days' UNION ALLSELECT '3Yrs6mos3wks10days' UNION ALLSELECT '10days 3years' UNION ALLSELECT '6 6months' UNION ALLSELECT '6mths' UNION ALLSELECT '1.5weeks' UNION ALlSELECT '3.5months old' UNION ALLSELECT '15months' --y yr year year--d day [dy?]--w wk week--m mth month month mnthSELECT age_text, LTRIM(RTRIM(SUBSTRING(age_text, day_literal - day_length, day_length))) AS age_days, LTRIM(RTRIM(SUBSTRING(age_text, week_literal - week_length, week_length))) AS age_weeks, LTRIM(RTRIM(SUBSTRING(age_text, month_literal - month_length, month_length))) AS age_months, LTRIM(RTRIM(SUBSTRING(age_text, year_literal - year_length, year_length))) AS age_years --, *FROM #agesCROSS APPLY ( SELECT CHARINDEX('d', age_text) AS day_literal) AS ca_dayCROSS APPLY ( SELECT CHARINDEX('w', age_text) AS week_literal) AS ca_weekCROSS APPLY ( SELECT CHARINDEX('m', age_text) AS month_literal) AS ca_monthCROSS APPLY ( SELECT CASE WHEN day_literal = 0 OR CHARINDEX('y', age_text) < day_literal THEN CHARINDEX('y', age_text) ELSE CHARINDEX('y', age_text, day_literal + 3) END AS year_literal) AS ca_yearCROSS APPLY ( SELECT CASE WHEN day_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, day_literal - 1)) + ',') - 1 END AS day_length, CASE WHEN week_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, week_literal - 1)) + ',') - 1 END AS week_length, CASE WHEN month_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, month_literal - 1)) + ',') - 1 END AS month_length, CASE WHEN year_literal = 0 THEN 0 ELSE PATINDEX('%[^0-9 .]%', REVERSE(LEFT(age_text, year_literal - 1)) + ',') - 1 END AS year_length) AS ca_lengths
You might need to add something to add the case where just a number without any indicator as to whether it is years, months, or something else is entered. Couple of rows in the OP's examples fall into that category.That is just from casual observation. Who knows what other kind of data/pattern can be there in the data. After all it is free form data, according to the OP. |
|
|
ScottPletcher
Aged Yak Warrior
550 Posts |
Posted - 2013-08-23 : 16:12:05
|
Good point about number alone -- I'd actually meant to include that part but just forgot it.SELECT age_text, LTRIM(RTRIM(SUBSTRING(age_text, day_literal - day_length, day_length))) AS age_days, LTRIM(RTRIM(SUBSTRING(age_text, week_literal - week_length, week_length))) AS age_weeks, LTRIM(RTRIM(SUBSTRING(age_text, month_literal - month_length, month_length))) AS age_months, CASE WHEN year_literal = 0 AND month_literal = 0 AND week_literal = 0 AND day_literal = 0 THEN LTRIM(RTRIM(age_text)) ELSE LTRIM(RTRIM(SUBSTRING(age_text, year_literal - year_length, year_length))) END AS age_years --, *FROM #ages... |
|
|
lazerath
Constraint Violating Yak Guru
343 Posts |
Posted - 2013-08-23 : 18:27:11
|
Scott, great solution! |
|
|
kameswararao polireddy
Starting Member
19 Posts |
Posted - 2013-08-28 : 07:03:51
|
This will work fine when both Number and alphabet(i.e., Days or Months or Years) were mentioned in Age_Column.If only Number is mentioned then you can add the above solution (i.e which is mentioned by ScottPletcher) to my solution. SELECT CASE WHEN Age_Column LIKE '%Y%' THEN Age_Column WHEN Age_Column LIKE '%M%' THEN LEFT((CAST(AGE_COLUMN AS VARCHAR)+'.0') /12.0,3) WHEN Age_Column LIKE '%D%' THEN LEFT((CAST(AGE_COLUMN AS VARCHAR)+'.0') /365.0,3) WHEN Age_Column NOT LIKE ('%Y%','%D%','%M%') THEN Age_Column) ELSE '' END AS 'Years'P.Kameswara rao |
|
|
|
|
|