Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Unexpected problem with query

Author  Topic 

craigwg
Posting Yak Master

154 Posts

Posted - 2010-11-01 : 16:31:06
I have a query that brings back some strange results. It's a very simple query using wildcards:

select [34987X53X4332] from surveytable where [34987X53X4332] like '%E%'


The results are this:


34987X53X4332
2023402
3645287
9869179


I beleive it was Mark Twain who so eloquently stated, "WTF?"

Am I nuts here? This shouldn't be happening, right? Is there some ANSII Character shift that's happening because I'm using wildcards? I've never heard of that before! Can anyone offer any explanation as to why this is happening?

For the record, this is actually happening on a large number of columns in this table. I am cleaning the data and trying to remove scientific numbers from the results. These contain the letter 'e'. I should point out that the datatype in the table is a "real".

Craig Greenwood

Kristen
Test

22859 Posts

Posted - 2010-11-01 : 16:43:28
"datatype in the table is a "real""

You are doing LIKE on a Real DataType column?

That would involve a conversion from Real to Char - on which the LIKE can be performed. The Cast from Real to Char is probably giving you string that look like "1234567E89" - and which will match LIKE '%E%'
Go to Top of Page

craigwg
Posting Yak Master

154 Posts

Posted - 2010-11-01 : 16:54:38
Ahhhhhhh. This is explaining it:

select convert(nvarchar,[34987X53X4332]) from survey_34987 where [34987X53X4332] like '%E%'

So what's happening here? If a datatype is real, that's just a representation of what is really in the column? I am held to the real datatype by the application LimeSurvey. This table is being pulled by a view which is crashing. I knew WHY it was happening and even found several scientific notations in the table. But starting hitting the ones in my example, which is the same problem, just hidden under the real datatypes. At least that's how I see it now.

Thanks for your help!


Craig Greenwood
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-11-01 : 17:02:56
It rather begs the question why you are using LIKE on a numeric column and trying to find "E" ?
Go to Top of Page

craigwg
Posting Yak Master

154 Posts

Posted - 2010-11-01 : 17:08:55
...because I didn't know that "like" involved converting things to strings.

Here is my new problem. I rewrote my query to this:


select [34987X53X4332] from surveytable where isnumeric([34987X53X4332])=0


That pulls no results, which is correct. However on other columns there is legitimate bad data that looks like this:

2.13428e+007

My isnumeric trick identifies that as numeric, which in my application is bad. So my goal is to get rid of these numbers in scientific notation by replacing them with nulls or 0s. (I'll make that call later when the time comes!) Do you know of a way to weed out this format?

Thanks again for your help!


Craig Greenwood
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-11-02 : 03:18:10
"...because I didn't know that "like" involved converting things to strings."

Sorry, didn't mean that. It seemed strange to me that a test for "E" would be performed on a numeric field. Searching for a numeric field that "contains" some substring seemed unusual, and I wondered if there was a different route to solve your task.

IsNumeric() is fragile. IsNumeric expects a char parameter - so isnumeric([MyRealColumn]) will convert the Real value to string, and then test it. Hopefully, given that the original was a Real number!, it will always return TRUE.

But IsNumeric has very loose testing, which generally makes it useless (as you have found). I think it just looks for digits and E (in case of E notation) and some other bits and bobs, rather than determining that the string can actually be converted to a number. Poor show MS, and you've had numerous versions in which to fix it properly

However, your REAL datatype column can only contain real numbers (or NULL perhaps), so you don't need to validate anything that is already contained in that column.

If you have string data that you want to convert into a REAL datatype your choices are more complicated. I think the best you could do is to come up with a series of RegEx-type expressions using LIKE:

MyString NOT LIKE '%[^0-9.E]%'

which will find any that do not contain characters that are not in the set 0-9, "." and E. You then need to check that there is only one ".", only one "E", and if there is an 'E' it is followed by digits and not followed by ".". All rather tedious.

You could use an external RegEx library (COM object, or the like, depending on what version of SQL you are using) which would let you use a proper RegEx, rather than the sawn-off version provided in the SQL Language, and that would get you a much more reliable, and simpler to implement, True/False result.

You might be able to use TRY ... CATCH in SQL - but that would involve processing each row, one-by-one, and it would be much more efficient to use a set-based approach.

If you can handle data validation before the data ever arrives at the database that would be a much better route.

Sorry, probably not much help.
Go to Top of Page
   

- Advertisement -