SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Dealing with bad input data
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

Tomb
Starting Member

10 Posts

Posted - 07/16/2012 :  22:28:11  Show Profile  Reply with Quote
Hello,

I'm trying to sum a column that should be an integer, but there appears to be some bad data where the values are text instead (specifically "a0"), so that when I execute my query I get ERROR: invalid input syntax for double precision: "a0".

I've tried using a cast statement, a case when statement, and excluding these rows with a where statement but I always get the same error. How can I exclude these rows and have the query execute properly? I have billions of rows so cannot manually delete them.

Thanks!

visakh16
Very Important crosS Applying yaK Herder

India
47189 Posts

Posted - 07/16/2012 :  22:30:07  Show Profile  Reply with Quote
use

WHERE field NOT LIKE '%[^0-9]%'

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 07/17/2012 :  00:40:20  Show Profile  Reply with Quote
Still get the same error when adding that to the where clause. I think what I'm going to do is first have a subquery where I cast the field as a varchar and exclude the non-integers, then cast it as an integer again in the main query where I do the sum.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
47189 Posts

Posted - 07/17/2012 :  09:44:11  Show Profile  Reply with Quote
quote:
Originally posted by Tomb

Still get the same error when adding that to the where clause. I think what I'm going to do is first have a subquery where I cast the field as a varchar and exclude the non-integers, then cast it as an integer again in the main query where I do the sum.


dont cast it until you filter out bad data

so it should be like

SELECT CAST(field as...)
FROM
(
select ...
from table
where field NOT LIKE '%[^0-9]%'
)t


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 07/17/2012 :  11:16:21  Show Profile  Reply with Quote
thanks, this worked
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
47189 Posts

Posted - 07/17/2012 :  11:56:17  Show Profile  Reply with Quote
welcome

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 07/17/2012 :  14:56:07  Show Profile  Reply with Quote
quote:
Originally posted by Tomb
I have billions of rows




How long does this query take to run?

Can you post the query itself?

Go to Top of Page

Tomb
Starting Member

10 Posts

Posted - 07/18/2012 :  12:22:50  Show Profile  Reply with Quote
Essentially a modified version of this (field is formatted as a varchar in schema.table):
select x, y, z, sum(cast(field as int)) from
(select x, y, z, field from schema.table where field not like '%[^0-9]%') as a
group by x, y, z order by x, y, z;

I'm actually using Vertica for my data warehouse so given proper indexing/projections and an appropriate where clause it takes maybe 5-10 minutes depending on resources at the time.
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.06 seconds. Powered By: Snitz Forums 2000