How to ignore cast errors

andreeroos
SQL-Fighter

I was trying to cast a string to number but got an error since the field contained some non-numeric characters or where null.
I had to extract the integer through regular expression to get it to work. Is there any other way to use regular cast and just ignore all unsucceful tries?

 

 

TO_NUMBER(REGEXP_SUBSTR(strNumber,'[0-9]*'))

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

exa-Aleksandr
Team Exasol
Team Exasol

Hi @andreeroos ,

You might want to give a try to our family of IS_% functions:

with
to_clean(col1) as(
	SELECT
		*
	from
		values ('1'), ('1a'), ('2.5'), ('3,6')
)
select
	tc.col1
	, case
		when is_number(tc.col1) then to_number(tc.col1)
	end as col1_num
from
	to_clean tc
;
/*
COL1|COL1_NUM|
----+--------+
1   |     1.0|
1a  |        |
2.5 |     2.5|
3,6 |        |
*/

View solution in original post

7 REPLIES 7

mwellbro
Xpert

Hi @andreeroos ,

but what would you suggest would be the resulting datatype of the column where you use the cast ?
The cast would dicatate it to be a number, the ignored values would still be strings so that wouldn´t fit - filtering out rows where
the cast can´t be applied would be horrendous since in that scenarion your SELECT clause would act as a filter which
I´m pretty sure goes against some part of SQL-standard, let alone intuition 😉

Casting a NULL to a certain data type is possible:

WITH base AS (
          SELECT 1 AS my_pk,'1234' AS num_as_txt
UNION ALL SELECT 2 AS my_pk,'x1234' AS num_as_txt
UNION ALL SELECT 3 AS my_pk,NULL AS num_as_txt
)
SELECT 
--this will fail due to alpha-nums in data
--cast(num_as_txt AS DECIMAL(10,0)) AS only_num,
cast(regexp_substr(num_as_txt,'[0-9]*') AS DECIMAL(10,0)) AS only_num,
a.* FROM base a;

 

mwellbro_0-1637332509776.png


So it´s probably best to "clean" your data before cast´ing it.

Cheers,
Malte

andreeroos
SQL-Fighter

Sorry, I might have been a bit unclear. The data I have is known to sometimes have errors. It is a file that is downloaded and exposed through a view that is based on a UDF and for simplicity all fields in the view are CHAR. The file it is based on is a bit dirty in terms of data quality and it would be OK from an anlytical perspective if casted numeric values were turned into NULL instead.

This view is accessed through a 3rd party tool,Tableau, and when I try to change datatype in Tableau to a numeric data type it crashes. I don't expect end users to understand regular expressions (the Tableau Exasol driver doesn't even support regular expressions) but some easy passthrough cast where cast errors were allowed and turned them into NULL instead would possibly work.

I'm mostly trying to understand what possibilities that exist when it comes to casting and errors.

mwellbro
Xpert

Ah, ok - and taking care of the "cleansing" in the VIEW/UDF-side of things so that only numeric or null values are exposed to Tableau isn´t an option ?
In terms of "what a cast can do" I don´t think you´ll get it to behave in a way that says "cast if possible, else return null"...but maybe someone from the exa-* folks has a card up their sleeves we could play here ? Let´s see if they chime in here 🙂

Cheers,
Malte

exa-Aleksandr
Team Exasol
Team Exasol

Hi @andreeroos ,

You might want to give a try to our family of IS_% functions:

with
to_clean(col1) as(
	SELECT
		*
	from
		values ('1'), ('1a'), ('2.5'), ('3,6')
)
select
	tc.col1
	, case
		when is_number(tc.col1) then to_number(tc.col1)
	end as col1_num
from
	to_clean tc
;
/*
COL1|COL1_NUM|
----+--------+
1   |     1.0|
1a  |        |
2.5 |     2.5|
3,6 |        |
*/

View solution in original post

andreeroos
SQL-Fighter

IS_NUMBER works fine. I'm looking at a way of solving this in a way that assumes that the user (in Tableau) has no knowledge of Exasol or SQL at all so I'm trying to mimic default Tableau behaviour (using Tableau's own database Hyper) as much as possible. In Tableau failed casts are automatically converted to NULL values.

The best solution seems to be that I use the suggested function in a passthrough function in Tableau and apply it as a filter:

RAWSQL_BOOL("is_number(%1)",[X])

 

andreeroos
SQL-Fighter

There should perhaps be an option in the ODBC driver having this as a default behaviour?

exa-Chris
Community Manager
Community Manager
Perfect for an IDEA in the Ideation Area 😉
Connecting Customers, Partners, Prospects and Exasolians is my passion. Apart from that I cycle, listen to music, and try to understand what all those technical discussions really mean...