CAST ignored in virtual schema

SebastiaanP
Contributor

I'm running two select statements on virtual schema's. Since they're joined in a union and don't all have the same columns, I need a column with '' or NULL values. This results in an error: SQL Error [0A000]: Feature not supported: datatypes are not compatible for Union (Session: 1715323572262273024)

The '' or NULL column appears to be a boolean.

Changing this to the desired data type using: CAST( '' AS VARCHAR(18)) results in exactly the same error.

Running the statement separately shows that the CAST is ignored. It appears to happen when querying on virtual schema's. Doing the same on regular schema's or without referring to objects does not result in the CAST being ignored.

Is there a way to make my join work without creating additional objects?

1 ACCEPTED SOLUTION

Accepted Solutions

exa-GeorgD
Team Exasol
Team Exasol

Hi @SebastiaanP,
thanks for raising this question in the community 👍.
I could reproduce the error with an Exasol virtual schema dialect. For this dialect I found the following two potential workarounds:
1.

 

ALTER VIRTUAL SCHEMA VS_20_HDA_WINLINE_CCVBI SET EXCLUDED_CAPABILITIES = 'LITERAL_NULL';

 


2.

 

 

select 
	C_GLACCOUNTNO 		as General_ledger_account_code
	, C_GLACCOUNTNAME 	as General_ledger_account
	, (SELECT CAST('' AS VARCHAR(18)))
	, 'Winline' 		as Source_application
	,1 AS TEST1
	,CAST(1 AS VARCHAR(10)) CASTTEST1
	,'1' AS TEST2
	,CAST('1' AS VARCHAR(10)) CASTTEST1
	,11 AS TEST3
	,CAST(11 AS VARCHAR(10)) CASTTEST3
	,TRUE AS TEST4
	,CAST(TRUE AS VARCHAR(10)) CASTTEST4
from
	VS_20_HDA_WINLINE_CCVBI.V_CWLGLACCOUNT_ACTUAL

 


Solving this issue is not trivial. As you can see in the pushdown results of EXPLAIN VIRTUAL, the Exasol compiler removes the cast in the query. This is an optimization that happens quite early and before the handling of virtual schemas. For queries without virtual schemas this reduces the execution time and is no issue. In the virtual schema scenario, the pushdown contains a null_literal without datatype. The remote database of the virtual schema then returns a NULL with an arbitrary JDBC Type. This can vary based on the dialect. For the Exasol virtual schema dialect this is Boolean. In theory, the Exasol compiler can reinsert the cast to the desired type. But while this is possible in your scenario, it is not possible for complex expressions. For that, the compiler needs exact types from the JDBC driver.

Thus, in a nutshell, this is https://community.exasol.com/t5/ideas/1-1-type-reporting-switch-in-jdbc-driver-migrated/idi-p/4829. Feel free to vote on the issue ;).

I hope the workarounds are sufficient for the time being until we have this feature in the JDBC driver.

Best wishes
Georg

 

View solution in original post

9 REPLIES 9

mwellbro
Xpert

Hi @SebastiaanP ,

are you sure it´s because of the column you extended to match the width in your union ?
Perhaps you could share the results of a "desc table-name" of the virtual tables that are union´ed together ?

Also, maybe take a look at : explain virtual select * from <something-virtual-goes-here>;
, this should give you the exact pushdown sql and could probably confirm if the issue is in there somewhere.

Is the virtual schema EXA, ORA or JDBC based ? And could you share the query so we could see how it´s layered together ( where is the union, are there some shenanigans around it and so forth ) ?

Sorry for all the questions, hope next time around I provide are more answers than questions 😉

Cheers,
Malte

exa-SebastianB
Team Exasol
Team Exasol

Did you check the query generated by the VS adapter using EXPLAIN VIRTUAL?
https://docs.exasol.com/sql/explain_virtual.htm

exa-SebastianB
Team Exasol
Team Exasol

Also it depends on the capabilities defined by a VS dialect, which features are supported. Which dialect are you using?

ronaldvr
SQL-Fighter

As a colleague who manages the server I can reply to this. This is what we currently run (on 7.0.13)

The query is run from one exasol server to another via jdbc

 

2021-11-03 16_46_19-BucketFS Explorer.pngA

SebastiaanP
Contributor

I'm not how to acquire the capabilities in of the VS dialect. Are these shown as the getCapabilities in the JSON? In that case, can you help me interpret them? (file attached)

Things I've found using the EXPLAIN VIRTUAL are:

The pushdownRequest selectlist just gives a literal null type on the third column. The data type is listed in the involvedTables section, but not present in the pushdownRequest. It's just a literall null type column in the pushdownRequest. As an additionel test I tried some extra casts. These are all casted as desired in the pushdownRequest as opposed to the empty column (see attachment). 

select 
	C_GLACCOUNTNO 		as General_ledger_account_code
	, C_GLACCOUNTNAME 	as General_ledger_account
	, CAST('' AS VARCHAR(18))	AS General_ledger_account_type --Sebastiaan: CAST ingevoegd ivm BOOLEAN UNION VARCHAR18, regel uit productie (S_35_PML_CCV.VD_GLACCOUNT): , ''	AS General_ledger_account_type
	, 'Winline' 		as Source_application
	,1 AS TEST1
	,CAST(1 AS VARCHAR(10)) CASTTEST1
	,'1' AS TEST2
	,CAST('1' AS VARCHAR(10)) CASTTEST1
	,11 AS TEST3
	,CAST(11 AS VARCHAR(10)) CASTTEST3
	,TRUE AS TEST4
	,CAST(TRUE AS VARCHAR(10)) CASTTEST4
from
	VS_20_HDA_WINLINE_CCVBI.V_CWLGLACCOUNT_ACTUAL

 

 

mwellbro
Xpert

Hi @SebastiaanP ,

with these details I think I can offer you at least a workaround that won´t necessitate additional objects:

Firstly, to test on my end, I tried to reproduce your problem - since I haven´t geared up a virtual schema but it´s ( at least in our current case essentially ) a wrapper for inline IMPORTs I could get the same behavior in this way:

 

SELECT * FROM (IMPORT FROM EXA AT TO_EXA STATEMENT 'select param_value,null,''test'' from exa_metadata where param_name=''databaseName''')
UNION all
SELECT * FROM (IMPORT FROM EXA AT TO_EXA STATEMENT 'select param_value,null,null from exa_metadata where param_name=''databaseName''')
;

 

mwellbro_0-1635972903158.png

Interestingly enough it works when using an empty string ( which should also equate to a null value ) in the second import, but since you´ve already established that we end up with an actual null literal from the v-schema, lets run with that.

As it is the UNION operation that´s giving you a headache here, you could force the cast in an additional inline query, like so:

 

SELECT * FROM (IMPORT FROM EXA AT TO_EXA STATEMENT 'select param_value c1,null c2,''test'' c3 from exa_metadata where param_name=''databaseName''')
UNION all
SELECT c1,c2,cast(c3 AS VARCHAR(4)) FROM (SELECT * FROM (IMPORT FROM EXA AT TO_EXA STATEMENT 'select param_value c1,null c2,null c3 from exa_metadata where param_name=''databaseName''') )
;

 

mwellbro_1-1635973098398.png

A bit cumbersome, but it works  [around the problem] ( could probably wrap that into a view, but you said no additional objects, so here we are ).

As for why this is happening in the first place I´d guess this is the spot:

https://github.com/exasol/virtual-schema-common-jdbc/blob/main/src/main/java/com/exasol/adapter/dial...

mwellbro_2-1635973801617.png


This could probably be overriden through a custom implementation of an SQLGenerationVisitor class I think...or someone at Exasol decides this behavior should be changed - then again, I might be wrong entirely so perhaps @exa-SebastianB might correct me on this one.

Cheers,
Malte



exa-SebastianB
Team Exasol
Team Exasol

@SebastiaanP , you were asking how to find out which capabilities a Virtual Schema declares. There are two options:

a) read the code (e.g. ElasticSearchSqlDialect.java#L28)
b) check the output of the remote log
Here is a link to how to setup the remote log: https://docs.exasol.com/database_concepts/virtual_schema/logging.htm
I recommend variant b) since it is more convenient, you don't have to match version numbers of code and installed adapter and you get to see a lot of other interesting parts of the inner workings.
Note that each dialect has different capabilities.

exa-GeorgD
Team Exasol
Team Exasol

Hi @SebastiaanP,
thanks for raising this question in the community 👍.
I could reproduce the error with an Exasol virtual schema dialect. For this dialect I found the following two potential workarounds:
1.

 

ALTER VIRTUAL SCHEMA VS_20_HDA_WINLINE_CCVBI SET EXCLUDED_CAPABILITIES = 'LITERAL_NULL';

 


2.

 

 

select 
	C_GLACCOUNTNO 		as General_ledger_account_code
	, C_GLACCOUNTNAME 	as General_ledger_account
	, (SELECT CAST('' AS VARCHAR(18)))
	, 'Winline' 		as Source_application
	,1 AS TEST1
	,CAST(1 AS VARCHAR(10)) CASTTEST1
	,'1' AS TEST2
	,CAST('1' AS VARCHAR(10)) CASTTEST1
	,11 AS TEST3
	,CAST(11 AS VARCHAR(10)) CASTTEST3
	,TRUE AS TEST4
	,CAST(TRUE AS VARCHAR(10)) CASTTEST4
from
	VS_20_HDA_WINLINE_CCVBI.V_CWLGLACCOUNT_ACTUAL

 


Solving this issue is not trivial. As you can see in the pushdown results of EXPLAIN VIRTUAL, the Exasol compiler removes the cast in the query. This is an optimization that happens quite early and before the handling of virtual schemas. For queries without virtual schemas this reduces the execution time and is no issue. In the virtual schema scenario, the pushdown contains a null_literal without datatype. The remote database of the virtual schema then returns a NULL with an arbitrary JDBC Type. This can vary based on the dialect. For the Exasol virtual schema dialect this is Boolean. In theory, the Exasol compiler can reinsert the cast to the desired type. But while this is possible in your scenario, it is not possible for complex expressions. For that, the compiler needs exact types from the JDBC driver.

Thus, in a nutshell, this is https://community.exasol.com/t5/ideas/1-1-type-reporting-switch-in-jdbc-driver-migrated/idi-p/4829. Feel free to vote on the issue ;).

I hope the workarounds are sufficient for the time being until we have this feature in the JDBC driver.

Best wishes
Georg

 

View solution in original post

SebastiaanP
Contributor

I think disabling this capability is a good option for now. We prefer to not have to change the query, since we're using it in a sort of dev role. Should we run into any (performance) issues, option 2 from @exa-GeorgD is a good second option to look into.

Thanks all for your help!