Use Case:
We have 2 Exasol instances. There are cases where the local instance pulls data from the remote instance via an IMPORT-FROM-EXA statement.
Sometimes the statement that’s executed on the remote instance is querying a virtual schema which generates its pushdown SQL by querying a semantic layer API.
Problem:
IMPORT-FROM-EXA always runs the statement on the remote instance as a prepared statement i.e it makes 2 remote calls: prepare + execute.
This results in 2 separate calls to the virtual schema pushdown SQL generator which results in 2 semantic layer API calls which can slow things down a bit (There is nothing in the virtual schema parameters that indicates whether it’s in the prepare phase or execution phase).
There is a related problem that IMPORT-FROM-EXA issues a ROLLBACK on the remote instance after execution. This means that any auto generated indexes are not saved.
Suggested Solution:
It would be great if an IMPORT-FROM-EXA could accept connection string parameters that control its behaviour on the remote instance in the same way that IMPORT-FROM-JDBC supports parameters.
Then I’d suggest 2 parameters: execute_only and auto_commit
Setting them would cause the remote connection to skip the prepare phase and to issue a COMMIT afterwards.
Thank you for the consideration.
Peter
The IMPORT-FROM-EXA
statement currently requires a preparation step to retrieve column metadata prior to execution. To support a single-step (execute-only) import, the Exasol loader would need to be modified to execute the statement immediately: retrieving only the result set metadata (without transferring any data), constructing the transformation pipelines, and then starting the data transfer process, all in one step. Currently, this process is logically divided into two distinct phases:
Separately, introducing a parameter like (remote_)auto_commit
would make sense. This would prevent the default rollback behavior and allow auto-created indexes (e.g., resulting from a join in the remote query) to be preserved after import.
Thanks @Ruslan_Rusinov - that makes sense.
Then perhaps instead of the single-step import I would request that you add a flag to the virtual schema pushdown request parameters (or global exa.metadata parameters) indicating whether the current request is in the prepare vs execute phase because (in our use case) there are some expensive steps we can skip if we know it’s just the prepare phase.
Let me see if I understood your setup and challenge correctly:
Database A → ExaLoader (IMPORT
) → VS on database B →Target schema on database B
And you would like the VS on B to react differently between two calls?
If my understanding is correct, we would need an SQL extension for the SQL the ExaLoader generates that indicates in the query that this query is special. I am guessing that the “prepare” queries the metadata tables of B via SQL only. An assumption that the loader team would need to confirm.
@peterkioko could you paste and anonymized and shortened VS log of the metadata request maybe?
And the ExaLoader would need to know that it is talking to a VS, not a regular schema.
Another question: the VS that you are talking about that acts as semantic layer is your custom development?