Materialized view to store UDF results

schiegg
Contributor

As recommended by @exa-Chris I start the topic here as my question could get lost in the Ideas section.

I am currently computing multiple input tuples through a Java UDF SET script EMITting multiple tuples as well. I wrapped the SELECT in a VIEW specifying some conditions. It is stated elsewhere that Exasol caches the view results intelligent as it is not possible to create materialized views. My UDF needs round about 12 seconds to return results for ~30000 input tuples without optimization of the JVM or anything else. Reducing this for future responses by using caches, which should be sufficient as the outcome does not change as long as the source stays identical, would therefore significantly improve my access times.

Use-case is a follow up of a theoretical research paper proposing this. Till now, my measured access times stay nearly identical over multiple requests. Is there any way to trigger the caching or any form of materialization? I already implemtented an ELT workflow but the paper explicitly requests a materialized VIEW also.

6 REPLIES 6

exa-Uwe
Moderator
Moderator

A "materialized view" is essentially just a table. As if you do a CREATE TABLE myMV AS SELECT ... here comes your view query.

Of course you can do that with Exasol.

In Oracle, you can do some additional sophisticated things with Materialized Views, like enabling the optimizer to use them for query rewrite: https://uhesse.com/2009/07/08/brief-introduction-into-materialized-views/

 

Best regards

Uwe

schiegg
Contributor

Hi Uwe, I am aware of that. Thats why I explicitly mentioned that I already implemented it that way but would be interested in a way of doing it with the VIEW argument and either some intelligent caching or refresh options. If it isn't possible, thats just fine as well. I am also interested if I missed anything.

exa-Uwe
Moderator
Moderator

Apart from creating a table from it, there's no way to materialize a view in Exasol. The query constituting the view will run each time you access that view. It may be subject to optimizing and caching just as any other query running against Exasol. There's nothing to "refresh" here, because the query simply runs again.

The whole concept of Materialized Views, refresh and query rewrite etc.  is very Oracle-specific and has no counterpart in Exasol. We may be running just fast enough without it 😉

 

mwellbro
Xpert

Hi @exa-Uwe , you piqued my interested there when mentioning "subject to optimizing and caching " 😊
Just a supposition: a query / view containing a UDF would not be eligible for entering the query cache since exasol wouldn´t be able to guarantee that the output is repeatable ( because it wouldn´t know and/or couldn´t parse what is going on within the UDF from a logical point of view....could be return rand() ) - is that correct ?
I do seem to recall that some DB implemented a DETERMINISTC keyword for in UDF creation - might be interesting, perhaps...
Of course, and I´m mentioning this not entirely serious, one could write a preprocessor-script that would transparently CTAS the result and check the LAST_COMMIT of the base tables - if not changed it could direct the query towards the previously "CTAS´ed" table, which could be eligible for QueryCache execution...still would be over the top for a 12 second query...just sayin´ 🤔🙂

exa-Chris
Community Manager
Community Manager

Thanks for this. Much better place. Would you also have some information about the system you are using? Single Node or a cluster?

Connecting Customers, Partners, Prospects and Exasolians is my passion. Apart from that I cycle, listen to music, and try to understand what all those technical discussions really mean...

schiegg
Contributor

Single node docker container with minimal ressource requirements right now as this is just the beginning of research.

If the caching depends on the hardware, what would be recommended to add? Can I somehow read a log of restrictions applied due to limited hardware?