Tech Blog

New Join Order Optimizer in Exasol

Starting with 7.0, Exasol offers a new Join Order optimizer. This short article provides details about the new join order optimizer and answers some common questions.

Background

Determining the best join order is a complex task and an ongoing research topic. Parts of the problem are NP-Hard (see https://doi.org/10.1145/1270.1498). In general, finding the best join order consists of three parts, namely Cardinality Estimation, Cost Model, and Plan Enumeration. Cardinality Estimation creates assumptions about database properties like data distribution in columns. The Cost Model uses these assumptions and a fixed execution plan to approximate the execution time of a join order. The Plan Enumeration searches for the execution plan with minimal costs.

For more details, the following links provide a good starting point:

New Optimizer in Exasol

We built a new plan enumeration algorithm from scratch. It is faster as the previous one, inspects many more join orders, and is easier to extend in the future. It also has cleaner interfaces to the cost model and the cardinality estimation. As far as possible we kept the current behavior (Cardinality Estimation, Cost Model, join method selection, replication handling). However, it was necessary to make some slight changes due to the new interfaces.

The new join order algorithm is deactivated by default in 7.0.x and 7.1.x. It can be enabled per session:

CONTROL SET JOIN OPTIMIZER EXPERIMENTAL;

Alternatively, it can be enabled database-wide using the command-line parameter -joinOrderMethod=1. In that case individual sessions can switch back to the old default optimizer:

CONTROL SET JOIN OPTIMIZER DEFAULT;

To clear the session setting and rely on database-wide setting, use:

CONTROL CLEAR JOIN OPTIMIZER;

Changing the session setting doesn’t affect queries already stored in the query cache!

The new optimizer is called preview and EXPERIMENTAL. Is it production ready?

This is … complicated 😉.
Let’s look at what could possibly go wrong:

  1. Long compile time of the query due to the new optimizer.
    Please create a support ticket for these cases. We will try to find a workaround and we also handle this like any other bug.
    Of course, switching to the old optimizer is also valid. If such a problem occurs, there is just one difference from other bugs:
    As we in R&D want you to use the new optimizer, we will fix them as fast as possible 😉.
  2. Non-optimal join queue with high execution time due to a bug in the new optimizer.
    Everything in issue 1 applies here. Create a Support ticket in the usual way.
  3. Non-optimal join queue with high execution time due to a valid decision in the new optimizer.
    This is the complicated case. We still want you to create a support ticket with all the necessary information. We want to collect all these cases to determine the best way to extend our optimizer in 8.0. However, we will only address these issues in Exasol 8.0.x. In older versions like 7.0.x and 7.1.x the only workaround will be to switch to the old optimizer.

At the time of writing this article, there are no known bugs that would prevent the use of the new optimizer in a production environment with Exasol 7.0.10 and above.

When do we have to switch to the new optimizer?

The new optimizer will be the default configuration in 8.0. It will still be possible to use the old optimizer in 8.0 and we will also provide support for it. We have not yet determined the end of support for the old optimizer. We will communicate this in advance with sufficient time to prepare.

But why did you call the new optimizer EXPERIMENTAL?

This happens if you allow R&D to choose the keyword and not marketing. But we will improve that, and we will switch to a better name in 8.0. Probably we will choose something like WORLDS_SHINIEST_OPTIMIZER 😁.

So, we in R&D want to encourage you to use the new optimizer and we are looking forward to your feedback!

exa-GeorgD