Temporarily Switch off "Optimizer Statistics Gathering"

Contributor

Question: Is it possible to switch of "Optimizer Statistics Gathering" temporarily in a session?

Background of Question:

We finished a complete "continuous syncronisation" of our Oracle database with > 1000 tables based on autoamtically created triggers (insert,update,delete) and automatic use of primary keys (some are multifield primary keys). To synchronize data from Oracle to Exasol we currently use Database Link via ODBC and Oracle stored procedure. It works fine now and stable.

Initial migration (and therfore our desaster recovery), before we start continous synchronization, is done by Exasol Import-Script: https://github.com/exasol/database-migration/blob/master/oracle_to_exasol.sql

(By the way, is there a special Oracle Database Link available from Exasol (so that we do not need to use ODBC, see https://docs.oracle.com/database/121/OTGIW/configodbc.htm#OTGIW1041 )

For this purpose it is necessary to run DML-scripts against Exasol (inserts and deletes via database link). 

Everything works fine now, but we found out that the execution times of a single SQL-Insert into Exasol takes approx. 5 to 10 times longer than using any other target database.

We think, that this is due to the idea of Exasol, beeing fast in select, and accepting slowness in DML jobs. In general it is ok.

But we see, that with Exasol "Import ..."-Statements, thousands of DML-tasks can be done very fast. And we guess, that the "Optimizer Statistics Gathering" is switched off at the beginning of "import ..." and switched on at the end of "Import ...".

So, what we would like to do to speed up this synchronization is to switch of "Optimizer Statistics Gathering" before running a batch of DML-jobs (mixed deletes and inserts) and switch it on afterwards (or let run "Optimizer Statistics Gathering" afterwards by any command.

Due to this fact, we currently are able to handle about 20 DML statements per second, that might be enough for our change volume, but we did expect higher performance.

Snyhronisation procedure is currently started every 2 seconds. That means, Exaslo is synchron to Oracel with just a very little delay.

 

Thanks for your feedback (especially to @exa-SebastianB, because this is a follow up request to previous communications on continous sync). 

11 REPLIES 11

SQL-Fighter

Hi MaMerker,

I'd support mwellbro's advice zu check the profile. Additionally you could check if you have utilized all mentioned best practices: https://docs.exasol.com/connect_exasol/drivers/odbc/using_odbc.htm?Highlight=prepared%20statement#Be...

Especially the prepared statements are important in this context.

Best regards,

Kristof

Contributor

Dear @kristof & @mwellbro ,

thanks for your comments and hints, I will check profiler (have never used it before) and also check ODBC advices.

Best regards

Martin

Contributor

Result Profiler:

Statement 1: insert into (1 Datensatz) with auto commit

Statement 2: import into ... from ORA at OCI_ORACLE statement select ... with auto commit

 

PART_IDPART_NAMEOBJECT_NAMEOBJECT_ROWSOUT_ROWSDURATIONCPUTEMP_DB_RAM_PEAKHDD_READHDD_WRITE
1COMPILE / EXECUTE(null)(null)(null)0.02843.811.40.0(null)
2INSERTM_PRODUCT6610.01333.311.30.0(null)
1COMMIT(null)(null)(null)0.2272.511.00.00.2
1COMPILE / EXECUTE(null)(null)(null)0.06923.815.00.0(null)
2IMPORTM_PRODUCT67620.60311.138.00.0(null)
1COMMIT(null)(null)(null)0.2452.614.80.00.3

 

That means, each insert incl. commit takes about 0.2 seconds, means max 5 records per second.

So I will try to switch off auto commit in ODBC and do it at the end manually. I will write down my result here.

Team Exasol
Team Exasol

You probably know this, but just to make sure: you can chain rows in a single INSERT (see https://docs.exasol.com/sql/insert.htm). That is always preferable over single row inserts.

Example:

INSERT INTO t VALUES (2, 1.56, 'ghi'), (3, 5.92, 'pqr');  

 

Contributor

Dear Sebastian,

thanks for your hint. This might be a further option for optimzation.

But currently I´m focusing on the ODBC autocommit and I tried to switch it of.

but I didn´t succeed.

As already mentioned, we are using Oracle Gateway for ODBC and Exasol ODBC-driver.

We tried to set autocommit off by modifying corresponding InitODBC.ORA:

# This is a sample agent init file that contains the HS parameters that are
# needed for the Database Gateway for ODBC
# HS init parameters
HS_FDS_CONNECT_INFO = PLSYNCDB
HS_FDS_TRACE_LEVEL = ON            
HS_TRANSACTION_MODEL = COMMIT_CONFIRM  # das ist der default
#HS_TRANSACTION_MODEL = SINGLE_SITE    # set by Martin Merker
#
# Environment variables required for the non-Oracle system  #hier Exasol
#set <envvar>=<value>
set AUTOCOMMIT=N

 

But without effect. We are sure that this file is taken from Oracle, because wrong parameter in Oracle-Part lead to mistakes when creating the database link.

Do you have any idea how to switch off autocommit in such environment?

 

Further question: How can I profile such a session. I do not have a possibilty to run command like "ALTER SYSTEM|SESSION SET PROFILE=’ON’;"

 

Thx.

Team Exasol
Team Exasol

The user guide says this about index generation:


    Exasol creates and maintains required indexes automatically. It is not recommended to interfere in this.

 

Now to the question of whether manual committing can improve performance. The commit overhead on single row inserts is huge -- hence the hint to accumulate them in one insert programmatically. You could check EXA_SQL_LAST_DAY and compare the situation with auto-commit for every single row insert and manually commiting after a 1000 or so inserts. The later case should be faster. How much you will have to measure in your concrete case.

I am not an ODBC expert, but the documentation says you can switch auto-commit off via connection string.

All in all its always better to improve the algorithms you use for data migration than to try tweaking database parameters.

SQL-Fighter

Is this a dev environment? If so you could enable profiling for the whole system to see all queries in exa_dba_profile_last_day.

ALTER SYSTEM SET PROFILE = 'ON'

If this is a production environment I'd be a bit more careful and have a closer look if enabling impacts the performance. From my experience it should be fine though, but it always depends on what's actually going on in your database.

Of course there's always the possibility to enable it for a short duration and then disable it again as soon as the queries are finished.

Contributor

@exa-SebastianB see request before, I forgot to link you with @.

Contributor

@exa-SebastianB 

I found this in your documentation:

Autocommit-Modus

Bitte beachten Sie, dass der Autocommit-Modus unter Windows am besten
nur mittels SQLSetConnectAttr() ausgeschaltet werden sollte und nicht in
den Einstellungen der Datenquelle. Der Windows Treibermanager bekommt
diese Änderung sonst nicht mit, geht davon aus, dass Autocommit eingeschaltet ist, und leitet SQLEndTran() Aufrufe nicht an die Datenbank weiter.
Dieses Verhalten könnte zu Problemen führen.

But I´m not able to apply this in my environment with database link.

Xpert

Hi MaMerker,

I´m not quite sure what you try to gain from disabling Autocommit ? If I understood your architecture correctly ( from the script a few topics back ) then you´re going for different pquerys for every insert, so you´d necessarily incur a commit at the end of each individual statement - or did I get this wrong ?

The second thing I don´t get is how the settings at ora should impact you at all, since the commit you profiled is purely exa.
Could you provide some kind of snipped that could give us a closer look on what is happening ?

Also: "I do not have a possibilty to run command like "ALTER SYSTEM|SESSION SET PROFILE=’ON’;"" => how did you get your previous profiles ? Different DB-Instance ?

Xpert

Hi MaMerker,

just so I get this correctly: " (...) And we guess, that the "Optimizer Statistics Gathering" is switched off at the beginning of "import ..." and switched on at the end of "Import ...". (...) "
=> were you able to verify that by means of a Profile ? I´m just thinking if it´s actually the case that you are loosing performance to the gather-stats step than we should be able to see it there.
And to address your actual question: I don´t know of a way to switch off the Optimizer Statistics Gathering - but I´m interested to see what EXA has to say on this one 🙂

Best regards,
Malte