parallel query performance issue

ashutoshe2open
Contributor

Hi ,

I am using EXASolution-7.0.10 community edition(  EXASOL-Single-Node-7.0.10). When I run single query it perform very well but when I run query parallelly in multiple sessions ,Performance degraded drastically. can you guide any setting or doc that help me improve parallel query performance.

I am doing POC to compare timing between MSSQL and exasol . If exasol perform better then I want to move from MSSQL server to exasol. 

9 REPLIES 9

mwellbro
Xpert

Hi @ashutoshe2open  ,

just a general thing: a single node instance will handle concurrency not as well as a cluster, maybe a 2 node setup would be more appropriate for your POC ?

Other than that: the runtime behavior under Exasol will be different from a classical RDBMS when it comes to concurrency, what you describe as "drastic degradation" in performance might be attributed to the resource of the system hitting a limit.

On an ORACLE or MSSQL you can throw more queries in parallel against the system while maintaining runtimes and also not fully utilizing the system - on Exasol a single query will utilize your entire system, if more queries run ( given same resource_group configuration ) they´ll divide resources equally (= hence each execution will have less "power" for itself ).

I would advise to activate auditing on your DB and checking the RESOURCE column in exa_dba_audit_sql, you should see
reduced runtime correlated with reduced resource value.
Could you share the specs for the MSSQL server and your Exa-Community box ? Perhaps along with the timings you observed ?

Cheers,
Malte

ashutoshe2open
Contributor

Thanks I will enable Auditing and check. Also can we add multiple node on community ? if yes can we provide any link  how to do that.

Below are the config and timing details:

Database Server Configuration

##ComponentExasolSQL Server
1Cores2828
2Memory128 GB128 GB
3Operating SystemCentos LinuxWindows Server 2019 Standard 64-bit
4Database EditionExasol 7 (community )SQL Server 2019 Enterprise

 

Report Performance Timings(Running 3 parallel sessions)

 


Report NameSQL Server TimingExasol Timing

Test Report 2

8 minute 39 second8 minute 27 second
Test Report 28 minute 49 second8 minute 38 second
Test Report 28 minute 51 second8 minute 34 second
Test Report 310 minute 6 second18 minute 12 second
Test Report 310 minute 23 second17 minute 49 second
Test Report 310 minute 28 second18 minute 14 second

 

Report Performance Timings(Running 6 parallel sessions)

 


Report NameSQL Server TimingExasol Timing

Test Report 2

10 minute 13 second23 minute 29 second
Test Report 210 minute 15 second23 minute 56 second
Test Report 210 minute 17 second23 minute 50 second
Test Report 316 minute 18 second94 minute 3 second
Test Report 316 minute 69 second94 minute 10 second
Test Report 316 minute 25 second94 minute 4 second

 

Thanks,

Ashutosh

mwellbro
Xpert

I think the community edition shares it´s "single node nature" with the single node cloud variety so I´d assume you can´t scale a community version ( @exa-MathiasHo : could you get an exa-* to confirm or deny ? 🙂 ) :

mwellbro_0-1632743658986.png


Considering the 28 cores in your setup I´m almost more surprised by the runtimes "per se" than the scale factor - could you disclose how big of an instance we are talking about in terms of GB ( Storage and DBRAM ) ?

Cheers,
Malte

exa-Kristof
Team Exasol
Team Exasol

Hi @ashutoshe2open ,

some questions in addition to Malte's:

Regarding Malte's question: no, the community edition cannot be tested in a multi-node setup. Signing up for our 30-day-free-test-trial would be the easiest option: https://www.exasol.com/test-it-now/cloud/

Best regards,

Kristof

Note: corrected some information about the community edition + multi-node setup which was wrong before.

ashutoshe2open
Contributor

Thanks Kristof, I will try these. I cant share the sql but I am using sql queries generated by MSTR. So if you have any recommendation on MSTR setting (VLDB SETTING)  . Please let me know, I will try with those.

Below is the size details. Our actual data in in TBs. but due to community edition 200GB limitation I am testing with this small data set. (both sql and exasol).

I cannot try cloud because of company policy.

 

ashutoshe2open_1-1632747992200.png

 

ashutoshe2open
Contributor

Hi Malte,

Below is the size details. Our actual data in in TBs. but due to community edition 200GB limitation I am testing with this small data set. (both sql and exasol).

ashutoshe2open_2-1632748571195.png

 

exa-Nico
Community Manager
Community Manager

In addition to what Kristof said:

Exasol will create all necessary indexes on the first run of a query. So on the second run of the query (assuming the first query committed), it can use the index and therefore have much better performance than the first run of the query. I'm not sure if that's what happened here, but the profile would tell us 😁

 

Sports Enthusiast. Database Guy. Member of Team Exasol.
Having trouble? Just let me know!

Please Give Kudos if you find this post useful and also mark this as Accepted Solution if this is the required solution

exa-Kristof
Team Exasol
Team Exasol

Hi @ashutoshe2open ,

yes, we have MSTR best practices/setup instructions. Easiest is to follow them in our docs: https://docs.exasol.com/connect_exasol/bi_tools/microstrategy/microstrategy.htm

This article will also lead to a MicroStrategy website with some ressources: https://community.microstrategy.com/s/article/Exasol-6-x?language=en_US

Regarding your screenshot: If this shows the entire database, then all operations should be in-memory and thus very fast with the amount of available RAM. So please check it in ExaOperation (as described above) if the RAM is actually allocated to the database.

Those runtimes seem very high to me considering your setup. But then again it's hard to judge without profiles/sqls. Any chance to provide them via PM?

Best regards,

Kristof

mwellbro
Xpert

Hi Ashutosh,

taking in these sizes and keeping in mind your 128 GB system mem I find these timings all the more puzzling.
Like Kristof suggested, check the allocated DBRAM:

select CURRENT_TIMESTAMP as checked_on,DB_RAM_SIZE from exa_system_events order by measure_time desc limit 1;

9.6 GB schema size should fit nicely into your 128 GB and leave ample space for TEMP-operations...

As you´ve mentioned MicroStrategy:

1) are your test reports "single pass selects" or more on the side of multiple passes with temp table creation and drops ?
2) do you by any chance make heavy use of "custom groups" in MicroStrategy ? Those can rack up quite a few passes just for themselves, driving execution time.
3) do you use dynamic sourcing and combine different data sources in your report ? This can lead to
interesting "row-by-row" loads for temporary objects
4) Are there transaction services involved in your test set ? Just thinking that with Exasols table level
lock paradigm you could end up heavily serializing executions when your implementation contains
DML against the same objects in parallel from different sessions...

Would love to hear what you uncover, also what your exa_dba_audit_sql has to say about your queries ( as you can´t share the SQLs, perhaps just a part of the columns there could be shared here ? ).

-- and maybe a column to tell us which statement belonged to which test report
select session_id,stmt_id,command_name,start_time,stop_time,duration,cpu,hdd_read,net,row_count,resources from exa_dba_audit_sql where start_time>=CURRENT_DATE order by start_time;

 
Hope this helps in any way.

Cheers,
Malte