Fully managed cloud. 30-day FREE full featured trial. Start Now
cancel
Showing results for 
Search instead for 
Did you mean: 

How to find concurrent queries thar are listed in EXA_USAGE_HOURLY

SusanneK
SQL-Fighter

Hey everyone,

I have a question about the statistics in EXA_USAGE_HOURLY. We noticed a huge increase of QUERIES_MAX. This is why we tried to find the queries that are causing this increase. We have enabled the SQL Audit, so we tried to search for the concurrent queries there. Unfortunatelly the data that we can reproduce with the help of the table EXA_DBA_AUDIT_SQL does not fit the data we are seeing in EXA_USAGE_HOURLY. Sometime we see more concurrecnt queries and sometimes we see less concurrent queries. Does anyone have an idea what we are missing? (Warning: the following query takes several minutes in our setup)

with concurrent_queries_per_starting_qry AS
(
select s1.START_TIME,
date_trunc('hour', s1.START_TIME) as start_hour,
s1.STMT_ID,
count(1) as cnt_concurrent_queries_at_start_time

from EXA_STATISTICS.EXA_DBA_AUDIT_SQL s1

left join EXA_STATISTICS.EXA_DBA_AUDIT_SQL s2
on s1.START_TIME between s2.START_TIME AND s2.STOP_TIME

where s1.START_TIME like '2022-01-06 14%'
group by 1, 2, 3
)

select
start_hour,
h.QUERIES_MAX as queries_max_from_exa_usage_hourly,
max(c.cnt_concurrent_queries_at_start_time) as queries_max_self_computed

from concurrent_queries_per_starting_qry c

left join EXA_STATISTICS.EXA_USAGE_HOURLY h
on h.INTERVAL_START = c.start_hour
group by 1,2
;

We do not understand what the metric USERS_MAX should tell us. The EXA_USAGE_HOURLY says that we have most of the time 100-200 users, but we have only 60 users registered at all in the whole Exasol DB. What could be counted for the statistics table?

We appreciate any help.

Thanks and kind regards

Susanne

13 REPLIES 13

mwellbro
Xpert

Hi @SusanneK ,

for the 2nd question I´d say we could look into the comments on the statistics table:

mwellbro_0-1643923259474.png

I think this would mean that some of your 60 users do hold more than a single connection during the 1-hour interval ( I haven´t tested this assumption, it´s just a guess at this point ).

For the first question things might get a bit more complicated - there are two things I´d take issue with concerning the structure of your check query:
1. using count(1) will probably get you in trouble if you have sessions whose stmt´s occupy the same timeslot, i.e. stmt_id´s 1 and 2 overlap ( can happen in the case of EXECUTE SCRIPT commands ) - we´d probably get around this if we say count(distinct session_id) to only count concurrent active distinct sessions ( which is what I believe the stats table is doing, but again, just guessing ).
2. taking exa_dba_audit_sql as a starting point and issuing "(...)s1.START_TIME like '2022-01-06 14%'(...)" will ignore queries that started before 14:00 but are still running within the interval ( which would qualify them to be counted towards concurrency in the interval ).

Maybe the following bit of code could be useful in these circumstances:

 

 

ALTER SESSION SET TIMESTAMP_ARITHMETIC_BEHAVIOR = 'DOUBLE' ;

with dt as (
select TO_TIMESTAMP('2022-01-06 14:00:00.000000','YYYY-MM-DD HH24:MI:SS.FF6') + (level/24/60/60/10) idx from dual connect by level<=(36000-1)
)
, base as (
select idx,count(distinct session_id) cnt from dt a 
left outer join (select * from exa_dba_audit_sql where start_time>=CURRENT_DATE - 30 ) b on a.idx between b.start_time and b.stop_time
group by 1
)
select trunc(idx,'HH'),max(cnt),(select queries_max from exa_usage_hourly b where trunc(idx,'HH') = b.interval_start) as concurrency_as_per_usage_hourly  from base a 
group by 1
order by 1;

 

 


dt will create an "artificial timeline" with the granularity of "fraction of a second" to cut into the exa_dba_audit_sql so we get every session_id that is executing in any given timeslot (fractions might seem over-the-top but I actually needed this because seconds were not precise enough - at least on the environment I tested it) - counting those in a distinct fashion should give us the value the stats table is recording.

As a little bonus, I´d expect this to run a bit faster than the original query since we only visit the exa_dba_audit_sql once and
we issue a type-correct (well, almost ) filter ( start_time like .... will probably end up in a full scan or some weird expression index we don´t want 😉  ).

That being said: in this structure it won´t scale too great so it´s probably ok to look at a single hour but if you´d want to look at longer intervals you should get rid of the correlated sub-query I haphazardly threw in there and keep in mind that "connect by level" to generate rows isn´t a great idea either.

Cheers,
Malte

SusanneK
SQL-Fighter

Hi @mwellbro, thanks a lot for your reply. With your query I get the same results. In our case this is 6 concurrent queries self computed and 14 concurrent queries found in EXA_USAGE_HOURLY. Which means, I am not able to find most of the 14 concurrent queries in the EXA_DBA_AUDIT_SQL. Any idea, what kind of queries that could be? Are there any queries not logged in EXA_DBA_AUDIT_SQL ? How can find out what is causing this?

mwellbro
Xpert

Hmmm, I´m not aware of any kind of query that´s not part of the audit tables...thinking on it...

mwellbro
Xpert

Might I ask for the result of the following query ?

export (select session_id,stmt_id,start_time,stop_time from exa_dba_audit_sql
where start_time between TO_TIMESTAMP('2022-01-06 00:00:00.000000','YYYY-MM-DD HH24:MI:SS.FF6') and TO_TIMESTAMP('2022-01-06 23:59:59.000000','YYYY-MM-DD HH24:MI:SS.FF6')
order by start_time  )
into local csv file 'c:\temp\some_audit_data.gz'
;


Should be anonymous enough to be shared, I hope ?

SusanneK
SQL-Fighter

Hi @mwellbro Thanks a lot for looking into this. Unfortunatelly this forum does not like gz. I chose .zip, so you will have to unzip it before importing.

mwellbro
Xpert

I got the data loaded, thanks - I ´d still like to stand with the old saying "audit or it didn´t happen", so I´d hope that the 14 from exa_monitor_hourly are somehow incorrect/skewed...not sure yet how to verify that, however...
While I keep on thinking about how we might get a look at this from the outside, maybe someone from the inside could chime in ?
Maybe,  @exa-Chris , since you already browsed through this thread, you could nudge someone ?

exa-Gabi
Team Exasol
Team Exasol

There is simply not enough information here to make an informed judgement whether there is a bug or not in the system. The best way to move it forward would be to create an issue in our ticketing system and we will ask for more information there. Barring that, if you are comfortable with sharing system data here, do let us know and we'll ask for the additional information here.

EXA-Dude, database developer, history buff. Member of Team Exasol.

mwellbro
Xpert

@SusanneK : if you do pursue the path of opening a ticket, could you share the result of said ticket here ? Really curious on this one 🙂

@exa-Gabi : setting the question of "bug or no bug" aside for the moment: it´s still valid to expect that all queries that have been taken into account in the exa_usage_hourly can be traced in the exa_dba_audit_sql , right ?

exa-Gabi
Team Exasol
Team Exasol

@mwellbro : Yes, every single query that runs in the system is taken into account into both EXA_USAGE_* and EXA_DBA_AUDIT_SQL system tables.

EXA-Dude, database developer, history buff. Member of Team Exasol.