Unexpected behavior with window functions

helli
Padawan

In our DWH we added for every Code-Field  (ending in _c) the corresponding Short and Long-Description (ending in kbez_x, lbez_x) as separate fields. In the example below e.g inkasso_typ_c, inkasso_typ_kbez_x, inkasso_typ_lbez_x. Unfortunatly there are single description fields also ending in _kbez_x or _lbez_x. I want to query the DB-catalog to find all these single desc fields. So in a "code constellation" i have 3 fields with the same base name eg. inkasso_typ (_c/ kbez_x/ lbez_x). In a "none code constellation" i have < 3 fields with the same base name but also ending in _kbez_x/ lbez_x. (eg. pers_grp_kbez_x)

WITH mytable AS (
SELECT 'd_pers_grp' AS column_table, 'pers_grp_sk' AS column_name UNION ALL
SELECT 'd_pers_grp', 'log_beg_d' UNION ALL
SELECT 'd_pers_grp', 'log_end_d' UNION ALL
SELECT 'd_pers_grp', 'pers_grp_typ_c' UNION ALL  -- first code field group
SELECT 'd_pers_grp', 'pers_grp_typ_kbez_x' UNION ALL -- first code field group
SELECT 'd_pers_grp', 'pers_grp_typ_lbez_x' UNION ALL  -- first code field group
SELECT 'd_pers_grp', 'pers_grp_beschr_x' UNION ALL
SELECT 'd_pers_grp', 'inkasso_typ_c' UNION ALL  --second code field group
SELECT 'd_pers_grp', 'inkasso_typ_kbez_x' UNION ALL  --second code field group
SELECT 'd_pers_grp', 'inkasso_typ_lbez_x' UNION ALL  --second code field group
SELECT 'd_pers_grp', 'pers_grp_kbez_x' UNION ALL   -- standalone desc field i want to find
SELECT 'd_pers_grp', 'pers_grp_lbez_x' UNION ALL   -- standalone desc field i want to find
SELECT 'd_pers_grp', 'pers_grp_tech_id'
)
SELECT * FROM
(
SELECT column_table, lower(column_name) AS column_name
, regexp_replace(regexp_replace(regexp_replace(lower(column_name), '_kbez_x$', ''), '_lbez_x$', ''), '_c$', '') AS base_name
, count(*) OVER (PARTITION BY column_table, LOCAL.base_name) AS cnt
FROM mytable
) subsel
--WHERE (column_name LIKE '%kbez_x' OR column_name LIKE '%lbez_x' OR column_name LIKE '%_c')
WHERE (column_name LIKE '%kbez_x' OR column_name LIKE '%lbez_x' )


I have a subselect to determine the basename and the count(*) over the basename. In the outer SQL i want to filter on _kbez_x oder lbez_x fields with a count(*) < 3. The SQL does not behave in a way i expect (and know from other DB's). As far as i know the WINDOW Functions ( my count(*) over (partition ...) is applied after the WHERE. So i put the count into a subselect and afterwards use the value in the filter of the outer SQL. In my example the WHERE of the outer SQL seems to affect the window-function of the inner sql. You can see it if you comment one of the 2 given WHERE clauses. 

Did I misunderstand? Is it a feature?

 

 

1 ACCEPTED SOLUTION

exa-GeorgD
Team Exasol
Team Exasol

Hi @helli,

your expectations are correct. This is a bug in Exasol.
The database must not push down the filter into the subquery with the analytic function.
Thank you for reporting the issue. I already created an internal bug ticket. R&D will fix it in one of the upcoming releases.
If you want to be informed of the progress and the fixed version please open a support ticket.

@mwellbro : Thank you for answering so fast. Using group by 1,2,3 looks like a feasible workaround for me, but this depends on @helli at the end 😉.

Best wishes
Georg

View solution in original post

3 REPLIES 3

exa-GeorgD
Team Exasol
Team Exasol

Hi @helli,

your expectations are correct. This is a bug in Exasol.
The database must not push down the filter into the subquery with the analytic function.
Thank you for reporting the issue. I already created an internal bug ticket. R&D will fix it in one of the upcoming releases.
If you want to be informed of the progress and the fixed version please open a support ticket.

@mwellbro : Thank you for answering so fast. Using group by 1,2,3 looks like a feasible workaround for me, but this depends on @helli at the end 😉.

Best wishes
Georg

View solution in original post

mwellbro
Xpert

Hi @helli , 

interesting case - would also have assumed that the window function once encapsulated into a subselect would take precedence...

I think there is actually a filter being pushed where it might not should be pushed to - if you put a "group by 1,2,3" into
the subselect I think you´ll see the outcome you are expecting ?

Interestingly enough my "weapon of choice" ( order by false ) did not produce a result different from your initial post, the actual group by was necessary...but I might have the precedence confused here - maybe someone at EXA could enlighten us ?

Cheers,
Malte

helli
Padawan

Hi @mwellbro 

thanks for your post and fast answer! The GROUP BY does the trick and I get what i want. We agree that it shouldn't be necessary. Looks like an optimization artefact.

@exa-GeorgD Thanks for opening a Ticket and fixing this. For me its not critical and no need to open a support ticket. Still interested your progress so I discuss with my management.

Kind regards

Daniel