Missing support for native XOR in Conditions

profect_bso
Padawan

In our software one can easily select from tables according to the set theory. And you can additionaly restrict the subsets to different counting bases. For example one could select: "Hey give me all customers which are older than 18 and which have cars with no more km than 1000000 and contracts of type leasing." I am currently trying to support XOR connections between simple subsets. Unfortunately a native XOR support does not exist and so we rewrite the query to non-native XOR. Following query throws error: 

SELECT
    COUNT( DISTINCT "KUNDEN"."KNR")FROM
    "EXAMPLE"."KUNDEN"WHERE
	(
    	"KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "EXAMPLE"."KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' )    	AND
		"KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "EXAMPLE"."KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' )
    ) 
    OR 
    (  
    	"KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "EXAMPLE"."KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' )    	AND
    	"KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "EXAMPLE"."KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' ) 
    );

Error: SQL-Fehler [0A000]: Feature not supported: more than one EXISTS per predicate

Any suggestions? Any plans to support native XOR?

6 REPLIES 6

mwellbro
Xpert

Hi @profect_bso ,

not sure if this will help you, but if I understand "your software" correctly then the idea here is to build "blocks of sets" to use as filters - maybe this would be an alternative ?

 

with base(knr) as (
-- this would be "put together like your previous OR constructs
-- just add the branches of independent OR´s with union
SELECT
DISTINCT "KUNDEN"."KNR" FROM
"KUNDEN" WHERE
( "KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' ) AND "KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' ) )
union
SELECT
DISTINCT "KUNDEN"."KNR" FROM
"KUNDEN" WHERE
("KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' ) AND "KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' ) )
)
-- count the resulting set
select count(distinct knr)
from base
;

 

@exa-Aleksandr : was able to produce the error with your "toy data set" once I added " insert into kunden select 8+level,'aktiv','Privatperson','Ja' from dual connect by level<=500000; " ( yes, I know there is a cool new way to generate rows - it was late and I was lazy 😉 )

mwellbro
Xpert

Hmmm, interesting:

 

create table KUNDEN(
KNR int
, KUNDENSTATUS varchar(100)
, PERSONENART varchar(100)
, HAUSHALTSFUEHRER varchar(100)
);

insert into KUNDEN values
(1, 'aktiv', 'Privatperson', 'Ja')
, (2, 'aktiv', 'Privatperson', 'Nein')
, (3, 'aktiv', 'Firma', 'Ja')
, (4, 'aktiv', 'Firma', 'Nein')
, (5, 'expired', 'Privatperson', 'Ja')
, (6, 'expired', 'Privatperson', 'Nein')
, (7, 'expired', 'Firma', 'Ja')
, (8, 'expired', 'Firma', 'Nein')
;

SELECT
COUNT( DISTINCT "KUNDEN"."KNR")FROM
"KUNDEN"WHERE
(
"KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' ) AND
"KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' )
)
OR
(
"KUNDEN"."KNR" NOT IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."KUNDENSTATUS" = 'aktiv' AND "KUNDEN"."PERSONENART" = 'Privatperson' ) AND
"KUNDEN"."KNR" IN ( SELECT "KUNDEN"."KNR" FROM "KUNDEN" WHERE "KUNDEN"."HAUSHALTSFUEHRER" = 'Ja' )
);

 

yields:

 

mwellbro_0-1617202887801.png

 

Probably something to do with how at this low data volume the processing is structured ? Or did I deviate from the original statement ?

 

exa-Aleksandr
Team Exasol
Team Exasol

Good catch, @mwellbro !

I didn't check the initial query on my toy setup.

Yes, IN -> EXISTS transformation applies starting from a particular subquery size.

There is even an Idea for this topic: https://www.exasol.com/support/browse/IDEA-190 that would hopefully move to the new Ideation portal here in community.exasol.com.

exa-Aleksandr
Team Exasol
Team Exasol

Hi @profect_bso ,

The solution apparently heavily depends on data model and detailed requirements.

But couldn't you use just

create schema if not exists EXAMPLE;

create table EXAMPLE.KUNDEN(
KNR int
, KUNDENSTATUS varchar(100)
, PERSONENART varchar(100)
, HAUSHALTSFUEHRER varchar(100)
);

insert into EXAMPLE.KUNDEN values
(1, 'aktiv', 'Privatperson', 'Ja')
, (2, 'aktiv', 'Privatperson', 'Nein')
, (3, 'aktiv', 'Firma', 'Ja')
, (4, 'aktiv', 'Firma', 'Nein')
, (5, 'expired', 'Privatperson', 'Ja')
, (6, 'expired', 'Privatperson', 'Nein')
, (7, 'expired', 'Firma', 'Ja')
, (8, 'expired', 'Firma', 'Nein')
;

SELECT
	*
FROM
	EXAMPLE.KUNDEN k
where
	1=1
	and (
		(("KUNDENSTATUS" = 'aktiv' AND "PERSONENART" = 'Privatperson') and not ("HAUSHALTSFUEHRER" = 'Ja'))
		or
		(not ("KUNDENSTATUS" = 'aktiv' AND "PERSONENART" = 'Privatperson') and ("HAUSHALTSFUEHRER" = 'Ja'))
	)
;
KNR|KUNDENSTATUS|PERSONENART |HAUSHALTSFUEHRER|
---|------------|------------|----------------|
  2|aktiv       |Privatperson|Nein            |
  3|aktiv       |Firma       |Ja              |
  5|expired     |Privatperson|Ja              |
  7|expired     |Firma       |Ja              |

profect_bso
Padawan

That works perfectly in this special case. Problem is, that we need to derive subsets to represent those subsets of data especially when it comes to usage of a lot of satellite tables and satellite tables of the satellite table. Just think of shoveling immutable data into the outer query. Kinda ELT paradigm. Of course that could be solved in our application by join those subsets in the outer from clause and compare without in criteria - but extremely poor performance cause lot of IO necessary?! For me this is a bug/missing feature in the outer or criteria not supporting this valid sql due to following query works:

SELECT
    DISTINCT "EXAMPLE"."KNR"
FROM
    "EXAMPLE"."KUNDEN"
WHERE
	(
    	"KUNDEN"."KNR" IN ( 100000, 104465 )
    	AND
		"KUNDEN"."KNR" NOT IN ( 1000007, 1044748 )
    ) 
    OR 
    (  
    	"KUNDEN"."KNR" NOT IN ( 100000, 104465 )
    	AND
    	"KUNDEN"."KNR" IN ( 1000007, 1044748 ) 
    );

If there are not plans to support this - not a problem we just disable XOR for users in this module if Exasol comes to use. Just wanted to make sure you know this behaviour.

 

exa-Aleksandr
Team Exasol
Team Exasol

So you think that

with
query1 as(
	sub select with filter #1
)
, query2 as(
	sub select with filter #1
)
SELECT
	count(*)
FROM
	query1 q1
		full join query2 q2
		on q1.knr = q2.knr
WHERE
	nvl2(q1.knr, 1, 0) + nvl2(q2.knr, 1, 0) = 1
;

will have a bad performance and look for the following kind of syntax:

with
query1 as(
	sub select with filter #1
)
, query2 as(
	sub select with filter #1
)
SELECT
	count(*)
FROM
(
	SELECT
		*
	FROM
		query1
	
	xor
	
	SELECT
		*
	FROM
		query2
)
;

?

How do you solve such problem for other databases?