iproc() in WHERE clause

mwellbro
Xpert

Hi all,

just trying to wrap my head around something: if I were to filter on iproc() at the end of a complex statement, how exactly is this applied to the underlying table ?

It does not seem to say "give me all rows from each table from node = iproc() " because in that case some rows from various joins would end up missing.

However the profile seems to indicate that the filtering is indeed done early on so also does not seem to be the case that all data is processed and only the result of "iproc()=x" is 
handed to the application.

Can someone shed some light on how an "WHERE iproc()=x" is applied ?

Cheers,
Malte

2 REPLIES 2

exa-Uwe
Moderator
Moderator

Hi Malte,

what is it that you want to achieve with "WHERE iproc()=x"? I can hardly think of a sensible use case.

Say you have a 2 node cluster. Then WHERE iproc()=0 filters on rows residing on the first node.

The only use case for using iproc (except checking for row distribution) I have seen so far is a GROUP BY iproc() like in this clip from our Advanced Analytics course:

https://www.youtube.com/watch?v=8m33kwokXaU&t=2s

Best regards

Uwe

mwellbro
Xpert

Hi Uwe,

the idea here is to force a certain ETL tool to create multiple TCP connections to Exasol ( ideally 1 per Node ) and achieve more parallelism
in the non-Exasol part of the processing.
This also allows for higher network throughput and since in this particular case I´m merely interested in getting all rows from the respective SQL to
the application and I don´t need to worry about which result row is provided by which database/TCP-session.
I´m just not entirely sure that I´m not overlooking some processing behavior that would damage the result set but so far all my tests seem ok - hence the question.

Cheers,
Malte