Python Exasol virtual schema now available on github

PeterK
Xpert

Hi @exa-Tobias & @drumcircle ,

I finally got around to open sourcing our python port of the Exasol Java Virtual Schema Adapter 

It's available at https://github.com/GrantStreetGroup/exasol-multi-tenant-virtual-schema-python

It supports most of the capabilities of the Java one including remote and local Exasol schemas.

Our main goal in porting it to python was to speed it up (it's ~ 1s faster per query).

We also added support for multi-schema multi-tenancy.  The idea is that if you have multiple identical schemas in Exasol (one per tenant) you can create a virtual schema on top of them using this adapter and they will appear as a single schema with all the data union-ed together (with an extra virtual column to identify the tenant). This is handy for queries that need to span multiple identical schemas.

Multi-tenancy Example:

Suppose you have 3 identical tenant schemas TNT_ONETNT_TWO, and TNT_THREE, each containing two tables TBL1 and TBL2. Now suppose you want to run queries that UNIONs data from TNT_ONE and TNT_THREE schemas. Usually you would write SQL like:

SELECT 'ONE' as client, col1, col2
FROM tnt_one.tbl1
JOIN tnt_one.tbl2
  ON tbl1.id = tbl2.id
UNION ALL
SELECT 'THREE', col1, col2
FROM tnt_three.tbl1
JOIN tnt_three.tbl2
ON tbl1.id = tbl2.id

This is cumbersome especially when the number of tenants is large.

Using a virtual schema you can automatically generate such queries by exposing all three schemas as though they were a single schema containing all the data and with each table containing an additional virtual column indicating which tenant the particular table row belongs to.

  1. Create a connection to the local Exasol instance where your schemas reside:

    CREATE CONNECTION LOCAL_EXASOL TO '...:8563' USER '...' IDENTIFIED BY ...;

     

  2. Create an adapter script containing the contents of adapter.py :

    --/
    CREATE OR REPLACE PYTHON ADAPTER SCRIPT py_vs_adapter AS
        %paste the contents of adapter.py here%
    /;

     

  3. Finally create your virtual schema:

    CREATE VIRTUAL SCHEMA tnt_all
        USING py_vs_adapter
        WITH CONNECTION_NAME = 'LOCAL_EXASOL'
             SCHEMAS = 'TNT_ONE,TNT_TWO,TNT_THREE'
             IS_LOCAL = 'TRUE'
             TENANT_PATTERN = 'TNT_(.+)' -- causes CLIENT column to contain ONE, TWO, THREE
             TENANT_COLUMN  = 'CLIENT'
    ;

     

The virtual schema TNT_ALL will now appear to have the same schema as the underying TNT_* schemas and if you query it you'll get a result set that is the UNION ALL of all three schemas.

Each table in the virtual schema will have an additional virtual column called CLIENT (or whatever you specify in the TENANT_COLUMN property.) You can specify criteria against any one of these virtual CLIENT columns in order to filter which clients you wish to query.

Your original query can now be written simply as:

SELECT tbl1.client, col1, col2
FROM tnt_all.tbl1
JOIN tnt_all.tbl2
  ON tbl1.id = tbl2.id
WHERE tbl1.client IN ('ONE','THREE')

 

For more details see the documentation in adapter.py

Enjoy,

Peter

 

1 REPLY 1

exa-Tobias
Team Exasol
Team Exasol

Hi @PeterK,

thank you for open sourcing your very cool implementation of a virtual schema!

This showcases how a multi-tenant implementation using a schema per tenant can be easily accessed via a virtual schema.