Generally speaking, NULL is not a special value; rather, it represents an undefined value. This article describes how to work with NULL values and which comparisons against them are valid.
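A quick illustration of why this matters (standard Exasol behaviour; any comparison with NULL yields no truth value, so predicates must use IS NULL):

    -- NULL never compares equal to anything, including NULL itself:
    SELECT CASE WHEN NULL = NULL THEN 'equal' ELSE 'undefined' END;  -- returns 'undefined'
    -- IS NULL is the correct test:
    SELECT CASE WHEN NULL IS NULL THEN 'is null' END;                -- returns 'is null'
    -- Exasol-specific: an empty string is stored as NULL:
    SELECT CASE WHEN '' IS NULL THEN 'empty string is NULL' END;     -- returns 'empty string is NULL'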
MERGE is designed to use a small UPDATE table to modify a larger FACT table. This article explains how it works.
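For reference, the basic shape of such a statement (the table and column names here are illustrative, not from the article):

    MERGE INTO fact f
    USING delta d
       ON f.id = d.id
    WHEN MATCHED THEN UPDATE SET f.amount = d.amount
    WHEN NOT MATCHED THEN INSERT VALUES (d.id, d.amount);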
Question
Is the ROWID always assigned in ascending order? Is it possible to determine the sequence of inserted records with the ROWID?
Answer
The ROWIDs of a table are managed by the DBMS. They are assigned in ascending order per node and stay distinct within a table; for different tables, they can be the same. DML statements such as UPDATE, DELETE, TRUNCATE or MERGE might internally reorder data storage, invalidating and reassigning all ROWIDs. In contrast, structural table changes such as adding a column leave the ROWIDs unchanged. Altering the distribution keys of a table or reorganizing it will certainly reassign ROWID values. Therefore the ROWID cannot be used to determine the exact sequence or age of a record; it is designed as a short-term identifier for rows, to be used for duplicate elimination. The ROWID pseudo column can only be used on table objects, not on views or subselects. If you try it on a view, you will receive an appropriate error message.
Additional References
ROWID Syntax
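To illustrate the duplicate-elimination use case mentioned above, a minimal sketch (hypothetical table t with key columns a and b):

    -- keep one row per (a, b) combination, delete the other duplicates:
    DELETE FROM t
    WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM t GROUP BY a, b);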
This article explains the difference between temporary tables, views, and CTEs.
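As a quick orientation, the two non-temporary constructs side by side (generic SQL, illustrative names):

    -- a CTE exists only for the duration of a single statement:
    WITH recent_orders AS (SELECT * FROM orders WHERE order_date > DATE '2020-01-01')
    SELECT COUNT(*) FROM recent_orders;
    -- a view is a persistent catalog object that later statements can reuse:
    CREATE VIEW recent_orders_v AS
    SELECT * FROM orders WHERE order_date > DATE '2020-01-01';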
As Exasol only supports the transaction isolation level "SERIALIZABLE", this article looks at transactions and potential transaction conflicts.
Background
The Exasol R SDK offers functionality to interact with the Exasol database from R programs. It extends RODBC, offers fast data transfer between Exasol and R (exa.readData() and exa.writeData()), and makes it convenient to run parts of your R code in parallel in Exasol using R UDF functions behind the scenes (exa.createScript()).
Example (digit recognition)
We will use a well-known machine learning algorithm for classifying handwritten digits. It uses the randomForest library, which builds decision trees for any kind of image. For a given training set and the actual result for each element, the library returns a decision tree that can be applied to other, similar images: the test data. The data is stored pixel-wise (one value for each pixel). This how-to focuses on how to use the Exasol R SDK, not on details of the algorithm or R. Further information on the algorithm and the data can be found here. The example was chosen to show how easy it is to read or write data from or to Exasol, and how easy it is to deploy a given algorithm to an Exasol UDF.
Prerequisites
The Exasol R SDK and randomForest need to be installed on the system used. The attached files can be used to set up your Exasol database for the example:
Run 01_DDL.sql (creates a schema, a table for the train data set and a table for the test data set).
Place the csv files (train.csv and test.csv) at an appropriate place and change the path to those files in 02_IMPORT.sql.
Run 02_IMPORT.sql (imports the train and test data sets).
How to use the Exasol R SDK
R implementation (reading the data sets from the database): The attached file 03_RandomForest.r shows how to use exa.readData() and exa.writeData() to interact with the database in R. As of Exasol 6.2.x, one line in 03_RandomForest.r needs to be modified to account for R version differences: the parameter version = 2 is added to the save command. You are welcome to skip this modification and come back to it if you run into issues.
Before: save(rf, file = "< path to rf.dat>")
After: save(rf, version = 2, file = "< path to rf.dat>")
Additional installation notes for 03_RandomForest.r cover installing the needed packages. Continuing on, the general procedure is:
Step 1. Connect to Exasol and read the train and test data. This requires an ODBC DSN (named "solo" in this example).
Step 2. Read train and test data from the database and prepare them for further processing. The test data is enriched by an image id; the train data is enriched by a label showing the actual handwritten digit.
Step 3. Use randomForest() to create a decision tree and print the prediction. In this example, the decision forest is kept and tested on the test data; the predictions for the test data are extracted and printed.
Step 4. Write the decision tree to Exasol. We will use a bucket called "binary" to store the decision forest. Detailed information on EXABucketFS and buckets can be found in https://community.exasol.com/t5/environment-management/how-to-create-an-exabucketfs-service-and-bucket/ta-p/1515
Important deprecation notice (this does not apply to Exasol 6+ and is only left in for archival purposes): in version 5 it is not possible to store binary data in the EXASOL database. One possibility is to serialize the decision forest and store it in a table. Since the VARCHAR type in EXASOL is limited to 2 million characters, the resulting string must not exceed 2 million characters.
For Exasol 6+, the forest file rf.dat is uploaded into the bucket with curl:
curl -v -X PUT -T rf.dat http: //w:<write_pw>@<ip of one cluster node>:<port>/binary/rf.dat Step 5. Deploying a UDF The other file ( 05_PREDICTINDB.sql ) deploys the functionality to a UDF in EXASOL. We create a variadic UDF receiving an arbitrary amount of input parameters (all pixels) The output is the predicted digit. Load forest from the bucket Create a dataframe for all input columns (pixels) Use   predict()   and emit the predictions Step 6. Using UDF within EXASolution Now it is possible to use the deployed function within EXASolution like any other built-in function: select rf.predict(Model_str, PIXEL0,PIXEL1,PIXEL2,PIXEL3,PIXEL4,PIXEL5,PIXEL6,PIXEL7,PIXEL8,PIXEL9,PIXEL10,PIXEL11,PIXEL12,PIXEL13,PIXEL14,PIXEL15,PIXEL16,PIXEL17,PIXEL18,PIXEL19,PIXEL20,PIXEL21,PIXEL22,PIXEL23,PIXEL24,PIXEL25,PIXEL26,PIXEL27,PIXEL28,PIXEL29,PIXEL30,PIXEL31,PIXEL32,PIXEL33,PIXEL34,PIXEL35,PIXEL36,PIXEL37,PIXEL38,PIXEL39,PIXEL40,PIXEL41,PIXEL42,PIXEL43,PIXEL44,PIXEL45,PIXEL46,PIXEL47,PIXEL48,PIXEL49,PIXEL50,PIXEL51,PIXEL52,PIXEL53,PIXEL54,PIXEL55,PIXEL56,PIXEL57,PIXEL58,PIXEL59,PIXEL60,PIXEL61,PIXEL62,PIXEL63,PIXEL64,PIXEL65,PIXEL66,PIXEL67,PIXEL68,PIXEL69,PIXEL70,PIXEL71,PIXEL72,PIXEL73,PIXEL74,PIXEL75,PIXEL76,PIXEL77,PIXEL78,PIXEL79,PIXEL80,PIXEL81,PIXEL82,PIXEL83,PIXEL84,PIXEL85,PIXEL86,PIXEL87,PIXEL88,PIXEL89,PIXEL90,PIXEL91,PIXEL92,PIXEL93,PIXEL94,PIXEL95,PIXEL96,PIXEL97,PIXEL98,PIXEL99,PIXEL100,PIXEL101,PIXEL102,PIXEL103,PIXEL104,PIXEL105,PIXEL106,PIXEL107,PIXEL108,PIXEL109,PIXEL110,PIXEL111,PIXEL112,PIXEL113,PIXEL114,PIXEL115,PIXEL116,PIXEL117,PIXEL118,PIXEL119,PIXEL120,PIXEL121,PIXEL122,PIXEL123,PIXEL124,PIXEL125,PIXEL126,PIXEL127,PIXEL128,PIXEL129,PIXEL130,PIXEL131,PIXEL132,PIXEL133,PIXEL134,PIXEL135,PIXEL136,PIXEL137,PIXEL138,PIXEL139,PIXEL140,PIXEL141,PIXEL142,PIXEL143,PIXEL144,PIXEL145,PIXEL146,PIXEL147,PIXEL148,PIXEL149,PIXEL150,PIXEL151,PIXEL152,PIXEL153,PIXEL154,PIXEL155,PIXEL156,PIXEL157,PIXEL158,PIXEL159,PIXEL160,PIXEL161,PIXEL162,PIXEL163,PIXEL164,PIXEL165,PIXEL166,PIXEL167,PIXEL168,PIXEL169,PIXEL170,PIXEL171,PIXEL172,PIXEL173,PIXEL174,PIXEL175,PIXEL176,PIXEL177,PIXEL178,PIXEL179,PIXEL180,PIXEL181,PIXEL182,PIXEL183,PIXEL184,PIXEL185,PIXEL186,PIXEL187,PIXEL188,PIXEL189,PIXEL190,PIXEL191,PIXEL192,PIXEL193,PIXEL194,PIXEL195,PIXEL196,PIXEL197,PIXEL198,PIXEL199,PIXEL200,PIXEL201,PIXEL202,PIXEL203,PIXEL204,PIXEL205,PIXEL206,PIXEL207,PIXEL208,PIXEL209,PIXEL210,PIXEL211,PIXEL212,PIXEL213,PIXEL214,PIXEL215,PIXEL216,PIXEL217,PIXEL218,PIXEL219,PIXEL220,PIXEL221,PIXEL222,PIXEL223,PIXEL224,PIXEL225,PIXEL226,PIXEL227,PIXEL228,PIXEL229,PIXEL230,PIXEL231,PIXEL232,PIXEL233,PIXEL234,PIXEL235,PIXEL236,PIXEL237,PIXEL238,PIXEL239,PIXEL240,PIXEL241,PIXEL242,PIXEL243,PIXEL244,PIXEL245,PIXEL246,PIXEL247,PIXEL248,PIXEL249,PIXEL250,PIXEL251,PIXEL252,PIXEL253,PIXEL254,PIXEL255,PIXEL256,PIXEL257,PIXEL258,PIXEL259,PIXEL260,PIXEL261,PIXEL262,PIXEL263,PIXEL264,PIXEL265,PIXEL266,PIXEL267,PIXEL268,PIXEL269,PIXEL270,PIXEL271,PIXEL272,PIXEL273,PIXEL274,PIXEL275,PIXEL276,PIXEL277,PIXEL278,PIXEL279,PIXEL280,PIXEL281,PIXEL282,PIXEL283,PIXEL284,PIXEL285,PIXEL286,PIXEL287,PIXEL288,PIXEL289,PIXEL290,PIXEL291,PIXEL292,PIXEL293,PIXEL294,PIXEL295,PIXEL296,PIXEL297,PIXEL298,PIXEL299,PIXEL300,PIXEL301,PIXEL302,PIXEL303,PIXEL304,PIXEL305,PIXEL306,PIXEL307,PIXEL308,PIXEL309,PIXEL310,PIXEL311,PIXEL312,PIXEL313,PIXEL314,PIXEL315,PIXEL316,PIXEL317,PIXEL318,PIXEL319,PIXEL320,PIXEL321,PIXEL322,PIXEL323,PIXEL324,PIXEL325,PIXEL326,PIXEL327,PIXEL328,PIXEL329,PIXEL330,PIXEL331,PIXEL332,PIXEL333,PIXEL334,PIXEL335,PIXEL336,PIXEL337,PIXEL338,PIX
EL339,PIXEL340,PIXEL341,PIXEL342,PIXEL343,PIXEL344,PIXEL345,PIXEL346,PIXEL347,PIXEL348,PIXEL349,PIXEL350,PIXEL351,PIXEL352,PIXEL353,PIXEL354,PIXEL355,PIXEL356,PIXEL357,PIXEL358,PIXEL359,PIXEL360,PIXEL361,PIXEL362,PIXEL363,PIXEL364,PIXEL365,PIXEL366,PIXEL367,PIXEL368,PIXEL369,PIXEL370,PIXEL371,PIXEL372,PIXEL373,PIXEL374,PIXEL375,PIXEL376,PIXEL377,PIXEL378,PIXEL379,PIXEL380,PIXEL381,PIXEL382,PIXEL383,PIXEL384,PIXEL385,PIXEL386,PIXEL387,PIXEL388,PIXEL389,PIXEL390,PIXEL391,PIXEL392,PIXEL393,PIXEL394,PIXEL395,PIXEL396,PIXEL397,PIXEL398,PIXEL399,PIXEL400,PIXEL401,PIXEL402,PIXEL403,PIXEL404,PIXEL405,PIXEL406,PIXEL407,PIXEL408,PIXEL409,PIXEL410,PIXEL411,PIXEL412,PIXEL413,PIXEL414,PIXEL415,PIXEL416,PIXEL417,PIXEL418,PIXEL419,PIXEL420,PIXEL421,PIXEL422,PIXEL423,PIXEL424,PIXEL425,PIXEL426,PIXEL427,PIXEL428,PIXEL429,PIXEL430,PIXEL431,PIXEL432,PIXEL433,PIXEL434,PIXEL435,PIXEL436,PIXEL437,PIXEL438,PIXEL439,PIXEL440,PIXEL441,PIXEL442,PIXEL443,PIXEL444,PIXEL445,PIXEL446,PIXEL447,PIXEL448,PIXEL449,PIXEL450,PIXEL451,PIXEL452,PIXEL453,PIXEL454,PIXEL455,PIXEL456,PIXEL457,PIXEL458,PIXEL459,PIXEL460,PIXEL461,PIXEL462,PIXEL463,PIXEL464,PIXEL465,PIXEL466,PIXEL467,PIXEL468,PIXEL469,PIXEL470,PIXEL471,PIXEL472,PIXEL473,PIXEL474,PIXEL475,PIXEL476,PIXEL477,PIXEL478,PIXEL479,PIXEL480,PIXEL481,PIXEL482,PIXEL483,PIXEL484,PIXEL485,PIXEL486,PIXEL487,PIXEL488,PIXEL489,PIXEL490,PIXEL491,PIXEL492,PIXEL493,PIXEL494,PIXEL495,PIXEL496,PIXEL497,PIXEL498,PIXEL499,PIXEL500,PIXEL501,PIXEL502,PIXEL503,PIXEL504,PIXEL505,PIXEL506,PIXEL507,PIXEL508,PIXEL509,PIXEL510,PIXEL511,PIXEL512,PIXEL513,PIXEL514,PIXEL515,PIXEL516,PIXEL517,PIXEL518,PIXEL519,PIXEL520,PIXEL521,PIXEL522,PIXEL523,PIXEL524,PIXEL525,PIXEL526,PIXEL527,PIXEL528,PIXEL529,PIXEL530,PIXEL531,PIXEL532,PIXEL533,PIXEL534,PIXEL535,PIXEL536,PIXEL537,PIXEL538,PIXEL539,PIXEL540,PIXEL541,PIXEL542,PIXEL543,PIXEL544,PIXEL545,PIXEL546,PIXEL547,PIXEL548,PIXEL549,PIXEL550,PIXEL551,PIXEL552,PIXEL553,PIXEL554,PIXEL555,PIXEL556,PIXEL557,PIXEL558,PIXEL559,PIXEL560,PIXEL561,PIXEL562,PIXEL563,PIXEL564,PIXEL565,PIXEL566,PIXEL567,PIXEL568,PIXEL569,PIXEL570,PIXEL571,PIXEL572,PIXEL573,PIXEL574,PIXEL575,PIXEL576,PIXEL577,PIXEL578,PIXEL579,PIXEL580,PIXEL581,PIXEL582,PIXEL583,PIXEL584,PIXEL585,PIXEL586,PIXEL587,PIXEL588,PIXEL589,PIXEL590,PIXEL591,PIXEL592,PIXEL593,PIXEL594,PIXEL595,PIXEL596,PIXEL597,PIXEL598,PIXEL599,PIXEL600,PIXEL601,PIXEL602,PIXEL603,PIXEL604,PIXEL605,PIXEL606,PIXEL607,PIXEL608,PIXEL609,PIXEL610,PIXEL611,PIXEL612,PIXEL613,PIXEL614,PIXEL615,PIXEL616,PIXEL617,PIXEL618,PIXEL619,PIXEL620,PIXEL621,PIXEL622,PIXEL623,PIXEL624,PIXEL625,PIXEL626,PIXEL627,PIXEL628,PIXEL629,PIXEL630,PIXEL631,PIXEL632,PIXEL633,PIXEL634,PIXEL635,PIXEL636,PIXEL637,PIXEL638,PIXEL639,PIXEL640,PIXEL641,PIXEL642,PIXEL643,PIXEL644,PIXEL645,PIXEL646,PIXEL647,PIXEL648,PIXEL649,PIXEL650,PIXEL651,PIXEL652,PIXEL653,PIXEL654,PIXEL655,PIXEL656,PIXEL657,PIXEL658,PIXEL659,PIXEL660,PIXEL661,PIXEL662,PIXEL663,PIXEL664,PIXEL665,PIXEL666,PIXEL667,PIXEL668,PIXEL669,PIXEL670,PIXEL671,PIXEL672,PIXEL673,PIXEL674,PIXEL675,PIXEL676,PIXEL677,PIXEL678,PIXEL679,PIXEL680,PIXEL681,PIXEL682,PIXEL683,PIXEL684,PIXEL685,PIXEL686,PIXEL687,PIXEL688,PIXEL689,PIXEL690,PIXEL691,PIXEL692,PIXEL693,PIXEL694,PIXEL695,PIXEL696,PIXEL697,PIXEL698,PIXEL699,PIXEL700,PIXEL701,PIXEL702,PIXEL703,PIXEL704,PIXEL705,PIXEL706,PIXEL707,PIXEL708,PIXEL709,PIXEL710,PIXEL711,PIXEL712,PIXEL713,PIXEL714,PIXEL715,PIXEL716,PIXEL717,PIXEL718,PIXEL719,PIXEL720,PIXEL721,PIXEL722,PIXEL723,PIXEL724,PIXEL725,PIXEL726,PIXEL727,PIXEL728,PIXEL729,PIXEL730,PIXEL731,PIXEL732,PIXEL733,PIX
EL734,PIXEL735,PIXEL736,PIXEL737,PIXEL738,PIXEL739,PIXEL740,PIXEL741,PIXEL742,PIXEL743,PIXEL744,PIXEL745,PIXEL746,PIXEL747,PIXEL748,PIXEL749,PIXEL750,PIXEL751,PIXEL752,PIXEL753,PIXEL754,PIXEL755,PIXEL756,PIXEL757,PIXEL758,PIXEL759,PIXEL760,PIXEL761,PIXEL762,PIXEL763,PIXEL764,PIXEL765,PIXEL766,PIXEL767,PIXEL768,PIXEL769,PIXEL770,PIXEL771,PIXEL772,PIXEL773,PIXEL774,PIXEL775,PIXEL776,PIXEL777,PIXEL778,PIXEL779,PIXEL780,PIXEL781,PIXEL782,PIXEL783) from rf.test cross join rfmodels.models; PRED 0 0 3 0 3 3 0 0 1 7
Problem
To reproduce certain problems, Exasol support may ask you to send DDL statements for the required tables and views. At this point you have two choices:
Start drilling around to find each and every required view / table etc. This may be a lot of work and may end up in communication round trips in case you overlooked a dependency or two.
Skip all that and just send us the full DDL for all schemas in your database instance, letting us sort out what is needed and what is not.
Neither option is optimal.
Solution
The attachment of this article contains a procedure script (Lua) that can create DDL statements for the recursive dependencies of a view. The DDL statements are presented as a single-column result set and are ready for copy/paste into a text editor (or EXAplus) for saving.
Example call
execute script meta.create_DDL('DUT', 'TRUNK');
Example output
--DDL created by user SYS at 2017-11-14 09:44:59.554000
--========================================--
-- table dependencies --
--========================================--
CREATE SCHEMA "DUT";
CREATE TABLE "DUT"."TAB1"(
    "I" DECIMAL(18,0) IDENTITY NOT NULL,
    "J" DECIMAL(3,0)
);
-- SYSTEM TABLE: SYS.EXA_METADATA
--========================================--
-- function dependencies --
--========================================--
function func( param decimal(3) ) returns decimal(3) as
begin
    return sqrt(param) * (select max(i) from dut.tab1);
end
/
--========================================--
-- script dependencies --
--========================================--
CREATE LUA SCALAR SCRIPT "LUA_SCALAR" () RETURNS DECIMAL(18,0) AS
function run() return decimal(10,18,0) end
/
--========================================--
-- view dependencies --
--========================================--
--> level 2
-- SYSTEM VIEW: SYS.CAT
--> level 1
CREATE VIEW "DUT"."BRANCH" as ( select * from exa_metadata, cat );
-- final query/view:
CREATE VIEW "DUT"."TRUNK" as ( select * from dut.tab1, dut.branch where func(j) > lua_scalar() );
Caution
This script is work in progress and has only seen minimal testing so far. Things known not to work:
Virtual schemas – it is unlikely we would be able to clone your remote system anyway.
Connections and IMPORT inside views – like virtual schemas, it probably does not make much sense.
Dependencies inside scripts – this is branded 'dynamic SQL' and our engine cannot determine those dependencies without actually executing the script with specific input.
SELECT dependencies inside functions – just don't do that. Like with scripts, these dependencies do not show up in the system.
If your model contains any of the above and it turns out to be relevant for the reproduction of a problem, you might have to revert to "Skip all that" above. The "Copy Database" script in create-ddl-for-the-entire-database may be of use then.
Additional References
https://www.exasol.com/support/browse/IDEA-359
https://community.exasol.com/t5/database-features/create-ddl-for-the-entire-database/ta-p/1417
This article talks about the use of UNION ALL and its benefits.
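The core difference in two lines (UNION deduplicates and therefore pays for a distinct step; UNION ALL just concatenates):

    SELECT 1 AS x UNION     SELECT 1;  -- one row: duplicates are eliminated
    SELECT 1 AS x UNION ALL SELECT 1;  -- two rows: no duplicate elimination, hence cheaper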
Problem
JSON data that is stored in EXASOL tables can be accessed through UDFs. JSON documents can be accessed through path expressions. To understand what path expressions are, let us have a look at the following JSON document:
{"name":"Bob", "age":37, "address":{"street":"Example Street 5", "city":"Berlin"}, "phone":[{"type":"home", "number":"030555555"},{"type":"mobile", "number":"017777777"}], "email":["bob@example.com", "bobberlin@example.com"]}
The JSON document contains five fields: "name" (a string), "age" (an integer), "address" (a document), "phone" (an array containing documents) and "email" (an array containing strings).
Path expressions start with a dollar sign ($) representing the root document. The dot operator (.) is used to select a field of a document; the star in box brackets ([*]) selects and unnests all elements of an array. The following path expression finds all phone numbers:
$.phone[*].number
This expression is evaluated step by step as follows:
$ → {"name":"Bob", "age":37, "address":{"street":"Example Street 5", "city":"Berlin"}, "phone":[{"type":"home", "number":"030555555"},{"type":"mobile", "number":"017777777"}], "email":["bob@example.com", "bobberlin@example.com"]}
$.phone → [{"type":"home", "number":"030555555"},{"type":"mobile", "number":"017777777"}]
$.phone[*] → {"type":"home", "number":"030555555"} and {"type":"mobile", "number":"017777777"}
$.phone[*].number → "030555555" and "017777777"
Solution
This solution presents a generic Python UDF json_table to access field values in JSON documents through path expressions. The json_table function has the following form:
select json_table(
    <json string or column>,
    <path expression>,
    <path expression>,
    ...
) emits (<column_name> <data_type>, <column_name> <data_type>, ...)
The JSON_TABLE UDF attached to this solution takes a VARCHAR containing JSON data as a first parameter and one or more path expressions:
create or replace python scalar script json_table(...) emits(...) as
The function can be called in a SELECT query. The EMITS clause has to be used to define the output column names and their data types:
SELECT json_table('{"name":"Bob", "age":37, "address":{"street":"Example Street 5", "city":"Berlin"}, "phone":[{"type":"home", "number":"030555555"},{"type":"mobile", "number":"017777777"}], "email":["bob@example.com", "bobberlin@example.com"]}', '$.phone[*].number') EMITS (phone VARCHAR(50));
When the JSON data is stored in a table, the first parameter of JSON_TABLE contains the column name:
CREATE TABLE example_table (column_a INT, json_data VARCHAR(2000000));
-- INSERT INTO example_table VALUES (1, '{"name": "Bob",…'); (as above)
SELECT json_table(json_data, '$.phone[*].number') EMITS (phone VARCHAR(50)) FROM example_table;
It is possible to use both the json_table UDF and normal columns of the table within the SELECT clause:
SELECT column_a, json_table(json_data, '$.phone[*].number') EMITS (phone VARCHAR(50)) FROM example_table;
When a row in the input table contains n phone numbers within the JSON column, there will be n output rows for that tuple.
The value of column_a is constant for all those rows:
COLUMN_A  PHONE
1         030555555
1         017777777
The following table shows some more valid path expressions:
$.name → "Bob"
$.address → {"street":"Example Street 5", "city":"Berlin"}
$.address.city → "Berlin"
$.email → ["bob@example.com", "bobberlin@example.com"]
$.email[*] → "bob@example.com" and "bobberlin@example.com"
This query converts the JSON data into column values:
SELECT json_table(json_data, '$.name', '$.age', '$.address.city') EMITS (name VARCHAR(500), age INT, city VARCHAR(500)) FROM example_table;
NAME  AGE  CITY
Bob   37   Berlin
When unnesting an array, the values from different levels stay the same for every array element:
SELECT json_table(json_data, '$.name', '$.age', '$.address.city', '$.email[*]') EMITS (name VARCHAR(500), age INT, city VARCHAR(500), email VARCHAR(500)) FROM example_table;
NAME  AGE  CITY    EMAIL
Bob   37   Berlin  bob@example.com
Bob   37   Berlin  bobberlin@example.com
The result of unnesting more than one array is the cross product of those arrays:
SELECT json_table(json_data, '$.name', '$.age', '$.address.city', '$.email[*]', '$.phone[*].type', '$.phone[*].number') EMITS (name VARCHAR(500), age INT, city VARCHAR(500), email VARCHAR(500), phone_type VARCHAR(50), phone_number VARCHAR(50)) FROM example_table;
NAME  AGE  CITY    EMAIL                  PHONE_TYPE  PHONE_NUMBER
Bob   37   Berlin  bob@example.com        home        030555555
Bob   37   Berlin  bob@example.com        mobile      017777777
Bob   37   Berlin  bobberlin@example.com  home        030555555
Bob   37   Berlin  bobberlin@example.com  mobile      017777777
Details and limitations:
When the JSON input is not a single document but an array, the elements of the array can be accessed via $[*].
It is recommended to use the correct data types in the EMITS clause. Otherwise, casting is done which can lead to type-conversion errors.
When accessing non-atomic values, i.e. arrays without unnesting them, or documents, they are returned as a VARCHAR containing the result in JSON format.
Accessing multi-dimensional arrays is not supported. This means at most one [*] can be used in a path expression.
Additional References
https://docs.exasol.com/advanced_analytics/accessing_json_data_udfs.htm
https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/json_value.htm
https://community.exasol.com/t5/data-science/parsing-json-data-with-python/ta-p/1247
https://community.exasol.com/t5/discussion-forum/sneakpreview-on-version-7-json-functions/m-p/1710
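As the version 7 references above suggest, a single scalar value can also be extracted with the built-in JSON functions, without any UDF; a minimal sketch against the same example_table:

    SELECT JSON_VALUE(json_data, '$.name') AS name FROM example_table;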
Background
Cause and effect: since the EXASolution transaction isolation level is SERIALIZABLE and newly created transactions are automatically scheduled after finished transactions, it is possible that WAIT FOR COMMITs occur even for pure read transactions (consisting of SELECT statements only).
Explanation
How to reproduce: three different connections (having AUTOCOMMIT off) are needed to reproduce this situation.
Example 1: If a long-running transaction (Tr1) reads object A and writes object B (e.g. long-running IMPORT statements) and a second transaction (Tr2) writes object A and commits in parallel, Tr2 is scheduled after Tr1. After Tr2 is committed, all new transactions are scheduled after it. If such a transaction wants to read object B, it has to wait for the commit of Tr1.
Transaction 1: select * from tab1;
Transaction 1: insert into tab2 values 1; – transaction remains opened
Transaction 2: insert into WFC.tab1 values 1; – now Transaction 1 < Transaction 2
Transaction 2: commit;
Transaction 3: commit; – starts a new transaction (Transaction 2 < Transaction 3)
Transaction 3: select * from tab2; – this statement ends up in WAIT FOR COMMIT, waiting for Transaction 1
Example 2: The same situation may occur if you query system tables while the SqlLogServer is performing one of its tasks (e.g. the "DB size task" determining the database size). The following example describes this situation:
Transaction 1: select * from EXA_DB_SIZE_LAST_DAY;
Transaction 1: insert into tab1 values 1; – transaction remains opened
LogServer: DB size task (writes EXA_DB_SIZE_LAST_DAY) – now Transaction 1 < LogServer transaction; the task is executed every 30 minutes (0:00, 0:30, 1:00, 1:30, ...)
Transaction 3: commit; – starts a new transaction (LogServer transaction < Transaction 3)
Transaction 3: select * from EXA_DB_SIZE_LAST_DAY; – this statement ends up in WAIT FOR COMMIT
Solution
Currently, the only solution is to break up Transaction 1 into multiple transactions by performing a COMMIT or ROLLBACK after the initial read access. However, things get more complicated when the read and write operations are concentrated within a single statement (i.e. MERGE or INSERT from SELECT). In the latter case it has proven helpful to 'outsource' the reading part by using IMPORT as a subselect to fetch the required data through a separate transaction.
Additional References
https://community.exasol.com/t5/database-features/transaction-system/ta-p/1522
https://community.exasol.com/t5/database-features/filter-on-system-tables-and-transaction-conflicts/ta-p/1232
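Referring to the 'outsource the reading part' workaround above, a hedged sketch (the loopback connection my_exasol, the tables and the column list are all illustrative):

    -- The IMPORT fetches the data through a separate transaction, so the
    -- surrounding INSERT no longer holds a long-lived read lock on source_table:
    INSERT INTO target_table
    SELECT * FROM (
        IMPORT INTO (id DECIMAL(18,0), val VARCHAR(100))
        FROM EXA AT my_exasol
        STATEMENT 'SELECT id, val FROM source_table'
    );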
This article describes the difference between DELETE and REORGANIZE
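In a nutshell (a hedged sketch; the table name is illustrative): DELETE only marks rows as deleted, while REORGANIZE rebuilds the table and actually reclaims the space:

    DELETE FROM sales WHERE sale_date < DATE '2015-01-01';  -- rows are only marked as deleted
    REORGANIZE TABLE sales;                                 -- rewrites the table and frees the space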
This article explains how to analyze query performance using profiling.
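The usual workflow, as a hedged sketch (session parameter and statistics table per the Exasol documentation; treat the exact names as assumptions to verify against your version):

    ALTER SESSION SET PROFILE = 'ON';
    -- ... run the statement to be analyzed ...
    ALTER SESSION SET PROFILE = 'OFF';
    FLUSH STATISTICS;
    SELECT * FROM EXA_USER_PROFILE_LAST_DAY
    WHERE SESSION_ID = CURRENT_SESSION;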
Problem
We want to replace all NULL values within a certain table with the default value specified for the corresponding column, and we don't want to create the necessary update statements manually.
Solution
A small metadata-driven procedure script (Lua) that issues the required update statements (see attachment).
Notes on transactions: The script performs a rollback after the metadata request, to avoid a read-write conflict scenario. It performs all updates within a single transaction and will not abort when an update on one of the columns fails. It performs a commit when all columns have been handled, regardless of any errors encountered.
Notes on column selection: The script includes all columns that have a DEFAULT value set. It excludes all columns with a NOT NULL constraint (ignoring the actual state of the constraint); obviously, such a column cannot contain any NULL values that need updating.
Notes on row selection: Due to Exasol's memory management and data processing, the script handles each column separately. This minimizes both the amount of memory required for processing and the amount of data blocks being written. The script does not contain any delta functionality; it will process all rows of the table each time it is called.
Installation
Just create the script in any schema you like (CREATE SCRIPT permission required). It does not have any dependencies.
create or replace /* procedure */ script REPLACE_NULL( schema_name, table_name ) ...
Usage
When calling the script, it expects two parameters, a schema name and a table name:
execute script REPLACE_NULL( 'my schema', 'my table' );
💡 Both schema and table name are expected as string and will be case-sensitive.
Example
open schema SR9000; -- Rows affected: 0
create table Invoice( invoice_id int, invoice_date date default date '2017-01-01' ); -- Rows affected: 0
insert into Invoice values (1, null), (2, null), (3, '2017-02-01'); -- Rows affected: 3
execute script REPLACE_NULL( 'SR9000', 'Invoice' );
-- [43000] "No columns found for "SR9000"."Invoice"" caught in script "SR9000"."REPLACE_NULL" at line 23 (Session: 1585944483210400591)
... yes: we created the table using a regular identifier, so it ended up as uppercase.
execute script REPLACE_NULL( 'SR9000', 'INVOICE' );
This returns:
COLUMN_NAME   COLUMN_TYPE  COLUMN_DEFAULT                      UPDATE_RESULT
INVOICE_DATE  DATE         TO_DATE('2017-01-01','YYYY-MM-DD')  2 rows updated
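For illustration, the statement the script effectively issues for the example column (a hedged reconstruction, derived from the result row above):

    UPDATE "SR9000"."INVOICE"
       SET "INVOICE_DATE" = TO_DATE('2017-01-01','YYYY-MM-DD')  -- the column's DEFAULT
     WHERE "INVOICE_DATE" IS NULL;  -- updates the 2 NULL rows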
IMPORT of a timestamp column fails because 0000-00-00 00:00:00, which is a valid timestamp in MySQL, is invalid as a timestamp in Exasol.
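A common workaround (a hedged sketch; the connection and table names are illustrative) is to map MySQL's zero timestamp to NULL on the source side of the IMPORT:

    IMPORT INTO target_table
    FROM JDBC AT my_mysql_conn
    STATEMENT 'SELECT id, NULLIF(ts, ''0000-00-00 00:00:00'') AS ts FROM source_table';

Alternatively, the MySQL JDBC driver option zeroDateTimeBehavior=convertToNull achieves the same at the driver level.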
This article gives an example of how Exasol's Skyline Feature can be used
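Skyline is exposed through the PREFERRING clause; a minimal sketch (illustrative table and columns):

    -- all cars that are not dominated by another car in both power and price:
    SELECT * FROM cars
    PREFERRING HIGH power PLUS LOW price;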
Background
This solution is currently work in progress, but published anyway.
A workaround for aggregation with KEEP FIRST/LAST is given in this solution; we will try to add a workaround for analytic functions with KEEP FIRST/LAST.
The joins in our examples do not take care of potential NULL values; for NULL values a join condition using NVL or similar should be used.
No warranty for correctness.
How to transform the following select to something supported in Exasol?
Note: the following transformation is an advanced topic; understanding how to rewrite the query will take a considerable amount of time.
SELECT <aggregate_function>(<aggregate_args>) KEEP (DENSE_RANK FIRST/LAST ORDER BY <order_expr>),
       <other_select_list_exprs>
FROM <from_clause>
WHERE <where_clause>
GROUP BY <group_by_cols>
HAVING <having_clause>
DENSE_RANK KEEP FIRST/LAST calculates the <aggregate_function> only on those values of <aggregate_args> that have the highest/lowest values in <order_expr>. In order to rewrite the statement we need to calculate:
Step 1: <other_select_list_exprs> with the original GROUP BY.
Step 2: <aggregate_function>(<aggregate_args>) for a GROUP BY <group_by_cols>, <order_expr>. This computes the aggregates for each value of <order_expr>, so among others – which will be discarded – there is the computation for the FIRST and LAST value.
Step 3: The minimum/maximum value of <order_expr> for the original GROUP BY. This is needed to determine which subgroup to select from the finer aggregates.
The values for the original select from (1) will then be enriched with values from (2) by a join on the GROUP BY columns of the original select, together with the attribute for selecting exactly the <aggregate_function> value of the subaggregation for the FIRST/LAST value.
So basically, the rewriting looks like this:
WITH original_aggregation AS (
    SELECT <other_select_list_exprs>,
           min/max(<order_expr>) AS first_last_selector,
           <group_by_cols>
    FROM <from_clause>
    WHERE <where_clause>
    GROUP BY <group_by_cols>
    HAVING <having_clause>
), subaggregate_for_first_last AS (
    SELECT <aggregate_function>(<aggregate_args>) AS afk,
           <group_by_cols>,
           <order_expr>
    FROM <from_clause>
    WHERE <where_clause>
    GROUP BY <group_by_cols>, <order_expr>
    HAVING <having_clause> -- not needed and not accurate, but filters anyway
)
-- Now the join
SELECT sub_agg.afk, orig_agg.<other_select_list_exprs>
FROM original_aggregation AS orig_agg
LEFT JOIN subaggregate_for_first_last AS sub_agg
  ON orig_agg.<group_by_cols> = sub_agg.<group_by_cols>
 AND orig_agg.first_last_selector = sub_agg.<order_expr>
To make the computation faster, we don't use a separate select for (1) and (2), but combine the calculation in a single select with different GROUPING SETs and distinguish them by the different values of the GROUPING function (see the manual for more details):
WITH all_aggregations AS (
    SELECT <aggregate_function>(<aggregate_args>) AS afk,
           <other_select_list_exprs>,
           min/max(<order_expr>) AS first_last_selector,
           <group_by_cols>,
           <order_expr>,
           GROUPING(<group_by_cols>, <order_expr>) AS group_selector
    FROM <from_clause>
    WHERE <where_clause>
    GROUP BY GROUPING SETS ((<group_by_cols>), (<group_by_cols>, <order_expr>))
    HAVING <having_clause>
), original_aggregation AS (
    SELECT <other_select_list_exprs>,
           first_last_selector,
           <group_by_cols>
    FROM all_aggregations
    WHERE group_selector = <GROUPING value for aggregate on <group_by_cols> only>
), subaggregate_for_first_last AS (
    SELECT afk, <group_by_cols>, <order_expr>
    FROM all_aggregations
    WHERE group_selector = <GROUPING value for aggregate on <group_by_cols> and <order_expr>>
)
-- Now the join
SELECT sub_agg.afk, orig_agg.<other_select_list_exprs>
FROM original_aggregation AS orig_agg
LEFT JOIN subaggregate_for_first_last AS sub_agg
  ON orig_agg.<group_by_cols> = sub_agg.<group_by_cols>
 AND orig_agg.first_last_selector = sub_agg.<order_expr>
Example of a simple transformation
-- Sample data table
CREATE TABLE T (i int, gc int, j int, k int);
INSERT INTO T VALUES (1, 1, 3, 1), (2, 1, 5, 1), (3, 1, 7, 2), (4, 1, 9, 3), (5, 1, 11, 3), (6, 2, 7, 1), (7, 2, 3, 2);
-- This select
SELECT max(i) AS col,
       sum(j) KEEP (DENSE_RANK FIRST ORDER BY k) kdrf
FROM t
GROUP BY gc;
-- With result
-- col  kdrf
-- 5    8  (= 3+5; values 7, 9 and 11 are not aggregated)
-- 7    7  (= 7; value 3 is not aggregated)
-- Is transformed to
WITH all_aggregations AS (
    SELECT sum(j) AS afk,
           max(i) AS col,
           min(k) AS first_last_selector,
           gc,
           k,
           GROUPING(gc, k) AS group_selector
    FROM t
    GROUP BY GROUPING SETS ((gc), (gc, k))
), original_aggregation AS (
    SELECT col, first_last_selector, gc
    FROM all_aggregations
    WHERE group_selector = 1
), subaggregate_for_first_last AS (
    SELECT afk, gc, k
    FROM all_aggregations
    WHERE group_selector = 0
)
-- Now the join
SELECT orig_agg.col, sub_agg.afk AS kdrf
FROM original_aggregation AS orig_agg
LEFT JOIN subaggregate_for_first_last AS sub_agg
  ON orig_agg.gc = sub_agg.gc
 AND orig_agg.first_last_selector = sub_agg.k;
Additional Notes
What to do with an aggregated select (no GROUP BY): the case that KEEP DENSE_RANK FIRST/LAST is used in an aggregated select without GROUP BY is a simpler special case of the case with GROUP BY. In that case we add GROUP BY GROUPING SETS ((), (<order_expr>)) and join only on first_last_selector = <order_expr>.
Essentially everything is the same in that case, only we can omit <group_by_cols> wherever they were used.
Transformation for multiple FIRST/LAST functions
Note that in the case of a single KEEP DENSE_RANK FIRST/LAST function, the value of the group_selector in the WITH-clause subaggregate_for_first_last is 0. However, we can do the same transformation for multiple KEEP DENSE_RANK FIRST/LAST functions. In the general case we need a separate subaggregation for each KEEP DENSE_RANK FIRST/LAST function – which means another GROUPING SET (<group_by_cols>, <new_order_expr>), another WITH-clause with selection of the correct GROUPING, and another LEFT JOIN on <group_by_cols> and <first_last_selector> for that <new_order_expr>. A few tweaks can be applied to the general case in some situations:
If multiple KEEP DENSE_RANK FIRST/LAST functions share the same <order_expr>, their <aggregate_function> values can be computed in the same subaggregate select (and same GROUPING SET), but might still need a separate LEFT JOIN each.
If they further share the same value of FIRST/LAST and have the same ORDER BY direction, a single LEFT JOIN on this select is enough to get all the correct values.
Example of a transformation with several FIRST/LAST
Matrix for first_last_selector
The following expressions need to be used as first_last_selector, depending on FIRST/LAST and the ORDER BY clause:
ASC [NULLS LAST]: FIRST → min(<order_expr>); LAST → CASE WHEN COUNT(<order_expr>) != COUNT(*) THEN NULL ELSE max(<order_expr>) END
DESC [NULLS FIRST]: FIRST → CASE WHEN COUNT(<order_expr>) != COUNT(*) THEN NULL ELSE max(<order_expr>) END; LAST → min(<order_expr>)
ASC NULLS FIRST: FIRST → CASE WHEN COUNT(<order_expr>) != COUNT(*) THEN NULL ELSE min(<order_expr>) END; LAST → max(<order_expr>)
DESC NULLS LAST: FIRST → max(<order_expr>); LAST → CASE WHEN COUNT(<order_expr>) != COUNT(*) THEN NULL ELSE min(<order_expr>) END
Multiple <order_by_expr>s
If there are multiple <order_expr>s for a single KEEP DENSE_RANK FIRST/LAST function, things get more complicated. In that case we can try to combine them all into a single value by concatenating them (with padding, so that the order of the concatenated values matches the order of the original value tuples). This will work as long as all columns have the same ORDER BY direction. Otherwise we might need to add a subselect around the <from_clause> containing <from>.* and RANK() OVER (ORDER BY <order_by_expr>s) AS rnk. Then we can use MIN(CASE WHEN rnk = 1 THEN <order_by_expr> ELSE NULL END) for each <order_by_expr> to get multiple first_last_selectors for the LEFT JOIN. Of course, this would need to be done for each KEEP DENSE_RANK FIRST/LAST function with multiple <order_by_expr>s.
Additional References
https://www.exasol.com/support/browse/IDEA-65
This article shows you how you can create functions to work with the Bing Maps Tile System
Problem
As documented (see https://docs.exasol.com/sql/select.htm), the CONNECT BY cycle detection works differently from Oracle's implementation. We recommend changing your logic accordingly; however, if you want to reduce the Exasol results so that they match the Oracle results, you can do so using this guide.
Diagnosis
Example for cycle detection
Here is a small example to explain the difference between Exasol and Oracle. Given this table (columns name, id, manages):
insert into employees values ('B', 1, 2);
insert into employees values ('E', 3, 4);
insert into employees values ('C', 4, 2);
insert into employees values ('C', 4, 5);
insert into employees values ('A', 2, 99);
insert into employees values ('A', 2, 3);
insert into employees values ('D', 100, 1);
insert into employees values ('F', 99, NULL);
You want to get the chain of command for Peter's manager. Peter is directly managed by a 3rd-line manager who is beneath the CEO. However, in this company the 3rd-line manager is also managed by the 1st-line manager, which generates a cycle. When you run this query in Exasol you get the following result:
SELECT sys_connect_by_path(name, '-') MANAGER_PATH,
       CONNECT_BY_ISCYCLE ISCYCLE,
       CONNECT_BY_ISLEAF ISLEAF,
       level l
FROM employees
START WITH manages = 99
CONNECT BY NOCYCLE PRIOR id = manages
ORDER BY level;
MANAGER_PATH  ISCYCLE  ISLEAF  L
-A            0        0       1
-A-B          0        0       2
-A-C          0        0       2
-A-B-D        0        1       3
-A-C-E        0        0       3
-A-C-E-A      0        0       4
-A-C-E-A-B    0        0       5
-A-C-E-A-C    1        1       5
-A-C-E-A-B-D  0        1       6
The reason is that Exasol detects cycles in the current row. As the row with id 1 appears for the second time in the path, the cycle is detected; see path -A-C-E-A-C.
Oracle does not detect cycles in the current row, but checks whether the child row has a cycle. For the same query the result is:
MANAGER_PATH  ISCYCLE  ISLEAF  L
-A            0        0       1
-A-B          0        0       2
-A-C          0        0       2
-A-B-D        0        1       3
-A-C-E        0        0       3
The cycle is detected in row .5.1.3, because row 3 has a child (row 4) which has a child that is also an ancestor (row 1). This is a kind of double look-ahead.
Solution
This solution describes how to emulate Oracle's behaviour in Exasol. If you want the same behaviour in Exasol, you have to backtrack two levels from the cycle and remove everything that is a child of the row that Oracle marks as cycle. To do this you can use the following pattern. Your query has these parts:
SELECT <original SELECT list>
FROM <original TABLE OR VIEW/subselect>
<original CONNECT BY clause>
<original ORDER BY clause>
Apply this pattern:
WITH base_table AS (
    SELECT ROW_NUMBER() OVER (ORDER BY rowid) AS row_num,
           <original SELECT list>
    FROM <original TABLE OR VIEW/subselect>
), add_cols_for_cycle_detection AS (
    SELECT <original SELECT list>,
           -- extra columns for Oracle-like cycle detection
           sys_connect_by_path(row_num, '.') AS cd_scbp,
           row_num,
           PRIOR row_num cd_prior_row_num,
           CONNECT_BY_ISCYCLE AS cd_cycle
    FROM base_table
    <original CONNECT BY clause>
), parent_of_cycle AS (
    SELECT cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE cd_cycle = 1
), ora_cycle_start AS (
    SELECT cd_scbp, cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE row_num IN (SELECT cd_prior_row_num FROM parent_of_cycle)
), ora_like_cb AS (
    SELECT *
    FROM add_cols_for_cycle_detection
    WHERE NOT EXISTS (
        SELECT 1
        FROM ora_cycle_start
        WHERE cd_scbp = SUBSTRING(add_cols_for_cycle_detection.cd_scbp, 0, len(cd_scbp))
    )
)
SELECT <original SELECT list>
FROM ora_like_cb
<original ORDER BY clause>;
Applied to our example query, the resulting query is:
WITH base_table AS (
    SELECT ROW_NUMBER() OVER (ORDER BY rowid) AS row_num,
           name, id, manages
    FROM employees
), add_cols_for_cycle_detection AS (
    SELECT sys_connect_by_path(name, '-') MANAGER_PATH,
           CONNECT_BY_ISCYCLE ISCYCLE,
           CONNECT_BY_ISLEAF ISLEAF,
           LEVEL l,
           -- extra columns for Oracle-like cycle detection
           sys_connect_by_path(row_num, '.') AS cd_scbp,
           row_num,
           PRIOR row_num cd_prior_row_num,
           CONNECT_BY_ISCYCLE AS cd_cycle
    FROM base_table
    START WITH manages = 99
    CONNECT BY NOCYCLE PRIOR id = manages
), parent_of_cycle AS (
    SELECT cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE cd_cycle = 1
), ora_cycle_start AS (
    SELECT cd_scbp, cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE row_num IN (SELECT cd_prior_row_num FROM parent_of_cycle)
), ora_like_cb AS (
    SELECT *
    FROM add_cols_for_cycle_detection
    WHERE NOT EXISTS (
        SELECT 1
        FROM ora_cycle_start
        WHERE cd_scbp = SUBSTRING(add_cols_for_cycle_detection.cd_scbp, 0, len(cd_scbp))
    )
)
SELECT MANAGER_PATH, ISCYCLE, ISLEAF, L
FROM ora_like_cb
ORDER BY ISLEAF;
The result is the same as in Oracle:
MANAGER_PATH  ISCYCLE  ISLEAF  L
-A            0        0       1
-A-B          0        0       2
-A-C          0        0       2
-A-C-E        0        0       3
-A-B-D        0        1       3
If you also want CONNECT_BY_ISCYCLE to work like in Oracle, you have to extend the pattern by another CTE, ora_cycle:
WITH base_table AS (
    SELECT ROW_NUMBER() OVER (ORDER BY rowid) AS row_num,
           name, id, manages
    FROM employees
), add_cols_for_cycle_detection AS (
    SELECT sys_connect_by_path(name, '-') MANAGER_PATH,
           CONNECT_BY_ISCYCLE ISCYCLE,
           CONNECT_BY_ISLEAF ISLEAF,
           LEVEL l,
           -- extra columns for Oracle-like cycle detection
           sys_connect_by_path(row_num, '.') AS cd_scbp,
           row_num,
           PRIOR row_num cd_prior_row_num,
           CONNECT_BY_ISCYCLE AS cd_cycle
    FROM base_table
    START WITH manages = 99
    CONNECT BY NOCYCLE PRIOR id = manages
), parent_of_cycle AS (
    SELECT cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE cd_cycle = 1
), ora_cycle_start AS (
    SELECT cd_scbp, cd_prior_row_num
    FROM add_cols_for_cycle_detection
    WHERE row_num IN (SELECT cd_prior_row_num FROM parent_of_cycle)
), ora_like_cb AS (
    SELECT *
    FROM add_cols_for_cycle_detection
    WHERE NOT EXISTS (
        SELECT 1
        FROM ora_cycle_start
        WHERE cd_scbp = SUBSTRING(add_cols_for_cycle_detection.cd_scbp, 0, len(cd_scbp))
    )
), ora_cycle AS (
    SELECT ora_like_cb.*,
           decode(ora_cycle_start.cd_prior_row_num, NULL, 0, 1) AS cyc
    FROM ora_like_cb
    LEFT JOIN ora_cycle_start
      ON ora_like_cb.row_num = ora_cycle_start.cd_prior_row_num
)
SELECT MANAGER_PATH, ISCYCLE, ISLEAF, l, cyc
FROM ora_cycle;
Result:
MANAGER_PATH  ISCYCLE  ISLEAF  L  CYC
-A            0        0       1  0
-A-B          0        0       2  0
-A-C          0        0       2  0
-A-B-D        0        1       3  0
-A-C-E        0        0       3  1
This article shows you how to calculate the backup duration from the database
This article shows what happens behind the scenes when JOINing on a DECIMAL datatype.