19-11-2020
05:57 PM
- last edited on
20-11-2020
08:58 AM
by
exa-Nico
Hi Nico...
Yes, I having a little trouble loading data to Exasol.
We have several data files, txt or csv, the issue is that the first line of every file is different to the rest... i mean, I have a data file, but the first line is the control data with 5 or 6 columns, and from from second row and on, is the data.... I have no problem loading data from second rows on, but i need to read the first row and only the first row of every file, to populate a control table on exasol.
Can you help me with this.??
Best Regards
Alejandro
20-11-2020 09:00 AM
Hey @adiazv so you only want to insert the first row of data? Any you want to ignore the rest of the file?
Does the rest of the file look exactly the same as the first row?
20-11-2020 01:36 PM
Hi @exa-Nico yes, I only want to insert into a control table the first row of data file, from the second row and on, go to data table...
The File has the first row different from the rest rows.... like:
1 Row :DATAFILE_01_20201120.csv, 2020-11-20, 5, 0
2 Row: 2020-11-20,"ABCD",123,456,"COUNTRY",13413,1324134,2432435,2345245,2345245,23452435
3 Row: 2020-11-20,"ABCD",123,456,"COUNTRY",13413,1324134,2432435,2345245,2345245,23452435
4 Row: 2020-11-20,"ABCD",123,456,"COUNTRY",13413,1324134,2432435,2345245,2345245,23452435
5 Row: 2020-11-20,"ABCD",123,456,"COUNTRY",13413,1324134,2432435,2345245,2345245,23452435
6 Row: 2020-11-20,"ABCD",123,456,"COUNTRY",13413,1324134,2432435,2345245,2345245,23452435
I can load the data (from row 2 to 6) to data table, but i need to load the row 1 and only the row 1 to a control table..
Best Regards
Alejandro
20-11-2020 01:47 PM
You could use the "ERRORS INTO" clause in the IMPORT statement to do this. Since the columns are different, for the control table, you can specify to import the file and say that any errors will be put into a table instead of the entire statement failing. This will lead to only one row being inserted - the header - because the rest have too many columns. Here's an example:
CREATE TABLE TEST.CONTROL_TABLE (CSV_NAME VARCHAR(2000), DATE_INSERTED DATE, NUMBER_OF_ROWS INT, EXTRA_COLUMN INT);
CREATE TABLE TEST.DATA_TABLE (C1 DATE, C2 VARCHAR(20000), C3 INT, C4 INT, C5 VARCHAR(20000), C6 INT, C7 INT, C8 INT, C9 INT, C10 INT, C11 INT);
IMPORT INTO TEST.CONTROL_TABLE FROM LOCAL CSV FILE 'C://Temp/test.csv' row separator = 'CRLF' errors into test.error_table replace reject limit unlimited;
IMPORT INTO TEST.DATA_TABLE FROM LOCAL CSV FILE 'C://Temp/test.csv' row separator = 'CRLF' skip=1;
select * from test.control_table;
select * from test.data_table;
The IMPORT documentation gives you more information about how to use it. Basically, you will then perform 2 IMPORTs using the same file, one for the control table, and one for the data. Since you don't care about the errors, you can just delete or ignore the error_table then and all of the entries.
Hope it helps!
01-12-2020 01:41 PM
Hi @exa-Nico, thats is a great idea.... I have a UDF that do the load... so, i have to modify it a little to load all errors on different tables....
The UDF from @Gallus is good too, but like say @exa-Chris maybe @Gallus can share that.
Thanks to all
And I think that is a good idea, modify the Exasol Command "IMPORT" to allow import only few rows of data, like "SKIP=1 or 2 or n", maybe "READ=1, or 10, or 1000"
Regards
Alejandro
20-11-2020 02:57 PM
@adiazv ,
Can you try something like below.
From File_1.txt, create another file File_1_header.txt by running a shell script, then upload both files separately.
Thanks,
Sreekanth
24-11-2020 03:23 PM
Hi @skarangi , I have already think in that solution, but I think maybe exist another solution directly on Import command... I mean directly on Exasol Commands.
A new question, is there an option to use "IO." LUA commands in Exasol UDF? or Execute and Powershell Script or SH script directly from UDF?
Regards
Alejandro
24-11-2020 03:43 PM
Hi ,Yes from PYTHON UDF this can be achieved. I believe LUA can also be used but never tried before.
29-11-2020 12:32 PM
Yes, we do something similar, we actually have >2'000 small files (10-50 rows) which we want to load all into the same table, each of the files has 2 header rows(1 row column names, 2 row datatype definition). The files reside on an sftp-server.
We have built a python UDF, which merges the different files together, and cuts off the two first lines of each file. In order to do so we had to use some sftp-python libraries, which were not part of the Exasol-Python standard. As a consequence we had to build our own python container
30-11-2020 09:05 AM - edited 30-11-2020 09:13 AM
Hi Gallus, are you allowed to share that Python Container somewhere? Rgds Christian
Then it's time to become part of a unique family! Discover helpful tips and support other Community members with your knowledge.
Sign In