Facing a 'syntax error' when trying to write data into a table using Spark

LordBertson
Contributor

Hi all, I am facing an issue when attempting to write a DataFrame to a table using the Exasol JDBC driver.

When executing the following statement,

test_table.write.format('jdbc').options(
    url='jdbc:exa:'+exasol_adress, 
    driver='com.exasol.jdbc.EXADriver', 
    dbtable='TEST.CSV_TABLE', 
    user='sys', 
    password='exasol', 
    create_table='true').mode('append').save()

the following error is thrown:

java.sql.SQLSyntaxErrorException: syntax error, unexpected IDENTIFIER_LIST_, expecting ',' or ')'

It seems to me as if this was translated into SQL incorrectly by the driver, resulting in a syntax error.

As for setup details that might be relevant: I am using Exasol JDBC driver 7.0.4 with Spark 3.0.1.

Please advise.


6 REPLIES

exa-Chris
Exasol Alumni

Hi @LordBertson,
I will nudge some people to get an answer for you.
Rgds
@exa-Chris 

LordBertson
Contributor

Hi @exa-Chris 

Thanks. I am doing some tinkering of my own, and I am starting to suspect it's caused by some unescaped quotes that eat up a comma somewhere.

If I arrive at any results, I'll post them here.

Best Regards
Peter

LordBertson
Contributor

Okay, it's weirder than that. It attempts to execute the following query:

CREATE TABLE TESTS.CSV_TABLE ("ID" TEXT , 
                               "NAME" TEXT , 
                               "LASTNAME" TEXT )

Exasol, however, does not know the type TEXT.
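
The CREATE TABLE above is generated by Spark's generic JDBC writer, which maps Spark's StringType to TEXT unless the JDBC dialect overrides it. As a sketch of one possible workaround (not from this thread): Spark's documented createTableColumnTypes write option replaces those default DDL types. The variable names are the ones from the original post; the VARCHAR lengths are illustrative guesses.

test_table.write.format('jdbc').options(
    url='jdbc:exa:' + exasol_adress,
    driver='com.exasol.jdbc.EXADriver',
    dbtable='TEST.CSV_TABLE',
    user='sys',
    password='exasol',
    # Override the TEXT columns Spark would otherwise put into the DDL.
    createTableColumnTypes='ID VARCHAR(64), NAME VARCHAR(256), LASTNAME VARCHAR(256)'
).mode('append').save()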

exa-Nico
Community Manager
(Accepted Solution)

What happens if you create the table in advance (using varchars) and then set create_table='false'?

Sports Enthusiast. Database Guy. Member of Team Exasol.
Having trouble? Just let me know!

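For concreteness, a minimal sketch of this workaround. The DDL step uses the pyexasol package purely as one convenient way to run SQL against Exasol (it is not part of this thread; any SQL client works). The column names come from the generated CREATE TABLE above, and the VARCHAR lengths are illustrative.

import pyexasol

# Step 1: create the target table up front with explicit VARCHAR columns,
# so that no DDL has to be generated on the Spark side at all.
# dsn is 'host:port', the same address as in the JDBC URL.
ddl = pyexasol.connect(dsn=exasol_adress, user='sys', password='exasol')
ddl.execute("""
    CREATE TABLE IF NOT EXISTS TEST.CSV_TABLE (
        "ID"       VARCHAR(64),
        "NAME"     VARCHAR(256),
        "LASTNAME" VARCHAR(256)
    )
""")
ddl.close()

# Step 2: append into the pre-created table; create_table='false' asks the
# driver side not to attempt any table creation of its own.
test_table.write.format('jdbc').options(
    url='jdbc:exa:' + exasol_adress,
    driver='com.exasol.jdbc.EXADriver',
    dbtable='TEST.CSV_TABLE',
    user='sys',
    password='exasol',
    create_table='false').mode('append').save()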

LordBertson
Contributor

Sorry for the late response. That's exactly what I did, and it worked like a charm.

luchughes
Contributor

I'm having the same issue, and this isn't solving it:

df.write \
        .format("jdbc") \
        .option("url", "jdbc:exa:"+host+":8563") \
        .option("dbtable", "analytics.contacts_test_bks") \
        .option("user", "kvm_user") \
        .option("password", password) \
        .option("create_table", "false") \
        .save()

results in the following:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
/tmp/ipykernel_25992/4258359192.py in 
      6         .option("user", "kvm_user") \
      7         .option("password", password) \
----> 8         .option("create_table", "false") \
      9         .save()

/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
    823             self.format(format)
    824         if path is None:
--> 825             self._jwrite.save()
    826         else:
    827             self._jwrite.save(path)

/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
    126     def deco(*a, **kw):
    127         try:
--> 128             return f(*a, **kw)
    129         except py4j.protocol.Py4JJavaError as e:
    130             converted = convert_exception(e.java_exception)

/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o143.save.
: java.sql.SQLSyntaxErrorException: syntax error, unexpected IDENTIFIER_LIST_, expecting ',' or ')' [line 1, column 55] (Session: 1718606402551087104)
	at com.exasol.jdbc.ExceptionFactory.createSQLException(ExceptionFactory.java:39)
	at com.exasol.jdbc.EXASQLException.getSQLExceptionIntern(EXASQLException.java:50)
	at com.exasol.jdbc.AbstractEXAStatement.execute(AbstractEXAStatement.java:478)
	at com.exasol.jdbc.EXAStatement.execute(EXAStatement.java:289)
	at com.exasol.jdbc.AbstractEXAStatement.exaExecuteLargeUpdate(AbstractEXAStatement.java:632)
	at com.exasol.jdbc.EXAStatement.executeLargeUpdate(EXAStatement.java:425)
	at com.exasol.jdbc.EXAStatement.executeUpdate(EXAStatement.java:480)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:881)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:81)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:175)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:213)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:210)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:171)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:122)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:121)
	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:963)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:963)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:415)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:399)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
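
One observation on the trace above: the failing frame is Spark's own org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils.createTable, so the plain 'jdbc' format appears to ignore create_table and only reaches that CREATE TABLE path when it does not find the target table. A sketch of a variant that stays out of that path, assuming the table was created in advance in Exasol and reusing the host and password variables from the snippet above (the table name is written upper-case here to match how Exasol stores unquoted identifiers):

# Sketch: with the table pre-created, mode("append") makes Spark insert
# into the existing table instead of erroring out or generating DDL.
df.write \
    .format("jdbc") \
    .option("url", "jdbc:exa:" + host + ":8563") \
    .option("driver", "com.exasol.jdbc.EXADriver") \
    .option("dbtable", "ANALYTICS.CONTACTS_TEST_BKS") \
    .option("user", "kvm_user") \
    .option("password", password) \
    .mode("append") \
    .save()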