
Connection to Spark(SQL)

Hej


I am trying to connect to the Spark endpoint of the UK's statistics office. Setting up a generic driver via Maven resulted in the following warning, although to me it looks more like a showstopper.


Could not create driver 'SparkSQL'.
Possible due to wrong or missing jars, try to check 'Download dependencies' in Maven Editor
Details org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.<init>()

[Exception Chain]:
1: Root-> (NoSuchMethodException) org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.<init>()

 


As far as I understand, the dependencies are set to be downloaded. In fact, it does download "something", and it actually finds a driver class.

image


Well, it finds something when the extension is left empty or when jar is selected there. With the other extensions, no DriverWrapper is found; in fact, nothing is found at all.

image



Does anyone have an idea?


I also tried DBeaver, in vain. Selecting Spark there results in a Hive driver being used, which is strange, so I tried the Hive driver in DbVis too, with the same result as in DBeaver: Could not open client transport with JDBC Uri: jdbc:hive2://https:-1//statistics.data.gov.uk/sparql:10000/default: Cannot open without port.


Cheers


Thiemo


Hi Thiemo,


What the ONS provides is a SPARQL endpoint. Aside from the confusingly similar names, SPARQL is unrelated to Spark SQL, which in turn means that Spark/Hive drivers won't work.
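
To illustrate the difference: a SPARQL endpoint is queried over plain HTTP according to the SPARQL 1.1 Protocol, not over a Spark/Hive Thrift connection. The following is only a minimal sketch using the Python standard library. The endpoint URL is the one quoted earlier in this thread, and whether it is reachable and returns JSON results is an assumption; the helper names `build_sparql_request` and `run_select` are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the URL quoted earlier in the thread; that it is
# reachable and supports JSON results is an assumption.
ENDPOINT = "https://statistics.data.gov.uk/sparql"


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    # The query text travels URL-encoded in the "query" parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(endpoint, query, timeout=30):
    """Send a SELECT query and return its result bindings (needs network)."""
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


# Example call (performs a real HTTP request, so it is left commented out):
# for row in run_select(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5"):
#     print(row)
```

Nothing in this request involves a JDBC driver, which is why a JDBC-based tool needs a dedicated SPARQL driver to act as the bridge.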


Connecting DbVisualizer to a SPARQL endpoint would require a SPARQL JDBC driver. Some brief googling turned up the Apache Jena project, which provides a JDBC driver for connecting to SPARQL endpoints. Unfortunately, when I tried it out, I wasn't able to get DbVisualizer to recognize it as a valid JDBC driver.


Apache also lists a small number of alternatives to Jena. Whether or not these work better (or at all) is not something I can answer at this time.


(Disclaimer: DbVisualizer is not affiliated with Apache Jena or any of the other drivers. The links are provided for informational purposes only.)


Cheers,

Tomas



Hej Tomas


Thanks for looking into my issue. I have just learnt something again: I thought SPARQL and Spark SQL were just two names for the same thing.


I will probably look into Jena and its references to other JDBC drivers.


Thanks again


Ha det bra


Thiemo

I have not looked into the Jena alternatives mentioned by Apache; in fact, I did not look at them at all. But I have been able to connect to a SPARQL endpoint.

image


I set the driver up as follows.

image

The details about the Maven artifact can be found at https://jena.apache.org/documentation/jdbc/artifacts.html


image


I believe the URL format is irrelevant, because the Database URL of the connection has to be entered in full anyway.
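
For readers wondering what such a complete Database URL might look like, here is a small sketch that composes one. It assumes the remote-endpoint flavour of the Jena JDBC driver, whose URLs, as far as I read the Jena JDBC documentation, start with `jdbc:jena:remote:` and carry the SPARQL endpoint in a `query=` parameter, with `org.apache.jena.jdbc.remote.RemoteEndpointDriver` as the driver class. Please verify both against the Jena documentation; the exact settings used in the screenshots above are not reproduced here, and the helper name `jena_remote_jdbc_url` is made up for this example.

```python
# Assumed driver class of the Jena remote-endpoint JDBC driver
# (to be checked against the Jena JDBC documentation).
JENA_REMOTE_DRIVER_CLASS = "org.apache.jena.jdbc.remote.RemoteEndpointDriver"

# Assumed URL prefix for the remote-endpoint driver.
JENA_REMOTE_PREFIX = "jdbc:jena:remote:"


def jena_remote_jdbc_url(query_endpoint):
    """Compose a complete Database URL for a read-only SPARQL query endpoint."""
    if not query_endpoint.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) SPARQL endpoint URL")
    return JENA_REMOTE_PREFIX + "query=" + query_endpoint


# Endpoint taken from the URL quoted earlier in this thread.
print(jena_remote_jdbc_url("https://statistics.data.gov.uk/sparql"))
```

The resulting string would be pasted as a whole into the Database URL field of the connection.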


However, DbVis does not visualise anything; everything is quite empty. I am inclined to believe that this is in Jena's very nature: it only provides an interface through which one can send queries. Having said that, I am astonished there is anything to see at all.


Cheers


Thiemo
