PYSPARK SQL ODBC connection

Question

I already have an ODBC connection from Python to SQL Server. I would like to use PySpark to run queries. How can I use my current connection with PySpark?

Thanks

Answer 1

Your question is quite broad, but here goes. You can read from a SQL database using:

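from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder connection details; replace with your own
sql_flavour = "sqlserver"  # JDBC subprotocol for SQL Server
ip, port, database = "localhost", 1433, "my_database"
username, password = "my_user", "my_password"

# Read a whole table over JDBC into a Spark DataFrame
df = (
    spark.read.format("jdbc")
    .option("url", f"jdbc:{sql_flavour}://{ip}:{port};databaseName={database}")
    .option("dbtable", "table_name")
    .option("user", username)
    .option("password", password)
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)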

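The important part is to use the JDBC format and to specify the driver class. Spark talks to the database through JDBC rather than ODBC, so an existing ODBC connection object cannot be handed to Spark; you supply the connection details again as reader options. If you run into issues with this, you might need to download the driver jar and make it available to Spark. Hope this helps. Next time, please include a code snippet or an example of what you tried.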
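Since the question is about running queries rather than loading a whole table, here is a minimal sketch of one way to do that, assuming Spark 2.4 or later, where the JDBC reader accepts a "query" option in place of "dbtable". The helper names, host, database, credentials, and column names are all hypothetical placeholders, not anything from the question.

```python
def sqlserver_jdbc_options(host, port, database, user, password, query):
    """Build the option dict for Spark's JDBC reader (hypothetical helper)."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        # "query" and "dbtable" cannot be set together, so only one is used
        "query": query,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def run_query(spark, options):
    """Run the query on SQL Server and return the result as a Spark DataFrame."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder connection details and query; replace with your own
opts = sqlserver_jdbc_options(
    host="localhost",
    port=1433,
    database="my_database",
    user="my_user",
    password="my_password",
    query="SELECT id, name FROM dbo.table_name WHERE id > 100",
)
```

With a session created as above, run_query(spark, opts) should return the query result as a DataFrame, provided the SQL Server JDBC driver jar is on Spark's classpath.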
