
PYSPARK SQL ODBC connection

I already have an ODBC connection from Python to SQL Server. I wish to use PySpark to run queries; how can I use my current connection with PySpark?

thanks

Your question is quite broad, but here goes. Spark cannot reuse an existing ODBC connection object (such as one from pyodbc); it opens its own connections through JDBC. You can read from a SQL database using:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# sql_flavour, ip, port, database, username and password are placeholders
# for your own connection details; for SQL Server, sql_flavour is "sqlserver".
df = (
  spark.read.format("jdbc")
       .option("url", f"jdbc:{sql_flavour}://{ip}:{port};databaseName={database}")
       .option("dbtable", "table_name")
       .option("user", username)
       .option("password", password)
       .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
       .load()
)
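For SQL Server specifically, the url string uses semicolon-separated properties. A small helper makes the pieces explicit (the function name and example values below are my own, not part of Spark):

```python
def sqlserver_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a SQL Server JDBC URL; properties are semicolon-separated."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"

# e.g. sqlserver_jdbc_url("10.0.0.5", 1433, "sales")
# → "jdbc:sqlserver://10.0.0.5:1433;databaseName=sales"
```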

I suppose the important bit is to use the JDBC format, but to specify your driver explicitly. If you run into issues with this, you might need to download the Microsoft JDBC driver jar and make it available to Spark. Hope this helps. Please try to include a code snippet or an example of what you tried next time.
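Since you want to run queries rather than pull whole tables, note that Spark's JDBC source also accepts a query option (Spark 2.4+) instead of dbtable, which pushes the SQL down to the server. A sketch, with placeholder connection details of my own invention:

```python
# Placeholder connection details for illustration only; substitute your own.
jdbc_options = {
    "url": "jdbc:sqlserver://10.0.0.5:1433;databaseName=sales",
    "query": "SELECT TOP 100 * FROM dbo.orders WHERE status = 'open'",
    "user": "username",
    "password": "password",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# With a live SparkSession and a reachable database, you would then run:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
# Note: the `query` and `dbtable` options are mutually exclusive.
```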

