
How to insert csv file contents to postgreSQL table using pySpark?

I want to insert data from a csv file into a postgreSQL table. I have written code for fetching data from the csv file like this:

myData = spark.read.format("csv").option("header","true").load("D:/sample.csv")
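
Note that with only the header option, Spark reads every CSV column as a string; since the target table has typed columns (for example an integer id), you may want to enable schema inference or supply an explicit schema so the types line up on insert. A minimal sketch, assuming the same D:/sample.csv with a header row:

# read the CSV and let Spark infer column types from the data
myData = spark.read.format("csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load("D:/sample.csv")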

I got the file contents in the 'myData' variable. I have written the database connection like the following:

url = 'postgresql://myPath'

properties = {
    "user": "postgres",
    "driver": "org.postgresql.Driver",
    "password": ""
}

df = DataFrameReader(sqlContext).jdbc(
    url='jdbc:%s' % url, table='pyspark_user', properties=properties
)
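
The same read can also be expressed through the SparkSession entry point instead of constructing a DataFrameReader from the sqlContext. A minimal sketch, assuming the same url and properties as above ('myPath' is a placeholder from the question, not a real host):

# equivalent JDBC read via the SparkSession (Spark 2.x+)
df = spark.read.jdbc(
    url='jdbc:%s' % url,
    table='pyspark_user',
    properties=properties
)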

When I print df, it shows like this:

DataFrame[id: int, firstname: string, lastname: string, email: string, password: string]

How can I insert the rows of 'myData' into the table 'pyspark_user'?

myData.write.format('jdbc').options(
    url='jdbc:%s' % url,
    driver='org.postgresql.Driver',
    dbtable='pyspark_user',
    user='postgres',
    password='').mode('append').save()
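
An equivalent write can reuse the properties dict defined earlier. A minimal sketch, assuming the CSV column names and types match the pyspark_user table (the empty password mirrors the question's setup); 'append' adds the rows without touching existing data, whereas 'overwrite' would replace the table contents:

# equivalent append using DataFrameWriter.jdbc and the existing properties dict
myData.write.jdbc(
    url='jdbc:%s' % url,
    table='pyspark_user',
    mode='append',
    properties=properties
)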
