
How to set up a connection to Hive using PySpark and SparkSession (how do I add a username and password)?

I have been trying to access tables in Hive using PySpark, and after reading a few other posts, this is the way people recommend connecting to Hive. But it doesn't work. Then I realized I probably have to pass my username and password, but I can't figure out how to do that. So is there a way to pass the username and password when setting up SparkSession, or what else could be the problem?

from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Create a SparkSession with Hive support.
    # Note: in PySpark, `builder` is an attribute, not a method call.
    spark = (
        SparkSession.builder
        .appName("interfacing spark sql to hive metastore without configuration file")
        .config("hive.metastore.uris", "thrift://my_server:10000")
        .enableHiveSupport()
        .getOrCreate()
    )
    sc = spark.sparkContext

    # Build a small test DataFrame and save it as a Hive table
    df = sc.parallelize(
        [(1, 2, 3, 'a b c'), (4, 5, 6, 'd e f'), (7, 8, 9, 'g h i')]
    ).toDF(['col1', 'col2', 'col3', 'col4'])
    df.write.mode("overwrite").saveAsTable("test_spark")

Traceback

Exception in thread "main" org.apache.spark.SparkException: Application application_1575789516697_258641 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1168)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:780)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Spark connects to Hive directly. There is no need to pass a username and password; just pass hive-site.xml when submitting the Spark application.

Use the code below:

from pyspark.sql import SparkSession

sparkSession = SparkSession.builder.appName("ApplicationName").enableHiveSupport().getOrCreate()


When submitting your application, pass the hive-site.xml file as follows:

spark-submit --files /<location>/hive-site.xml --py-files <List_of_Pyfiles> <your_application>.py
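Before submitting, it can help to check that the hive-site.xml you pass actually names the metastore. The sketch below is only an illustration: the sample XML, the host `my_server`, and the helper `get_hive_property` are assumptions, not part of the answer. It uses port 9083 because that is the usual default for the Hive metastore, whereas 10000 (used in the question) is normally the HiveServer2 port; your cluster may differ.

```python
import xml.etree.ElementTree as ET

# Illustrative hive-site.xml content; the host name and port are placeholders.
SAMPLE_HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://my_server:9083</value>
  </property>
</configuration>
"""


def get_hive_property(xml_text, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


metastore_uri = get_hive_property(SAMPLE_HIVE_SITE, "hive.metastore.uris")
print(metastore_uri)
```

In practice you would read the real file's text (for example with `open(path).read()`) instead of the inline sample.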

Try adding the following to the config:

.config("spark.sql.warehouse.dir", your_warehouse_location)

Use this as a reference.
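For context, the `spark.sql.warehouse.dir` suggestion could be combined with the session builder roughly as follows. This is a minimal sketch, not a confirmed fix: the helper names `hive_session_options` and `build_session`, the warehouse path `/user/hive/warehouse`, and the metastore URI are all assumptions for illustration. PySpark is imported inside `build_session`, so the option-building part runs even where PySpark is not installed.

```python
def hive_session_options(warehouse_dir, metastore_uri=None):
    """Collect the Spark config entries for a Hive-enabled session."""
    options = {"spark.sql.warehouse.dir": warehouse_dir}
    if metastore_uri is not None:
        # Only needed when hive-site.xml is not supplied to the application.
        options["hive.metastore.uris"] = metastore_uri
    return options


def build_session(app_name, options):
    """Create (or reuse) a SparkSession with Hive support and the given options."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    builder = SparkSession.builder.appName(app_name)
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()


# Hypothetical values; replace them with your cluster's settings.
options = hive_session_options("/user/hive/warehouse", "thrift://my_server:9083")
print(options)
```

On a machine with PySpark and access to the cluster, `spark = build_session("ApplicationName", options)` would then return the session.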

