
Py4JJavaError: An error occurred while calling o253.load. : java.lang.ClassNotFoundException: Failed to find data source: bigquery

I'm trying to read data from BigQuery into a Jupyter notebook using the PySpark libraries. Apache Spark and Java have both been downloaded to my C: drive. I have read and watched tutorials, but none of them seem to work. Looking for guidance.

Code:

import pyspark
import findspark
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.functions import (window, col, year, month, aggregate, date_add,
                                   timestamp_seconds, rank, split)
from pyspark.sql.types import (StructField, StructType, StringType, BooleanType,
                               DoubleType, IntegerType, FloatType)
# import com.google.cloud.spark.bigquery
# This creates the Spark UI - check the current Spark session
spark = SparkSession.builder.master('local[*]').appName('conversions').enableHiveSupport().getOrCreate()
df = spark.read.format('bigquery').load('table')
df.show()

Error: Py4JJavaError: An error occurred while calling o253.load. : java.lang.ClassNotFoundException: Failed to find data source: bigquery. Please find packages at http://spark.apache.org/third-party-projects.html

Please change the SparkSession creation to

spark = SparkSession.builder \
  .master('local[*]') \
  .appName('conversions') \
  .enableHiveSupport() \
  .config('spark.jars.packages', 'com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.23.2') \
  .getOrCreate()
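The package string above is a Maven coordinate, and the `_2.12` suffix should match the Scala version your Spark build was compiled against. As a minimal sketch, the coordinate can be assembled from its parts so the two versions are easy to change. The helper names `connector_coordinate` and `build_session` are hypothetical, and the default versions are simply the ones quoted in this answer, not a recommendation.

```python
def connector_coordinate(scala_version: str = "2.12",
                         connector_version: str = "0.23.2") -> str:
    """Build the Maven coordinate passed to spark.jars.packages.

    Defaults are the versions quoted in the answer; adjust them to
    match your own Spark installation (assumption, not verified here).
    """
    return (
        "com.google.cloud.spark:"
        f"spark-bigquery-with-dependencies_{scala_version}:{connector_version}"
    )


def build_session(app_name: str = "conversions"):
    """Create a local SparkSession that requests the BigQuery connector."""
    # pyspark is imported lazily so connector_coordinate() works without it.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")
        .appName(app_name)
        .config("spark.jars.packages", connector_coordinate())
        .getOrCreate()
    )


print(connector_coordinate())
```

Calling `build_session()` in a fresh Python kernel should then let `spark.read.format('bigquery')` resolve, provided the machine can download the package on first use.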

Also, please make sure you are using a python notebook rather than a pyspark notebook - otherwise Jupyter will create the SparkSession for you and no additional packages can be added.
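One way to check whether the session you ended up with actually requested the connector is to inspect the configuration of its SparkContext. The sketch below assumes that a session created earlier by the notebook will not list the package there. `package_requested` is a hypothetical helper name, and it only reads the configuration; it does not prove the jars were downloaded.

```python
def package_requested(spark, coordinate: str) -> bool:
    """Return True if `coordinate` appears in the context's spark.jars.packages.

    `spark` is expected to be a SparkSession; only
    spark.sparkContext.getConf().get(key, default) is used.
    """
    configured = spark.sparkContext.getConf().get("spark.jars.packages", "") or ""
    # The setting is a comma-separated list of Maven coordinates.
    requested = [item.strip() for item in configured.split(",") if item.strip()]
    return coordinate in requested
```

In the notebook this could be called as `package_requested(spark, 'com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.23.2')`. If it returns `False`, restarting the kernel and creating the session with the `spark.jars.packages` setting before anything else is probably the simplest way forward.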

See more documentation in the connector's repo.


