[英]java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.parquet
I spent the whole day trying to solve the problem explained below. 我花了一整天时间试图解决下面解释的问题。 I would really appreciate a loooot if someone could point me my error.
如果有人能指出我的错误,我真的很感激loooot。
When I package my Spark Java project and run it with spark-submit, the error happens at this line of code: 当我打包我的Spark Java项目并使用spark-submit运行它时,错误发生在这行代码中:
DataFrame parquetFile = sqlContext.read().parquet("s3n://" +
aws_bucket_data + "/" +
aws_path);
When I runned the same program in Intellij, it worked fine (there are no connection issues with S3, abd the problem refers to DataFrame). 当我在Intellij中运行相同的程序时,它运行正常(S3没有连接问题,abd问题引用DataFrame)。 I used Intellij to package the project.
我使用Intellij打包项目。 I know that DataFrame was substituted by DataSet in Spark 2.0.0, but I am using Spark 1.6.2!!!
我知道DataFrame被Spark 2.0.0中的DataSet取代,但我使用的是Spark 1.6.2!
16/11/15 19:51:17 INFO SharedState: Warehouse path is 'file:/usr/local/spark-1.6.2-bin-hadoop2.6/spark-warehouse'.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.parquet([Ljava/lang/String;)Lorg/apache/spark/sql/DataFrame;
at org.test.Manager.run(Manager.java:55)
at org.test.Runner.main(Runner.java:24)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
This is my POM.xml file: 这是我的POM.xml文件:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.test</groupId>
<artifactId>runner</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
<spark.version>1.6.2</spark.version>
<jackson.version>2.8.3</jackson.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.10</artifactId>
<version>${jackson.version}</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>${jackson.version}</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>${jackson.version}</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>${jackson.version}</version>
</dependency>
<dependency>
<groupId>org.sedis</groupId>
<artifactId>sedis_2.10</artifactId>
<version>1.2.2</version>
</dependency>
<dependency>
<groupId>com.lambdaworks</groupId>
<artifactId>jacks_2.10</artifactId>
<version>2.3.3</version>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.11.53</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<executions>
<execution>
<id>build-a</id>
<configuration>
<archive>
<manifest>
<mainClass>org.test.Runner</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<finalName>runner</finalName>
</configuration>
<phase>package</phase>
<goals>
<goal>assembly</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
UPDATE: 更新:
I tried to compile with mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package -Xlint:deprecation
, and this is the detailed stacktrace: 我尝试使用
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package -Xlint:deprecation
进行编译,这是详细的mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package -Xlint:deprecation
:
[ERROR] No plugin found for prefix 'lint' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (/home/meeee/.m2/repository), central (https://repo.maven.apache.org/maven2)] -> [Help 1]
org.apache.maven.plugin.prefix.NoPluginFoundForPrefixException: No plugin found for prefix 'lint' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (/home/meeee/.m2/repository), central (https://repo.maven.apache.org/maven2)]
Spark SQL 1.6.2 does not include the missing method ( parquet(String path)
) Spark SQL 1.6.2不包含缺少的方法(
parquet(String path)
)
public Dataset<Row> parquet(String path)
Since: 2.0.0
从:2.0.0
Use version 2.0.0 or above 使用2.0.0或更高版本
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.