
Spark Type mismatch: cannot convert from DataFrame to Dataset&lt;Row&gt;

I am getting a strange error message:

Type mismatch: cannot convert from DataFrame to Dataset<Row>

when I try to implement the sample code from here.

This is the line that gives me the error:

    Dataset<Row> verDF = spark.createDataFrame(uList, User.class);

I also checked the Spark documentation here, which gives the same example, but I'm not sure why it doesn't work in my case.

Here are my imports:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.graphx.*;
import org.apache.spark.graphx.lib.*;
import org.apache.spark.rdd.RDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.storage.StorageLevel;
import org.graphframes.GraphFrame;

import scala.Tuple2;
import scala.collection.Iterator;
import scala.collection.immutable.Map;
import scala.collection.immutable.Seq;

Here are the relevant dependencies:

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
        <repository>
            <id>SparkPackagesRepo</id>
            <url>http://dl.bintray.com/spark-packages/maven</url>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>graphframes</groupId>
            <artifactId>graphframes</artifactId>
            <version>0.2.0-spark2.0-s_2.11</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.10 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.10</artifactId>
            <version>1.3.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-graphx_2.10 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-graphx_2.10</artifactId>
            <version>2.1.0</version>
        </dependency>
    </dependencies>

This solved the issue: I used the following dependencies and used a SparkSession instance to create the DataFrame.

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
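
For reference, here is a minimal, self-contained sketch of the Spark 2.x pattern described above: a SparkSession is used as the entry point, and createDataFrame on a list of JavaBeans returns Dataset&lt;Row&gt;, so the assignment from the question compiles. In Spark 2.x, DataFrame is just an alias for Dataset&lt;Row&gt;, which is why the Java API exposes createDataFrame with that return type. The User bean and the sample data below are illustrative placeholders, not the actual classes from the question.

    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class CreateDataFrameExample {

        // Illustrative JavaBean; the real User class in the question may differ.
        public static class User implements java.io.Serializable {
            private String name;
            private int age;

            public User() { }
            public User(String name, int age) { this.name = name; this.age = age; }

            public String getName() { return name; }
            public void setName(String name) { this.name = name; }
            public int getAge() { return age; }
            public void setAge(int age) { this.age = age; }
        }

        public static void main(String[] args) {
            // In Spark 2.x, SparkSession replaces SQLContext as the entry point.
            SparkSession spark = SparkSession.builder()
                    .appName("CreateDataFrameExample")
                    .master("local[*]")
                    .getOrCreate();

            List<User> uList = Arrays.asList(new User("alice", 30), new User("bob", 25));

            // With the Spark 2.x SQL API, createDataFrame returns Dataset<Row>,
            // so this assignment compiles against the dependencies above.
            Dataset<Row> verDF = spark.createDataFrame(uList, User.class);
            verDF.show();

            spark.stop();
        }
    }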
