
Writing a DataFrame to a Cassandra table in Java

I can't find exactly what I need here. There is plenty of code in Scala and Python. Here is what I have:

import org.apache.log4j.Logger;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class CassandraWriter {
    private transient Logger logger = Logger.getLogger(CassandraWriter.class);
    private Dataset<Row> hdfsDF;

    public CassandraWriter(Dataset<Row> dataFrame) {
        hdfsDF = dataFrame;
    }

    public void writeToCassandra(String tableName, String keyspace) {
        logger.info("Writing DataFrame to table: " + tableName);

        hdfsDF.write().format("org.apache.spark.sql.cassandra").mode("overwrite")
                .option("table",tableName)
                .option("keyspace",keyspace)
                .save();

        logger.info("Inserted DataFrame to Cassandra successfully");
    }
}

The error I get when running it is:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra. Please find packages at http://spark.apache.org/third-party-projects.html

Any ideas?

You need to make sure that the Spark Cassandra Connector is included in the jar that you're submitting.

One way to do this is to build a so-called fat jar (a jar that bundles its dependencies) and submit that. For example (the full pom is here):

...
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <scala.version>2.11.12</scala.version>
    <spark.version>2.4.4</spark.version>
    <spark.scala.version>2.11</spark.scala.version>
    <scc.version>2.4.1</scc.version>
    <java.version>1.8</java.version>
  </properties>

  <dependencies>
     <dependency>
       <groupId>com.datastax.spark</groupId>
       <artifactId>spark-cassandra-connector_${spark.scala.version}</artifactId>
       <version>${scc.version}</version>
     </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${spark.scala.version}</artifactId>
      <version>${spark.version}</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
...
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.2.0</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Alternatively, you can have spark-submit pull in the Spark Cassandra Connector as a package via --packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.2
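To confirm that the connector actually ended up on the classpath, a small check like the one below can be run inside the job before the write. This is only a sketch: the class name ConnectorClasspathCheck and the helper isOnClasspath are made up for illustration, and it assumes that Spark 2.x resolves format("org.apache.spark.sql.cassandra") by falling back to a class named org.apache.spark.sql.cassandra.DefaultSource, which the connector provides.

```java
// Hypothetical helper, not part of Spark or the connector.
final class ConnectorClasspathCheck {
    // Assumption: the class Spark 2.x tries to load for format("org.apache.spark.sql.cassandra").
    static final String CASSANDRA_SOURCE = "org.apache.spark.sql.cassandra.DefaultSource";

    /** Returns true if the named class can be located by the current class loader. */
    static boolean isOnClasspath(String className) {
        // Prefer the context class loader, since that is where jars added at submit time end up.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ConnectorClasspathCheck.class.getClassLoader();
        }
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(CASSANDRA_SOURCE)) {
            System.out.println("Spark Cassandra Connector found on the classpath");
        } else {
            System.out.println("Spark Cassandra Connector is missing: "
                    + "build a fat jar or pass it with --packages");
        }
    }
}
```

If the check reports the connector as missing, the write in writeToCassandra will most likely fail with the same ClassNotFoundException, so the packaging is the thing to fix rather than the Java code.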

