
How do I save a Java bean using the Spark Cassandra Connector?

I've read the Spark documentation, but I'm not sure how to save a Java bean to a table using the Spark Cassandra Connector.

public class NewImageMetadataRow implements Serializable {

    private final String merchant;
    private final String productId;
    private final String url;
    private final int width;
    private final int height;

    public NewImageMetadataRow(NewImageMetadataRow row) {
        this.merchant = row.getMerchant();
        this.productId = row.getProductId();
        this.url = row.getUrl();
        this.width = row.getWidth();
        this.height = row.getHeight();
    }

    public String getMerchant() {
        return merchant;
    }

    public String getProductId() {
        return productId;
    }

    public String getUrl() {
        return url;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }
}

I have an RDD of type RDD[NewImageMetadataRow] that I'm trying to save like this:

myRDD.saveToCassandra(keyspace, "imagemetadatav3", SomeColumns("merchant", "productid", "url"))

This results in the following error:

java.lang.IllegalArgumentException: requirement failed: Columns not found in com.mridang.image.NewImageMetadataRow: [merchant, productid, url]
    at scala.Predef$.require(Predef.scala:281)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper.columnMapForWriting(DefaultColumnMapper.scala:106)
    at com.datastax.spark.connector.mapper.MappedToGettableDataConverter$$anon$1.<init>(MappedToGettableDataConverter.scala:35)
    at com.datastax.spark.connector.mapper.MappedToGettableDataConverter$.apply(MappedToGettableDataConverter.scala:26)
    at com.datastax.spark.connector.writer.DefaultRowWriter.<init>(DefaultRowWriter.scala:16)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anon$1.rowWriter(DefaultRowWriter.scala:30)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anon$1.rowWriter(DefaultRowWriter.scala:28)
    at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:423)
    at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:35)

From my understanding (and my poor Scala foo), it seems that it is unable to infer the property names from the Java bean.

Another wrinkle is that the column names in my table are all lowercase, with spaces and hyphens removed, i.e. the Cassandra column corresponding to the getter getProductId is productid.
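This mismatch matters because the connector's default mapper derives column names from the bean's property names using a camelCase-to-underscore convention. The sketch below is not the connector's actual code, just a minimal illustration of that convention; under it, getProductId maps to a column named product_id, so a table column named productid is never matched.

```java
public class ColumnNameConvention {

    // Convert a JavaBean getter name to the column name the default
    // camelCase-to-underscore convention would expect.
    static String columnNameOf(String getterName) {
        // Strip the "get" prefix and lower-case the first letter:
        // getProductId -> productId
        String property =
            Character.toLowerCase(getterName.charAt(3)) + getterName.substring(4);
        // Insert an underscore before each upper-case letter, then lower-case:
        // productId -> product_id
        return property.replaceAll("([A-Z])", "_$1").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(columnNameOf("getProductId")); // product_id
        System.out.println(columnNameOf("getMerchant"));  // merchant
        System.out.println(columnNameOf("getUrl"));       // url
    }
}
```

Note that none of the derived names ("product_id", "merchant", "url") includes "productid", which matches the "Columns not found" error above.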

(If I were using Jackson, I could simply add @JsonProperty annotations. I'd like to know whether I can do the same with the Cassandra mapper.)

It took a while, but this is what it turned out to be:

val columns: RowWriterFactory[NewImageMetadataRow] =
  CassandraJavaUtil.mapToRow(classOf[NewImageMetadataRow])

myRDD.saveToCassandra(keyspace, "imagemetadatav3")(CassandraConnector(sc), columns)

The fields in the bean need to be public and annotated with the @CqlName annotation.

@CqlName("merchant")
public final String merchant;

