Mapping UUID in Spark Cassandra connector
I have the following code to save an RDD to Cassandra:
JavaRDD<UserByID> mapped = ......
CassandraJavaUtil.javaFunctions(mapped)
    .writerBuilder("mykeyspace", "user_by_id", mapToRow(UserByID.class))
    .saveToCassandra();
And UserByID is a normal serializable POJO with the following variable, along with its getters and setters:
private UUID userid;
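For completeness, the POJO described above could look like the following. This is a hedged sketch: only the userid field is shown in the post, so the no-arg constructor and accessor names are assumptions based on what the connector's reflection-based mapper typically requires.

```java
import java.io.Serializable;
import java.util.UUID;

// Hypothetical fleshed-out version of the POJO; only "userid" appears
// in the original post, the rest is assumed boilerplate.
public class UserByID implements Serializable {
    private UUID userid;

    // No-arg constructor, typically required by reflection-based row mappers
    public UserByID() {}

    public UUID getUserid() { return userid; }
    public void setUserid(UUID userid) { this.userid = userid; }
}
```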
The Cassandra table column names match the UserByID class variables exactly, and userid is of type uuid in the Cassandra table. I load data from the table successfully using the same class mapping:
CassandraJavaRDD<UserByID> UserByIDRDD = javaFunctions(spark)
.cassandraTable("mykeyspace", "user_by_id", mapRowTo(UserByID.class));
However, when I call the saveToCassandra function above, I get the following exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0 in stage 227.0 failed 1 times, most recent failure: Lost task 0.0
in stage 227.0 (TID 12721, localhost, executor driver):
java.lang.IllegalArgumentException:
The value (4e22e71a-a387-4de8-baf1-0ef6e65fe33e) of the type
(java.util.UUID) cannot be converted to
struct<leastSignificantBits:bigint,mostSignificantBits:bigint>
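As far as I can tell, the struct named in the message corresponds to java.util.UUID's two internal long fields, which the conversion has apparently decomposed instead of treating the UUID as a single value. A small self-contained illustration (outside Spark) of those two fields:

```java
import java.util.UUID;

public class UuidBits {
    public static void main(String[] args) {
        // The UUID value from the exception message
        UUID u = UUID.fromString("4e22e71a-a387-4de8-baf1-0ef6e65fe33e");

        // These two longs are exactly the fields named in
        // struct<leastSignificantBits:bigint,mostSignificantBits:bigint>
        long msb = u.getMostSignificantBits();
        long lsb = u.getLeastSignificantBits();

        // Reassembling the two longs yields the original UUID again
        System.out.println(new UUID(msb, lsb)); // prints 4e22e71a-a387-4de8-baf1-0ef6e65fe33e
    }
}
```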
To solve the problem I registered a UUID codec, but that didn't help. I am using spark-cassandra-connector_2.11 version 2.4.0 and the same version of spark-core_2.11. Any suggestions? My reference is here, but it has no Java UUID example. Your help is appreciated.
That's a really strange error - this works fine with connector 2.4.0 & Spark 2.2.1 with the following example:
Table definition:
CREATE TABLE test.utest (
id int PRIMARY KEY,
u uuid
);
POJO class:
public class UUIDData {
private UUID u;
private int id;
...
// getters/setters
}
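The elided getters/setters presumably also include the two-argument constructor used by the map() call in main() below (new UUIDData(id, u)), plus a no-arg constructor for mapRowTo's reflection. A hedged sketch of the full class:

```java
import java.io.Serializable;
import java.util.UUID;

// Assumed full version of UUIDData; the original answer elides the
// constructors and accessors with "// getters/setters".
public class UUIDData implements Serializable {
    private UUID u;
    private int id;

    // No-arg constructor for the reflection-based row mapper
    public UUIDData() {}

    // Matches the new UUIDData(x.getId() + 10, x.getU()) call in main()
    public UUIDData(int id, UUID u) {
        this.id = id;
        this.u = u;
    }

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public UUID getU() { return u; }
    public void setU(UUID u) { this.u = u; }
}
```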
public static void main(String[] args) {
SparkSession spark = SparkSession
.builder()
.appName("UUIDTest")
.getOrCreate();
CassandraJavaRDD<UUIDData> uuids = javaFunctions(spark.sparkContext())
.cassandraTable("test", "utest", mapRowTo(UUIDData.class));
JavaRDD<UUIDData> uuids2 = uuids.map(x -> new UUIDData(x.getId() + 10, x.getU()));
CassandraJavaUtil.javaFunctions(uuids2)
.writerBuilder("test", "utest", mapToRow(UUIDData.class))
.saveToCassandra();
}
I've noticed that in your code you're using the functions mapRowTo and mapToRow without calling .class on the POJO - are you sure that your code compiled and that you aren't running an old version of it?