Spark Cassandra connector with Java for read
Requirement: I have persisted data in Cassandra, and on an hourly basis I need to calculate a score based on updates happening to the records. The data looks correct when I call the show() method on the dataset.
Below is the code that reads the data:
Dataset<DealFeedSchema> dealFeedSchemaDataset = session.read()
.format(Constants.SPARK_CASSANDRA_SOURCE_PATH)
.option(Constants.KEY_SPACE, Constants.CASSANDRA_KEY_SPACE)
.option(Constants.TABLE, Constants.CASSANDRA_DEAL_TABLE_SPACE)
.option(Constants.DATE_FORMAT, "yyyy-MM-dd HH:mm:ss")
.schema(DealFeedSchema.getDealFeedSchema())
.load()
.as(Encoders.bean(DealFeedSchema.class));
dealFeedSchemaDataset.show();
The output of show() is below:
+-------+----------+-------------+--------------------+-----------+------------+----------+------------------------+---------------+-----------+-------------------+-------------------+----------+------------+-------------------+----------------+-------------------+-------------+----------+--------------------+----------+-------------------------+---------------+----------------+---------------+--------------+--------------+-----+
|deal_id| deal_name|deal_category| deal_tags|growth_tags|deal_tag_ids|deal_price|deal_discount_percentage|deal_group_size|deal_active| deal_start_time| deal_expiry|product_id|product_name|product_description|product_category|product_category_id|product_price|hero_image| product_images| video_url|video_thumbnail_image_url|deal_like_count|deal_share_count|deal_view_count|deal_buy_count|weighted_score|boost|
+-------+----------+-------------+--------------------+-----------+------------+----------+------------------------+---------------+-----------+-------------------+-------------------+----------+------------+-------------------+----------------+-------------------+-------------+----------+--------------------+----------+-------------------------+---------------+----------------+---------------+--------------+--------------+-----+
| 4|7h12349961| mqw|[under999, under3...| []| []| 4969.0| null| 95166551| 1|2020-07-08 14:48:57|2020-07-18 14:48:57|4725457233| kao62ggnm7| 32h64e356z| jnnh29zr1f| null| 6651.0|86kk7s34yr|[dSt4P79, i4WXOHb...|d6tag27924| 4j1l36lp17| null| null| null| null| null| null|
Here's the weird thing: when I use map/foreach on dealFeedSchemaDataset, the data is no longer correct. The column value of deal_start_time comes back as the current system time, something like the value below, and I am not sure how it gets changed.
Even the following line shows the same issue:
dealFeedSchemaDataset.select(
functions.col("deal_start_time")).as(Encoders.bean(DateTime.class))
.collectAsList().forEach(schema -> System.out.println(schema));
2020-07-10T20:21:47.895+05:30
Can someone help me with what I am doing wrong?
Map the timestamp column in your bean to one of the java.sql types instead of DateTime:

java.sql.Timestamp
Use this for formats that contain a time part.

java.sql.Date
Use this for date-only values.
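A minimal sketch of that change, assuming the bean maps deal_start_time to a field named dealStartTime (the real DealFeedSchema has many more fields): declare the field as java.sql.Timestamp so Spark's bean encoder can deserialize the Cassandra timestamp column.

```java
import java.sql.Timestamp;

// Sketch only: the field name dealStartTime is an assumption for illustration.
// The key point is the type java.sql.Timestamp, not a DateTime class.
public class DealFeedSchema {
    private Timestamp dealStartTime;

    public Timestamp getDealStartTime() {
        return dealStartTime;
    }

    public void setDealStartTime(Timestamp dealStartTime) {
        this.dealStartTime = dealStartTime;
    }

    public static void main(String[] args) {
        DealFeedSchema row = new DealFeedSchema();
        // java.sql.Timestamp round-trips the "yyyy-MM-dd HH:mm:ss" format
        // shown in the show() output above
        row.setDealStartTime(Timestamp.valueOf("2020-07-08 14:48:57"));
        System.out.println(row.getDealStartTime());
    }
}
```

With the field typed as java.sql.Timestamp, Encoders.bean(DealFeedSchema.class) can map the Spark TimestampType column directly, so collectAsList()/map/foreach see the stored value rather than a bogus current-time conversion.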