When I try to fetch data from Amazon Keyspaces with PySpark, I get an "Unsupported partitioner: com.amazonaws.cassandra.DefaultPartitioner" error
I'm not experienced in Java or the Hadoop ecosystem. I configured my Spark cluster to connect to Amazon Keyspaces using the spark-cassandra-connector from DataStax. I'm using PySpark to fetch data from Cassandra. I can successfully connect to the Keyspaces/Cassandra cluster, but the failure appears when I try to fetch data from it:
df = spark.sql("SELECT * FROM cass.tutorialkeyspace.tutorialtable")
print("Table Row Count:")
print(df.count())
I get this error:
Unsupported partitioner: com.amazonaws.cassandra.DefaultPartitioner
Yes, the keyspace and table exist and contain data. How can I fix or work around this? Thanks!
As an FYI, Keyspaces now supports the RandomPartitioner, which enables reading and writing data in Apache Spark with the open-source Spark Cassandra Connector.
Docs: https://docs.aws.amazon.com/keyspaces/latest/devguide/spark-integrating.html
Launch announcement: https://aws.amazon.com/about-aws/whats-new/2022/04/amazon-keyspaces-read-write-data-apache-spark/
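For anyone wiring this up from PySpark, here is a minimal sketch of the connector settings involved. The helper names (`keyspaces_spark_conf`, `build_session`) are made up for illustration, and the region, user name and password are placeholders. The `spark.cassandra.*` keys and the `CassandraCatalog` class come from the Spark Cassandra Connector; the catalog name `cass` matches the query in the question. TLS trust store setup and the partitioner selection described in the linked docs are not covered here.

```python
def keyspaces_spark_conf(region, username, password):
    """Hypothetical helper: collect connector settings for an Amazon Keyspaces endpoint."""
    return {
        # Registers the catalog that the question's query refers to as "cass".
        "spark.sql.catalog.cass": "com.datastax.spark.connector.datasource.CassandraCatalog",
        # Keyspaces service endpoint for the region; it listens on port 9142 with TLS.
        "spark.cassandra.connection.host": f"cassandra.{region}.amazonaws.com",
        "spark.cassandra.connection.port": "9142",
        "spark.cassandra.connection.ssl.enabled": "true",
        # Placeholder credentials (assumption: username/password authentication).
        "spark.cassandra.auth.username": username,
        "spark.cassandra.auth.password": password,
    }


def build_session(conf):
    """Hypothetical helper: apply the settings to a SparkSession builder."""
    # Imported here so the settings helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("keyspaces-read")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example usage (requires a Spark cluster with the connector on the classpath):
# spark = build_session(keyspaces_spark_conf("us-east-1", "my-user", "my-password"))
# df = spark.sql("SELECT * FROM cass.tutorialkeyspace.tutorialtable")
# print(df.count())
```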
The Spark Cassandra Connector relies on a specific partitioner implementation to define data splits, etc. There is no workaround for this problem right now, until somebody adds an implementation of the corresponding TokenFactory into this code. It shouldn't be very complex; it just needs to be done by someone who is interested in it.
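To make the dependency on the partitioner concrete, here is a simplified Python illustration. It is not the connector's actual code, and the function names are invented. The idea is that the reader must know the token bounds of the cluster's partitioner before it can cut the token ring into splits; for a partitioner name it does not recognise, it has no bounds to work with and fails. The bounds listed are those of the two standard Cassandra partitioners as I understand them.

```python
# Token bounds per partitioner (assumption: standard Cassandra values).
TOKEN_BOUNDS = {
    "org.apache.cassandra.dht.Murmur3Partitioner": (-(2**63), 2**63 - 1),
    "org.apache.cassandra.dht.RandomPartitioner": (-1, 2**127),
}


def token_bounds(partitioner):
    """Return (min_token, max_token) for a known partitioner, else raise."""
    try:
        return TOKEN_BOUNDS[partitioner]
    except KeyError:
        raise ValueError(f"Unsupported partitioner: {partitioner}") from None


def split_ring(partitioner, num_splits):
    """Cut the token ring into num_splits contiguous (start, end] ranges."""
    low, high = token_bounds(partitioner)
    width = high - low
    # Integer arithmetic keeps the edges exact even for 127-bit tokens.
    edges = [low + width * i // num_splits for i in range(num_splits + 1)]
    return list(zip(edges[:-1], edges[1:]))


# Example usage:
# split_ring("org.apache.cassandra.dht.RandomPartitioner", 4)
# token_bounds("com.amazonaws.cassandra.DefaultPartitioner")  # raises ValueError
```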
Thank you for the feedback. At this time, you can write to Keyspaces using the Spark Cassandra Connector. Reading requires support for token ranges. Please see the following doc page for the list of supported APIs: https://docs.aws.amazon.com/keyspaces/latest/devguide/cassandra-apis.html
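For context on why reads depend on token ranges: a parallel table scan is typically broken into many queries that each cover one slice of the token ring, using the CQL `token()` function on the partition key. The sketch below only builds such a query string. The function name and the `email` partition key column are hypothetical, since the question does not show the table schema.

```python
def token_range_query(keyspace, table, partition_keys):
    """Build a CQL scan limited to one token range, with bind markers for the bounds."""
    pk = ", ".join(partition_keys)
    return (
        f"SELECT * FROM {keyspace}.{table} "
        f"WHERE token({pk}) > ? AND token({pk}) <= ?"
    )


# Example usage ("email" is a hypothetical partition key column):
# token_range_query("tutorialkeyspace", "tutorialtable", ["email"])
```

Each Spark task would bind a different (start, end] pair to the two markers, which is why a store that rejects `token()` range predicates can accept writes but not this style of read.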
Although we don't have timelines to share at the moment, we prioritize our roadmap based on customer feedback. We are releasing new features all the time. To learn more about our roadmap and upcoming features, please contact your AWS account manager.