
Insert a big list into Cassandra using Python

I have a problem inserting a big list into Cassandra using Python. I have a list of 3200 strings that I want to save in Cassandra:

CREATE TABLE IF NOT EXISTS my_test (
    id bigint PRIMARY KEY,
    list_strings list<text>
);

When I reduce the size of my list, I have no problem. It works.

prepared_statement = session.prepare("INSERT INTO my_test (id, list_strings) VALUES (?, ?)")
session.execute(prepared_statement, [id, strings[:5]])

But if I keep the whole list, I get an error:

Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 1 failures" info={'required_responses': 1, 'consistency': 'LOCAL_ONE', 'received_responses': 0, 'failures': 1}

How can I insert a big list into Cassandra?

A database collection type such as list<text> is not supposed to hold that amount of data. Storing each string in its own row of the table would be better:

    id     |    time    | strings
-----------+------------+---------
  bigint   | timestamp  | string
 partition | clustering |
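
As a rough illustration of the layout above, here is a minimal sketch that writes one row per string with a prepared statement. The table name my_test_rows, the helper names build_rows and insert_strings, and the idea of spacing the timestamps one millisecond apart (so rows in the same partition do not overwrite each other) are assumptions made for the example, not part of the original answer. The session argument is expected to be a connected cassandra-driver Session.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table following the layout above:
# id is the partition key, time is the clustering column.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS my_test_rows (
    id bigint,
    time timestamp,
    strings text,
    PRIMARY KEY (id, time)
)
"""

INSERT_ROW = "INSERT INTO my_test_rows (id, time, strings) VALUES (?, ?, ?)"


def build_rows(row_id, strings, start=None):
    """Return one (id, time, string) tuple per string.

    Assumption of this sketch: each row gets start + i milliseconds as its
    clustering value, so every row in the partition has a distinct key.
    """
    if start is None:
        start = datetime.now(timezone.utc)
    return [(row_id, start + timedelta(milliseconds=i), s)
            for i, s in enumerate(strings)]


def insert_strings(session, row_id, strings, start=None):
    """Insert every string as its own row and return the number of rows."""
    session.execute(CREATE_TABLE)
    prepared = session.prepare(INSERT_ROW)
    rows = build_rows(row_id, strings, start)
    for row in rows:
        # Each write carries a single small value instead of one huge list.
        session.execute(prepared, row)
    return len(rows)
```

With an open session this could be called as insert_strings(session, id, strings). Each write stays small, which is the point of moving away from a single 3200-element list value.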

Using id as the clustering key would be a bad solution: requesting all the tweets for a user id would then require reads on multiple nodes, whereas with id as the partition key it only requires reading from one node per user.
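
To make the read side concrete, here is a small sketch of fetching everything stored under one id. It assumes a hypothetical table my_test_rows with id as the partition key, time as the clustering column and a text column named strings; the helper name load_strings is also made up for the example. Because the query restricts on the partition key, it touches a single partition, and rows come back in clustering order.

```python
# Hypothetical table and column names, matching the row-per-string layout.
SELECT_BY_ID = "SELECT strings FROM my_test_rows WHERE id = ?"


def load_strings(session, row_id):
    """Return all strings stored under one partition key, in clustering order."""
    prepared = session.prepare(SELECT_BY_ID)
    # The driver returns rows whose columns are accessible by name.
    return [row.strings for row in session.execute(prepared, [row_id])]
```

Called as load_strings(session, id), this rebuilds the original Python list from the individual rows.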
