Inserting a big list into Cassandra using Python

Question

I have a problem inserting a big list into Cassandra using Python. I have a list of 3200 strings that I want to save in Cassandra:

CREATE TABLE IF NOT EXISTS my_test (
    id bigint PRIMARY KEY,
    list_strings list<text>
);

When I shorten the list, I have no problem. It works:

prepared_statement = session.prepare("INSERT INTO my_test (id, list_strings) VALUES (?, ?)")
session.execute(prepared_statement, [id, strings[:5]])

But if I keep the whole list, I get an error:

Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 1 failures" info={'required_responses': 1, 'consistency': 'LOCAL_ONE', 'received_responses': 0, 'failures': 1}

How can I insert a big list into Cassandra?

Answer 1

A DB array type is not supposed to hold that amount of data. Storing each string in its own row of the table would be better:

    id     |    time    | strings
-----------+------------+---------
  bigint   | timestamp  | string
 partition | clustering |

Using id as the clustering key would be a bad solution: requesting all the tweets for a user id would then require reads on multiple nodes, whereas with id as the partition key it requires a read on only one node per user.
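As a rough illustration of the one-row-per-string layout, here is a minimal sketch. The table name my_test_rows, the column name string, and the helpers build_rows and insert_strings are assumptions made for this example and are not from the original post. It also assumes each row needs a distinct time value, since rows with the same partition key and clustering key would overwrite each other; the sketch spaces them one millisecond apart. The session argument is expected to be a cassandra-driver Session, of which only prepare and execute are used.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table following the layout above (names are assumptions).
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS my_test_rows (
    id bigint,
    time timestamp,
    string text,
    PRIMARY KEY (id, time)
)
"""

INSERT_ROW = "INSERT INTO my_test_rows (id, time, string) VALUES (?, ?, ?)"


def build_rows(row_id, strings, base_time):
    """Return one (id, time, string) tuple per string.

    Each row gets a distinct millisecond timestamp so that rows sharing
    the same partition key do not overwrite one another.
    """
    return [
        (row_id, base_time + timedelta(milliseconds=i), s)
        for i, s in enumerate(strings)
    ]


def insert_strings(session, row_id, strings, base_time=None):
    """Insert each string as its own row using one prepared statement."""
    if base_time is None:
        base_time = datetime.now(timezone.utc)
    prepared = session.prepare(INSERT_ROW)
    rows = build_rows(row_id, strings, base_time)
    for row in rows:
        session.execute(prepared, row)
    return len(rows)
```

With a connected session (for example one obtained from cassandra.cluster.Cluster(...).connect(...)), one would run session.execute(CREATE_TABLE) once and then call insert_strings(session, id, strings). All 3200 strings land in the same partition, so reading them back for a given id should touch a single partition.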

Answer 1: score 3, accepted, 2017-03-13 15:29:01
