
Proper way to insert iterative data into Cassandra using Python

Let's say I have a Cassandra table defined like this:

CREATE TABLE IF NOT EXISTS {} (
            user_id bigint ,
            username text,
            age int,
            PRIMARY KEY (user_id)
        );

I have three lists of the same size, let's say 1,000,000 records in each. Is it good practice to insert the data using a for loop like this:

for index, user_id in enumerate(user_ids):
    query = "INSERT INTO TABLE (user_id, username, age) VALUES ({0}, '{1}', {2});".format(user_id, username[index], age[index])
    session.execute(query)

It's probably a good idea to start by looking at the Python driver's getting started guide. If you have already seen it, apologies, but I thought it worth mentioning.

Generally speaking, you'd create your session object once and then do your inserts inside your loop, probably using a prepared statement (covered further down the getting started page).

The getting started page uses this example, which is a good starting point:

user_lookup_stmt = session.prepare("SELECT * FROM users WHERE user_id=?")

users = []
for user_id in user_ids_to_query:
    user = session.execute(user_lookup_stmt, [user_id])
    users.append(user)
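The lookup example above can be adapted to the INSERT from the question. This is only a minimal sketch: the function name insert_users and the table name users are assumptions for illustration (the question leaves the table name as a placeholder), and session is expected to be an already connected driver session. The statement is prepared once, outside the loop, and only the bound values change per row.

```python
def insert_users(session, user_ids, usernames, ages):
    """Insert one row per (user_id, username, age) triple; return the row count.

    `session` is assumed to be a connected cassandra-driver Session.
    The table name `users` is a hypothetical stand-in for the real table.
    """
    # Prepare once: the statement is parsed a single time and reused per row.
    insert_stmt = session.prepare(
        "INSERT INTO users (user_id, username, age) VALUES (?, ?, ?)"
    )
    count = 0
    for row in zip(user_ids, usernames, ages):
        # Values are bound by the driver, so no manual quoting or formatting.
        session.execute(insert_stmt, row)
        count += 1
    return count
```

This still sends one request at a time, so for a million rows it will probably be slow, but it avoids building query strings by hand.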

There is also a blog post about getting better throughput with the Python driver that you may find helpful.

You might find the Python driver's GitHub page a useful resource; in particular, it includes an example using a prepared statement that might help you too.

Prepared statements with concurrent execution will be your best bet. The driver provides utility functions (in the cassandra.concurrent module) for concurrent execution of statements with sequences of parameters, just as you have with your lists: execute_concurrent_with_args.

Zipping your lists together will produce a sequence of parameter tuples suitable for input to that function.
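To make the shape of that sequence concrete, here is a tiny illustration with made-up sample values (the three short lists are hypothetical stand-ins for the real million-element lists). Each tuple lines up positionally with the three ? placeholders of the prepared statement.

```python
# Hypothetical sample data standing in for the three real lists.
user_ids = [1, 2, 3]
username = ["alice", "bob", "carol"]
age = [30, 25, 41]

# In Python 3, zip() is lazy; list() is used here only to inspect the result.
params = list(zip(user_ids, username, age))

for row in params:
    print(row)  # one (user_id, username, age) tuple per record
```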

Something like this:

from cassandra.concurrent import execute_concurrent_with_args

prepared = session.prepare("INSERT INTO table (user_id, username, age) VALUES (?, ?, ?)")
execute_concurrent_with_args(session, prepared, zip(user_ids, username, age))
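With a million rows you may also want to feed the parameters in batches and collect failures instead of stopping at the first one. The following is a rough sketch under some assumptions: the helper names chunked and insert_concurrently, the chunk size of 10,000 and the concurrency of 50 are all illustrative choices, not recommendations from the driver. It relies on the concurrency and raise_on_first_error keyword arguments, and on results coming back in the same order as the parameters.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_concurrently(session, prepared, rows, chunk_size=10_000, concurrency=50):
    """Run `prepared` for every parameter tuple in `rows`; return the failures.

    `session` and `prepared` are assumed to come from cassandra-driver.
    """
    # Imported here so that chunked() stays usable without the driver installed.
    from cassandra.concurrent import execute_concurrent_with_args

    failures = []
    for chunk in chunked(rows, chunk_size):
        results = execute_concurrent_with_args(
            session, prepared, chunk,
            concurrency=concurrency,
            raise_on_first_error=False,  # keep going, report errors afterwards
        )
        # Each result is a (success, result_or_exception) pair, in input order.
        for params, (success, result_or_exc) in zip(chunk, results):
            if not success:
                failures.append((params, result_or_exc))
    return failures
```

It would be called as insert_concurrently(session, prepared, zip(user_ids, username, age)); any rows in the returned list can then be retried or logged.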
