
Using BULK INSERT via Python

I am having trouble splitting the values of a bulk insert. The goal is to read the entire CSV file and issue one INSERT for every 10 values, rather than a single INSERT for the whole file.

The code below already reads the entire CSV file and inserts everything in a single statement, but I cannot work out how to split the VALUES into batches so that, in the future, each INSERT can carry 10 thousand values at a time.

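import csv

# MySqlHook comes from Apache Airflow; its import path depends on the Airflow version

def bulk_insert(table_name, **kwargs):

    mysqlConnection = MySqlHook(mysql_conn_id='id_db')
    a = mysqlConnection.get_conn()
    c = a.cursor()

    with open('/pasta/arquivo.csv') as f:
        reader = csv.reader(f, delimiter='\t')

        sql = """INSERT INTO user (id,user_name) VALUES"""

        # Append every row of the file to one VALUES list
        for row in reader:
            sql += "(" + row[0] + " , '" + row[1] + "'),"
        # Drop the trailing comma and run the whole file as one INSERT
        c.execute(sql[:-1])

    a.commit()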

Something like this ought to work. The batch_csv function is a generator that yields a list of up to size rows on each iteration.

The bulk_insert function is amended to use parameter substitution and the cursor's executemany method. Parameter substitution is safer than manually constructing SQL, because the driver escapes the values instead of pasting them into the statement text.

cursor.executemany may batch the inserts into a single statement, as the original function does, though this is implementation-dependent and should be tested.
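If the driver turns out not to batch, one option is to build a multi-row INSERT by hand while still using placeholders. The following is a minimal sketch, not part of the original answer: build_multirow_insert is a hypothetical helper, and it assumes the table and column names come from trusted code, since only the values are passed as parameters.

```python
def build_multirow_insert(table, columns, rows, placeholder="%s"):
    """Return (sql, params) for one INSERT covering every row in `rows`."""
    if not rows:
        raise ValueError("rows must not be empty")
    # One "(%s, %s)" group per row; values never enter the SQL text itself.
    group = "(" + ", ".join([placeholder] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([group] * len(rows))
    )
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in rows for value in row[:len(columns)]]
    return sql, params


sql, params = build_multirow_insert(
    "user", ["id", "user_name"], [["1", "ana"], ["2", "O'Brien"]]
)
print(sql)     # INSERT INTO user (id, user_name) VALUES (%s, %s), (%s, %s)
print(params)  # ['1', 'ana', '2', "O'Brien"]
```

The resulting pair would then be passed to a single c.execute(sql, params) call per batch.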

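def batch_csv(size=10):
    with open('/pasta/arquivo.csv') as f:
        reader = csv.reader(f, delimiter='\t')
        batch = []
        for row in reader:
            batch.append(row)
            # Compare the batch length, not the row length
            if len(batch) == size:
                yield batch
                # Start a new list so an already-yielded batch is not emptied
                batch = []
        # Yield the final partial batch, if any
        if batch:
            yield batch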
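
To check the batching logic without touching the real file or database, the same idea can be tried on in-memory data. This is a sketch under assumptions: batched_rows is a hypothetical variant of batch_csv that accepts any iterable of rows, and the five sample rows are made up for illustration.

```python
import csv
import io
from itertools import islice


def batched_rows(reader, size):
    """Yield lists of at most `size` rows until `reader` is exhausted."""
    iterator = iter(reader)
    while True:
        # islice pulls up to `size` items; an empty list means no rows remain.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical sample data: 5 tab-separated rows standing in for the CSV file.
sample = io.StringIO("1\tana\n2\tbruno\n3\tcarla\n4\tdavi\n5\teva\n")
batches = list(batched_rows(csv.reader(sample, delimiter="\t"), size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```

Each batch is a fresh list, so collecting the batches with list() is safe, and no empty batch is produced when the row count is an exact multiple of size.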


def bulk_insert(table_name, **kwargs):

    mysqlConnection = MySqlHook(mysql_conn_id='id_db')
    a = mysqlConnection.get_conn()
    c = a.cursor()
    # %s placeholders are filled in by the driver, which escapes the values
    sql = """INSERT INTO user (id,user_name) VALUES (%s, %s)"""
    batcher = batch_csv()
    for batch in batcher:
        # Only the first two columns of each row are inserted
        c.executemany(sql, [row[0:2] for row in batch])

    a.commit()
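
The batched executemany pattern can be tried end to end without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in, which is an assumption made purely for testing: sqlite3 uses ? placeholders where MySQL drivers use %s, and the sample rows are invented. It also commits after every batch, which is a design choice rather than something the answer above requires.

```python
import sqlite3

# Hypothetical sample rows; one name contains a quote to exercise escaping.
rows = [(1, "ana"), (2, "O'Brien"), (3, "carla"), (4, "davi"), (5, "eva")]
batch_size = 2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, user_name TEXT)")

# sqlite3 uses "?" placeholders; MySQL drivers use "%s" instead.
sql = "INSERT INTO user (id, user_name) VALUES (?, ?)"
for start in range(0, len(rows), batch_size):
    batch = rows[start:start + batch_size]
    conn.executemany(sql, batch)
    conn.commit()  # committing per batch keeps each transaction small

count = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
name = conn.execute("SELECT user_name FROM user WHERE id = 2").fetchone()[0]
print(count, name)  # 5 O'Brien
```

The name containing a quote is stored intact, whereas the string-concatenated version in the question would produce invalid SQL for that row.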
