Inserting data into Redshift using Python

Question

I'm trying to insert multiple rows into an Amazon Redshift database. The rows are held in a list of tuples that looks like this:

    my_rows = [
        (1, 0.0, 0, 0.0, 2010188534, 1816780086, 1113834, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.0, 1, 0.0, 2010188536, 1816780086, 1119396, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.0, 2, 0.0, 2010188538, 1816780086, 1119398, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.0, 3, 0.0, 2010188540, 1816780086, 1123612, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.5, 0, 0.0, 2010188542, 1816780102, 1086852, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.5, 1, 0.0, 2010188544, 1816780102, 1087014, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.3, 2, 0.0, 2010188546, 1816780102, 1089224, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.3, 3, 0.0, 2010188548, 1816780102, 1089348, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
        (1, 0.3, 4, 0.0, 2010188550, 1816780102, 1122564, '2018-03-07 09:40:17', '2018-03-07 09:40:17', '2018-03-07 09:40:17'),
    ]

Some columns may contain None.

I'm inserting them row by row into the Redshift database this way:

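    # con is an already-open database connection; table holds the target table name
    cur = con.cursor()
    columns_names = ("c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10")
    # One %s placeholder per column: "(%s,%s,...,%s)"
    values_references = "(" + ",".join(["%s"] * len(my_rows[0])) + ")"
    # Join the names into "(c1,c2,...)"; a tuple cannot be concatenated to a str
    insert_query = "INSERT INTO " + table + " (" + ",".join(columns_names) + ") VALUES " + values_references + ";"
    for row in my_rows:
        # One statement, and one round trip, per row
        cur.execute(insert_query, row)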

The problem is that when I run this code, it blocks on the first row without raising any error. So my questions are: Is it normal for inserting a single row to take this long? If not, is there an error in my code? Is there a more efficient way to do this?

Can I get some help, please? Thank you in advance.

Answer 1

The process you should follow:

1. Write your data in CSV format to an S3 folder, ideally gzipped.
2. Run a Redshift COPY command to import that data into a temporary table in Redshift.
3. Run Redshift SQL to insert that data from the temporary table into your table.

This runs fast, is the recommended way to load data into Redshift, and scales to much larger batches than row-by-row INSERT statements.
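
As a rough illustration of those three steps, here is a minimal sketch rather than a definitive implementation. The function names, the staging table name staging_rows, and the bucket, key, and iam_role parameters are all hypothetical. It assumes boto3 is installed for the upload, that con is an open DB-API connection to Redshift, and that the IAM role is allowed to read the bucket. Table names are interpolated directly into the SQL, so they must come from trusted code, not user input.

```python
import csv
import gzip
import io


def rows_to_gzipped_csv(rows):
    """Serialize an iterable of tuples to gzip-compressed CSV bytes."""
    text = io.StringIO()
    # csv.writer writes None as an empty field
    csv.writer(text, lineterminator="\n").writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def build_load_statements(table, staging, s3_uri, iam_role):
    """Return the SQL statements for the COPY-then-INSERT load, in order."""
    return [
        # Temporary staging table with the same columns as the target
        f"CREATE TEMP TABLE {staging} (LIKE {table});",
        # Bulk-load the gzipped CSV file from S3 into the staging table
        f"COPY {staging} FROM '{s3_uri}' IAM_ROLE '{iam_role}' CSV GZIP;",
        # Move the staged rows into the real table
        f"INSERT INTO {table} SELECT * FROM {staging};",
    ]


def load_rows(con, rows, table, bucket, key, iam_role):
    """Upload rows to S3, then load them into Redshift and commit."""
    import boto3  # third-party; imported here so the helpers above work without it

    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=rows_to_gzipped_csv(rows)
    )
    cur = con.cursor()
    s3_uri = f"s3://{bucket}/{key}"
    for statement in build_load_statements(table, "staging_rows", s3_uri, iam_role):
        cur.execute(statement)
    con.commit()
```

With the list from the question, a call would look something like load_rows(con, my_rows, table, "my-bucket", "loads/my_rows.csv.gz", role_arn), where the bucket, key, and role ARN are placeholders. None values end up as empty CSV fields; as far as I know COPY loads those as NULL for numeric and timestamp columns, while CHAR and VARCHAR columns would need the EMPTYASNULL option added to the COPY statement.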

Solution 1
0 2018-03-30 09:49:24
