Unable to copy data into AWS Redshift
I have tried a lot, but I am unable to copy data, available as a JSON file in an S3 bucket (I have read-only access to the bucket), into a Redshift table using Python boto3. Below is the Python code I am using to copy the data. Using the same code, I was able to create the tables I am trying to copy into.
import configparser
import psycopg2
from sql_queries import create_table_queries, drop_table_queries


def drop_tables(cur, conn):
    for query in drop_table_queries:
        cur.execute(query)
        conn.commit()


def create_tables(cur, conn):
    for query in create_table_queries:
        cur.execute(query)
        conn.commit()


def main():
    try:
        config = configparser.ConfigParser()
        config.read('dwh.cfg')

        # conn = psycopg2.connect("host={} dbname={} user={} password={} port={}".format(*config['CLUSTER'].values()))
        conn = psycopg2.connect(
            host=config.get('CLUSTER', 'HOST'),
            database=config.get('CLUSTER', 'DB_NAME'),
            user=config.get('CLUSTER', 'DB_USER'),
            password=config.get('CLUSTER', 'DB_PASSWORD'),
            port=config.get('CLUSTER', 'DB_PORT')
        )
        cur = conn.cursor()

        #drop_tables(cur, conn)
        #create_tables(cur, conn)

        qry = """copy DWH_STAGE_SONGS_TBL
                 from 's3://udacity-dend/song-data/A/A/A/TRAAACN128F9355673.json'
                 iam_role 'arn:aws:iam::xxxxxxx:role/MyRedShiftRole'
                 format as json 'auto';"""
        print(qry)
        cur.execute(qry)

        # execute a statement
        # print('PostgreSQL database version:')
        # cur.execute('SELECT version()')
        #
        # # display the PostgreSQL database server version
        # db_version = cur.fetchone()
        # print(db_version)

        print("Executed successfully")
        cur.close()
        conn.close()
        # close the communication with the PostgreSQL
    except Exception as error:
        print("Error while processing")
        print(error)


if __name__ == "__main__":
    main()
I don't see any error in the PyCharm console, but I see an Aborted status in the Redshift query console. I don't see any reason why it was aborted (or I don't know where to look for one).

Another thing I noticed is that when I run the COPY statement in the Redshift query editor, it runs fine and the data is loaded into the table. I tried deleting and recreating the cluster, but no luck. I can't figure out what I am doing wrong.

Thank you
Quick read: it looks like you haven't committed the transaction, so the COPY is rolled back when the connection closes. You need to either set the connection to autocommit or add an explicit commit().
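This would also explain why the table creation worked from the same script: create_tables() calls conn.commit() after each statement, while the COPY path in main() never does. A minimal sketch of the explicit-commit variant follows. The helper name run_copy is hypothetical (it is not part of the question's code), and it assumes conn is an open DB-API connection such as the one returned by psycopg2.connect().

```python
def run_copy(conn, copy_sql):
    """Run a COPY statement on an open DB-API connection and commit it.

    Hypothetical helper for illustration; `conn` is assumed to be a
    connection such as the one psycopg2.connect() returns.
    """
    cur = conn.cursor()
    try:
        cur.execute(copy_sql)
        # Without this commit, closing the connection discards the COPY.
        conn.commit()
    except Exception:
        # Leave the connection in a clean state, then surface the error.
        conn.rollback()
        raise
    finally:
        cur.close()
```

In the question's main(), this would be called as run_copy(conn, qry) in place of cur.execute(qry). The alternative is to set conn.autocommit = True right after connecting, in which case each statement is committed as soon as it runs and no explicit commit() is needed.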