
AWS Data Pipeline to copy CSV from S3 to RDS MySQL

I have a directory in my S3 bucket that contains many .CSV files, all formatted the same way (First, Last, Location, Date).

I have been trying to use Data Pipeline to populate an RDS MySQL database table with the contents of these CSV files. Fortunately, Amazon already provides a template for this action.

"Load S3 data into RDS MySQL table" http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-copys3tords.html

I have filled out all of the information that the template requests.

When I activate the pipeline, it creates a CopyActivity and a ShellCommandActivity. The CopyActivity copies the data, and the ShellCommandActivity creates the table if it does not already exist. The ShellCommandActivity successfully connects to my RDS instance.

However, my issue is that the ShellCommandActivity switches to "FINISHED" status without actually creating the table, and then the CopyActivity gets stuck at "WAITING_ON_DEPENDENCIES". This whole process takes around 20 minutes.

All of my roles have full access to all of the services.

If anyone has any insight, please comment. I have been stuck on this issue for nearly 2 weeks now.

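I tried using an SQLActivity that selects all of the data from the table that is supposed to be created; this new SQLActivity would be a dependency of the CopyActivity.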
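
For illustration, here is a rough sketch of what that dependency could look like in the pipeline definition, built as a Python dict and dumped to JSON. The object types (SqlActivity, CopyActivity) and the field names (database, script, runsOn, input, output, dependsOn) follow the Data Pipeline definition syntax as far as I know, but every id and ref below (CheckTableActivity, DataLoadActivity, rds_mysql, Ec2Instance, S3InputDataLocation, DestinationRDSTable) and the table name are placeholders I made up, not necessarily what the template generates. The fragment also leaves out the database, resource and data node objects that the refs point to.

```python
import json


def build_objects(table_name):
    """Return a SqlActivity plus a CopyActivity that depends on it.

    All ids and refs are hypothetical placeholders; replace them with
    the ids used in your own pipeline definition.
    """
    check_table = {
        "id": "CheckTableActivity",
        "name": "CheckTableActivity",
        "type": "SqlActivity",
        "database": {"ref": "rds_mysql"},    # assumed id of the database object
        "runsOn": {"ref": "Ec2Instance"},    # assumed id of the EC2 resource
        # Fails if the table was never created, so the problem shows up
        # on this activity instead of as a stuck copy.
        "script": f"SELECT * FROM {table_name};",
    }
    copy_data = {
        "id": "DataLoadActivity",
        "name": "DataLoadActivity",
        "type": "CopyActivity",
        "input": {"ref": "S3InputDataLocation"},    # assumed S3 data node id
        "output": {"ref": "DestinationRDSTable"},   # assumed SQL data node id
        "runsOn": {"ref": "Ec2Instance"},
        # The copy only starts after the SqlActivity has finished.
        "dependsOn": {"ref": check_table["id"]},
    }
    return [check_table, copy_data]


definition = {"objects": build_objects("my_table")}
print(json.dumps(definition, indent=2))
```

If the SqlActivity fails, that at least tells you the table really is missing, which narrows the problem down to the ShellCommandActivity rather than the copy step.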
