Moving a TSV file from the local file system to S3 in Luigi
The following program does not output anything, nor does it throw any errors. Am I missing something in the form of a run() method in the to_S3 class?
class to_S3(luigi.Task):
    # The class Mysql_to_tsv converts the data returned by a query on a MySQL database and stores it as a TSV in a local file.
    def requires(self):
        return [Mysql_to_tsv]

    def output(self):
        return luigi.S3Target("https://s3.amazonaws.com/bucket-name/luigi_attempt.tsv")
The output() method of the Mysql_to_tsv class is:
    def output(self):
        return luigi.LocalTarget('/Users/user/Desktop/Work/Luigi/test_data.tsv')
Please help with the correct implementation of the task class.
What I originally wanted was to put some data into an S3 bucket.
So, a task does not need an output() method in order to run (for example, to dump data into an S3 bucket). The work can be done directly in the run() method, and output() can be used to check for a flag or for the existence of the result.
So, the correct implementation would be:
import luigi
# S3Connection and Key come from boto (version 2)
from boto.s3.connection import S3Connection
from boto.s3.key import Key

class to_S3(luigi.Task):
    def requires(self):
        return [Mysql_to_tsv()]

    def run(self):
        # Create a connection
        access_key = ""
        access_secret = ""
        conn = S3Connection(access_key, access_secret)
        # Connect to the bucket
        bucket_name = ""
        bucket = conn.get_bucket(bucket_name)
        # Set up the key and upload the local file
        k = Key(bucket)
        k.key = "sample1"
        k.set_contents_from_filename("../test_data.tsv")
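The remark above that output() can be used to check for a flag can be illustrated with a small model that uses only the standard library. This is a sketch, not Luigi code: FlagTask, build and the to_s3.done file name are hypothetical names made up for the illustration, and the upload itself is replaced by a counter. The one rule borrowed from Luigi is that a task counts as complete when its output exists.

```python
import os
import tempfile


class FlagTask:
    """Hypothetical stand-in for a Luigi-style task; not Luigi's real class."""

    def __init__(self, flag_path):
        self.flag_path = flag_path
        self.uploads = 0  # counts how often the (pretend) upload ran

    def output(self):
        # Path whose existence marks the task as done.
        return self.flag_path

    def complete(self):
        # Same rule Luigi applies by default: done if the output exists.
        return os.path.exists(self.output())

    def run(self):
        # The real work (e.g. the S3 upload) would go here.
        self.uploads += 1
        # Write the flag only after the work has succeeded.
        with open(self.output(), "w") as flag:
            flag.write("done\n")


def build(task):
    """Run the task only if it is not already complete."""
    if task.complete():
        return False
    task.run()
    return True


workdir = tempfile.mkdtemp()
task = FlagTask(os.path.join(workdir, "to_s3.done"))
first = build(task)   # flag missing, so run() executes
second = build(task)  # flag present, so run() is skipped
print(first, second, task.uploads)
```

In a real Luigi task the same idea would mean returning a target for the flag from output() and writing it at the end of run(), so that a second invocation does not upload the file again.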
Yes, all non-external Luigi tasks need a run() method.