Moving a tsv file from local file system to S3 in luigi

The following program does not output anything, nor does it throw any errors. Am I missing something, such as a run() method in the to_S3 class?

class to_S3(luigi.Task):

    #The class Mysql_to_tsv converts the data returned by a query on a Mysqldb and stores the data in a tsv in a local file.

    def requires(self):
        return [Mysql_to_tsv]

    def output(self):
        return luigi.S3Target("https://s3.amazonaws.com/bucket-name/luigi_attempt.tsv")

The output() method of the Mysql_to_tsv() class is:

def output(self):
    return luigi.LocalTarget('/Users/user/Desktop/Work/Luigi/test_data.tsv')

Please help with the correct class implementation of the task.

What I originally wanted was to put some data into an S3 bucket.

So, one does not need an output() method to run a particular task (e.g. dumping data to an S3 bucket).

The work can be done directly in the run() method, and output() can be used to check for a flag or for the existence of the result.

So, the correct implementation would be:

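import luigi
from boto.s3.connection import S3Connection
from boto.s3.key import Key

class to_S3(luigi.Task):

    def requires(self):
        # requires() must return task instances, not classes
        return [Mysql_to_tsv()]


    def run(self):

        # Create a connection (fill in your own credentials)
        access_key = ""
        access_secret = ""
        conn = S3Connection(access_key, access_secret)

        # Connect to the bucket
        bucket_name = ""
        bucket = conn.get_bucket(bucket_name)

        # Upload the local tsv file under the key "sample1"
        k = Key(bucket)
        k.key = "sample1"
        k.set_contents_from_filename("../test_data.tsv")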
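
The point above about using output() to check for a flag can be illustrated without luigi or S3 credentials. The following is only a plain-Python sketch of that contract, not luigi itself: SketchTask, CopyWithFlag, build(), demo() and the "_UPLOADED" flag name are all hypothetical names made up for this example, and a local directory stands in for the bucket.

```python
import shutil
import tempfile
from pathlib import Path


class SketchTask:
    """Stand-in for a luigi-style task (NOT luigi itself): run() does the
    work, output() names a path whose existence marks the task as done."""

    def output(self) -> Path:
        raise NotImplementedError

    def run(self) -> None:
        raise NotImplementedError

    def complete(self) -> bool:
        # The task counts as complete once its output exists.
        return self.output().exists()


class CopyWithFlag(SketchTask):
    """Copies a file into a destination directory (standing in for the
    S3 bucket) and then writes a flag file."""

    def __init__(self, source: Path, dest_dir: Path):
        self.source = source
        self.dest_dir = dest_dir

    def output(self) -> Path:
        # Hypothetical flag name; only its existence matters.
        return self.dest_dir / "_UPLOADED"

    def run(self) -> None:
        self.dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(self.source, self.dest_dir / self.source.name)
        # Write the flag last so it only appears after a successful copy.
        self.output().write_text("done\n")


def build(task: SketchTask) -> bool:
    """Run the task only if its output is missing; return True if it ran."""
    if task.complete():
        return False
    task.run()
    return True


def demo() -> tuple:
    """Build the same task twice; the second build should be skipped."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "test_data.tsv"
        source.write_text("id\tname\n1\tfoo\n")
        task = CopyWithFlag(source, Path(tmp) / "bucket")
        first = build(task)
        second = build(task)
        copied = (Path(tmp) / "bucket" / "test_data.tsv").read_text()
        return first, second, copied


if __name__ == "__main__":
    print(demo())
```

In the sketch the flag is written only after the copy succeeds, so an interrupted run leaves the task incomplete and it is retried on the next build. A real luigi task would return a Target from output() instead of a bare path.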

Yes, all non-external luigi tasks need a run() method.
