
Python: how to pass an optional filename parameter when running a script that processes a file

I have a python script that processes an XML file each day (it is transferred via SFTP to a remote directory, then temporarily copied to a local directory) and stores its information in a MySQL database.

One of my parameters for the file is set to "date=today" so that the correct file is processed each day. This works fine and each day I successfully store new file information into the database.

What I need help with is passing a Linux command-line argument to process the file for a specific day (in case a previous day's file needs to be rerun). I can manually edit my code to make this work, but that will not be an option once the project is in production.

In addition, I need to be able to pass in a command-line argument for "date=*" and have the script process every file in my remote directory. Currently, this parameter successfully processes only a single file: the first match in alphabetical order.

If my two questions should be asked separately, my mistake, and I'll edit this question to just cover one of them. Example of my code below:

    today = datetime.datetime.now().strftime('%Y%m%d')

    file_var = local_file_path + connect_to_sftp.sftp_get_file(
                                               local_file_path=local_file_path,
                                               sftp_host=sftp_host,
                                               sftp_username=sftp_username,
                                               sftp_directory=sftp_directory,
                                               date=today)

    ET = xml.etree.ElementTree.parse(file_var).getroot()

    def parse_file():
        for node in ET.findall(.......)

In another module:

    def sftp_get_file(local_file_path, sftp_host, sftp_username, sftp_directory, date):

        pysftp.Connection(sftp_host, sftp_username)

        # find file in remote directory with given suffix
        remote_file = glob.glob(sftp_directory + '/' + date + '_file_suffix.xml')

        # strip directory name from full file name
        file_name_only = remote_file[0][len(sftp_directory):]

        # set local path to hold new file
        local_path = local_file_path

        # combine local path with filename that was loaded
        local_file = local_path + file_name_only

        # pull file from remote directory and send to local directory
        shutil.copyfile(remote_file[0], local_file)

        return file_name_only

So the SFTP module reads the file, transfers it to the local directory, and returns the file name to be used in the parsing module. The parsing module passes in the parameters and does the rest of the work.

What I need to be able to do, on certain occasions, is override the parameter that says "date=today" and instead say "date=20151225", for example, but I must do this through a Linux command line argument.

In addition, if I currently enter the parameter of "date=*" it only runs the script for the first file that matches that parameter. I need the script to run for ALL files that match that parameter. Any help is much appreciated. Happy to answer any questions to improve clarity.

You can use the sys module and pass the date (which selects the file) as a command-line argument, for example:

    import datetime
    import sys

    # Use the first command-line argument as the date if given; otherwise default to today.
    today = sys.argv[1] if len(sys.argv) > 1 else datetime.datetime.now().strftime('%Y%m%d')

If a value is given as the first argument, the today variable takes that value from the command line; if no argument is given, it falls back to the current date, as in your original code.

For the second question, look at this line:

    file_name_only = remote_file[0][len(sftp_directory):]

You are only accessing the first element, but glob may return several files when you use the * wildcard. You must iterate over the remote_file list and copy every match.
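A minimal sketch of that loop, assuming (as the question's use of glob.glob and shutil.copyfile implies) that the remote directory is reachable as a local path. The function name copy_matching_files is hypothetical; the _file_suffix.xml pattern is taken from the question.

```python
import glob
import os
import shutil


def copy_matching_files(sftp_directory, local_file_path, date):
    """Copy every file matching the date pattern and return the local paths."""
    # date may be a concrete value such as '20151225' or a wildcard such as '*'
    pattern = os.path.join(sftp_directory, date + '_file_suffix.xml')
    copied = []
    # sorted() keeps the processing order deterministic
    for remote_file in sorted(glob.glob(pattern)):
        local_file = os.path.join(local_file_path, os.path.basename(remote_file))
        shutil.copyfile(remote_file, local_file)
        copied.append(local_file)
    return copied
```

The caller would then parse each returned path in turn rather than a single file_var. With a concrete date the list simply contains one entry, or none if the file is missing.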

You can use argparse to consume command-line arguments. Check whether a specific date was passed and use it instead of the current date:

    if args.date_to_run:
        today = args.date_to_run
    else:
        today = datetime.datetime.now().strftime('%Y%m%d')
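For completeness, here is one way the args object could be built. This is a sketch under assumptions: the --date flag name and the helper names parse_args and resolve_date are illustrative, and only date_to_run comes from the snippet above.

```python
import argparse
import datetime


def parse_args(argv=None):
    """Parse the command line; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description='Process the XML file(s) for a date.')
    parser.add_argument(
        '--date',
        dest='date_to_run',  # exposed as args.date_to_run
        default=None,
        help="date as YYYYMMDD, or '*' for every file (default: today)",
    )
    return parser.parse_args(argv)


def resolve_date(args):
    """Return the date given on the command line, or today's date if none was given."""
    if args.date_to_run:
        return args.date_to_run
    return datetime.datetime.now().strftime('%Y%m%d')
```

Assuming the script is saved as process_file.py (a placeholder name), it could be run as `python process_file.py --date 20151225`, or with no arguments to process today's file. When passing the wildcard, quote it (`--date '*'`) so the shell does not expand it before Python sees it.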

For the second part of your question, you can use something like the fnmatch module (https://docs.python.org/2/library/fnmatch.html) to match multiple file names against a pattern.
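A short sketch of the fnmatch approach, assuming you already have a list of file names from a directory listing (for example os.listdir, or the listing call of whatever SFTP client you use). The helper name select_files is hypothetical; the _file_suffix.xml pattern comes from the question.

```python
import fnmatch


def select_files(file_names, date):
    """Return the names that match the date pattern, in sorted order."""
    # date may be a concrete value such as '20151225' or a wildcard such as '*'
    pattern = date + '_file_suffix.xml'
    return sorted(fnmatch.filter(file_names, pattern))
```

Unlike glob, fnmatch works on plain strings and never touches the filesystem, so it can filter names that came from a remote listing.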
