How can I take a variable number of files as input for a Python script?

For example, I want to be able to run this hypothetical command:

$ python script.py *.txt option1 option2

and have it execute on every file that matches *.txt.

So far I have only found information on operating on one file at a time:

from sys import argv

self, file, option1, option2 = argv

perform_operation(file, option1, option2)

#function definition

You want to use the argparse module:

import argparse

parser = argparse.ArgumentParser()

parser.add_argument("--option1")
parser.add_argument("--option2")
parser.add_argument("files", nargs="+")

opts = parser.parse_args()

print(opts.option1)
print(opts.option2)
print(opts.files)

Use it like this:

 beer:~ deets$ python2.7 /tmp/argparse-test.py  text foo bar baz
 None
 None
 ['text', 'foo', 'bar', 'baz']
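
To see how the optional flags and the file list interact, here is a minimal sketch that feeds an explicit argument list to the same parser instead of reading the real command line. The helper name build_parser and the sample values ("fast", a.txt, b.txt) are made up for illustration.

```python
import argparse


def build_parser():
    """Build a parser that accepts one or more files plus two optional flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--option1")
    parser.add_argument("--option2")
    parser.add_argument("files", nargs="+")  # one or more positional arguments
    return parser


# Passing a list to parse_args() stands in for sys.argv[1:].
opts = build_parser().parse_args(["--option1", "fast", "a.txt", "b.txt"])
print(opts.option1)
print(opts.option2)
print(opts.files)
```

Any flag that is not given stays None, and everything that is not a flag ends up in opts.files.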

argv is a list. Let's assume that you are only going to pass filename arguments. If it's more complicated, then go with deets' answer.

import sys

self = sys.argv[0]
arguments = sys.argv[1:]

Now, arguments is a list of program arguments. Let's say we want to process them one at a time:

for argument in arguments:
    work(argument)

Or we want to pass all of them to a function:

work(arguments)
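
If you really want the exact call shape from the question (files first, then two trailing options), list slicing is enough. This is only a sketch: split_arguments is a hypothetical helper, and it assumes the last two arguments are always the options.

```python
def split_arguments(argv):
    """Split argv into (files, option1, option2).

    Assumes the last two arguments are the options and everything
    between the script name and those options is a file name.
    """
    arguments = argv[1:]  # drop the script name
    if len(arguments) < 3:
        raise ValueError("expected at least one file followed by two options")
    files = arguments[:-2]
    option1, option2 = arguments[-2:]
    return files, option1, option2


# Simulated command line: python script.py a.txt b.txt c.txt option1 option2
# In a real script you would pass sys.argv instead of this list.
files, option1, option2 = split_arguments(
    ["script.py", "a.txt", "b.txt", "c.txt", "option1", "option2"]
)
for name in files:
    print(name, option1, option2)
```

Positional options like these are easy to mix up with file names, which is one reason the argparse approach with named flags is usually safer.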

As for passing *.txt as an argument: your shell (before your program even runs) will do most of the work for you.

If I run python program.py *.txt, where *.txt matches three text files, my shell expands the pattern so that my program sees python program.py a.txt b.txt c.txt.
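
Not every shell does this expansion. Windows cmd.exe, for instance, passes *.txt through unchanged, and bash does the same when nothing matches. A possible fallback, sketched here with a hypothetical helper named expand_patterns, is to expand the patterns yourself with the glob module from the standard library.

```python
import glob
import os
import tempfile


def expand_patterns(arguments):
    """Expand wildcard patterns ourselves, for shells that pass them through."""
    files = []
    for argument in arguments:
        matches = sorted(glob.glob(argument))
        # Keep the argument unchanged when nothing matches, so the caller
        # can still report a missing file by name.
        files.extend(matches if matches else [argument])
    return files


# Demonstration in a throwaway directory with two matching files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()
    expanded = expand_patterns([os.path.join(tmp, "*.txt"), "missing.log"])
    names = [os.path.basename(path) for path in expanded]

print(names)
```

Arguments that the shell has already expanded are plain file names, so running them through glob.glob again simply returns them as they are.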

multifile.py

"""
Usage:
    multifile.py <file>...
    multifile.py -h

Prints something about all the <file>... files.
"""

def main(files):
    for fname in files:
        print(fname)

if __name__ == "__main__":
    from docopt import docopt
    args = docopt(__doc__)
    files = args["<file>"]
    main(files)

Use it

Install docopt first:

$ pip install docopt

Call the command without arguments:

$ python multifile.py
Usage:
    multifile.py <file>...
    multifile.py -h

Try the help option:

$ python multifile.py -h
Usage:
    multifile.py <file>...
    multifile.py -h

Prints something about all the <file>... files.

Use it for one file:

$ python multifile.py alfa.py 
alfa.py

Use it for multiple files, using wildcards:

$ python multifile.py ../*.py

    ../camera2xml.py
    ../cgi.py
    ../classs.py

Conclusions

  • docopt allows many more options (see docopt)
  • command line parsing can be easy in Python
    • argparse has been part of the standard library since Python 2.7
    • argparse can do a lot, but requires rather complex calls spread over many lines
    • plac is a nice alternative that can serve quickly in most cases
    • docopt seems to me to be the most flexible and at the same time the shortest in required lines of code

Using fileinput from stdlib

There is one library in stdlib, which is often overlooked, called fileinput.

By default it treats every command-line argument as a filename (and reads from stdin when no arguments are given). It lets you iterate not only over those files but also over all the lines in them, modify them in place, decompress them, and do many other practical things.

filenames.py - list all the filenames

import fileinput

for line in fileinput.input():
    print("File name is:", fileinput.filename())
    fileinput.nextfile()

Call it:

$ python filenames.py *.txt
File name is: films.txt
File name is: highscores.txt
File name is: Logging.txt
File name is: outtext.txt
File name is: text.txt

upperlines.py - print all lines from multiple files in uppercase

import fileinput

for line in fileinput.input():
    print(line.upper(), end="")

and call it:

$ python upperlines.py *.txt
THE SHAWSHANK REDEMPTION (1994)
THE GODFATHER (1972)
THE GODFATHER: PART II (1974)
THE DARK KNIGHT (2008)
PULP FICTION (1994)
JAN HAS SCORE OF 101
PIETER HAS SCORE OF 900
CYRIL HAS SCORE OF 2
2014 APR 11  07:14:03.155  SECTORBLAH
   INTERESTINGCONTENT
   INTERESTING1 = 843
1. LUV_DEV <- HE'S A DEVELOPER
2. AMIT_DEV <- HE'S A DEVELOPER
....

upperlinesinplace.py - turn all lines in files into uppercase

import fileinput

for line in fileinput.input(inplace=True):
    print(line.upper(), end="")

Conclusions

  • fileinput uses sys.argv[1:] as its default list of files and iterates over all files and lines
  • you can pass your own list of filenames to process
  • fileinput allows in-place changes, filtering, reading file names, line numbers...
  • fileinput even allows processing compressed files
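
As a rough illustration of passing your own list of filenames and reading file names and line numbers, the sketch below uses the files= argument together with filename() and filelineno(). The helper numbered_lines and the temporary sample files are invented for the demonstration, and the code assumes Python 3.

```python
import fileinput
import os
import tempfile


def numbered_lines(filenames):
    """Yield (file name, line number within that file, line text) per line."""
    with fileinput.input(files=filenames) as stream:
        for line in stream:
            name = os.path.basename(stream.filename())
            yield name, stream.filelineno(), line.rstrip("\n")


# Demonstration with two small throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)
    result = list(numbered_lines(paths))

for row in result:
    print(row)
```

For compressed input, fileinput.input() also accepts openhook=fileinput.hook_compressed, which opens .gz and .bz2 files transparently.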
