
python - run script on multiple files

I have a Python script that takes a filename as a command-line argument and processes that file. However, I have thousands of files to process, and I would like to run the script on every file without having to pass each filename by hand.

The script works well when run on an individual file like this:

myscript.py /my/folder/of/stuff/text1.txt

I have this code to process them all at once, but it doesn't work:

for fname in glob.iglob(os.path.join('folder/location')):
    proc = subprocess.Popen([sys.executable, 'script/location.py', fname])
    proc.wait()

Whenever I run the above code, it doesn't throw an error, but it doesn't give me the intended output either. I think the problem is that the script expects the path to a .txt file as an argument, and the code only gives it the folder that the files are in (or at least not a working absolute path).

How can I correct this problem?

If the files are in the same folder and the script accepts several filenames as arguments, you could use this syntax:

myscript.py /my/folder/of/stuff/*.txt

The wildcard will be replaced by the matching files (on Unix-like shells, the shell performs this expansion before the script starts).

If the script doesn't support it, isolate the processing in a function and loop over the arguments, as in this quick example:

import sys

def printFileName(filename):
  print(filename)

def main():
  args = sys.argv[1:]
  for filename in args:
    printFileName(filename)

if __name__ == '__main__':
  main()

Then, from the console, you can start it like this:

python MyScript.py /home/andy/tmp/1/*.txt /home/andy/tmp/2/*.html

This will print the paths of the .txt files in the first folder and the .html files in the second.

Hope this helps.
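
For reference, the loop in the question most likely fails because glob.iglob is given a folder path with no wildcard, so it yields at most the folder itself rather than the files inside it. Below is a minimal sketch of a corrected driver, assuming the files sit directly in one folder. The names run_on_all, folder, script and pattern are illustrative and do not come from the original post.

```python
import glob
import os
import subprocess
import sys


def run_on_all(folder, script, pattern='*.txt'):
    """Run `script` once per file in `folder` that matches `pattern`.

    Returns a list of (path, exit_code) pairs in sorted path order.
    """
    results = []
    # The wildcard in the pattern is what makes glob return the files
    # instead of the folder itself.
    for fname in sorted(glob.glob(os.path.join(folder, pattern))):
        # Use the same interpreter as this driver; one child process per file.
        code = subprocess.call([sys.executable, script, fname])
        results.append((fname, code))
    return results
```

With the paths from the question, the call would look like run_on_all('/my/folder/of/stuff', 'myscript.py'). Collecting the exit codes makes it easier to see which files the script failed on.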

You can write another script to do this. This is just a workaround; try using os.walk:

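import os
import subprocess
import sys

for root, dirs, files in os.walk(PATH):  # PATH: top-level folder to scan
    for file in files:
        # os.path.join is portable; the argument list avoids shell quoting problems
        subprocess.call([sys.executable, 'myscript.py', os.path.join(root, file)])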

Pass the path of the top-level folder to os.walk as PATH; it visits every file in that folder and its subfolders.

If you want to process only specific files, for example only files with a .cpp extension, you can filter the file names like this. Add this check after the for file in files line and indent the call to the script beneath it:

if file.endswith('.cpp'):
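
Put together, the walk and the extension check could look like the sketch below. It only collects the matching paths, so the filter can be checked before any script is launched. The function name find_files and its parameters are hypothetical and not part of the original answer.

```python
import os


def find_files(top, extension):
    """Return sorted paths under `top` (searched recursively) ending with `extension`."""
    matches = []
    for root, dirs, files in os.walk(top):
        for file in files:
            # Keep only the file names with the wanted extension.
            if file.endswith(extension):
                matches.append(os.path.join(root, file))
    return sorted(matches)
```

Each path returned by find_files(PATH, '.cpp') can then be passed to the script, one call per file.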


 