
Subprocess argument list too long

I have a third-party executable that I call using subprocess.check_output. Unfortunately, my argument list is too long, and calling the executable repeatedly is much slower than calling it once with many arguments.

Slow due to making the command call many times:

import subprocess

def call_third_party_slow(third_party_path, files):
    # One process per file: correct, but slow for many files.
    for file in files:
        output = subprocess.check_output([third_party_path, "-z", file])
        if "sought" in output.decode():
            return False
    return True

Fast, but fails when there are many files:

def call_third_party_fast(third_party_path, files):
    # One process for all files: fast, but fails once the command line exceeds the OS limit.
    command = [third_party_path, "-z"]
    command.extend(files)
    output = subprocess.check_output(command)
    if "sought" in output.decode():
        return False
    return True

Is there an easy way to work around the command length limit, or to group the files so that each call stays under the OS-dependent limit?

You could batch the files list like this:

def batch_args(args, arg_max):
    current_arg_length = 0
    current_list = []
    for arg in args:
        # Start a new batch when adding this argument would exceed the limit.
        # The current_list check avoids yielding an empty first batch.
        if current_list and current_arg_length + len(arg) + 1 > arg_max:
            yield current_list
            current_list = []
            current_arg_length = 0
        current_list.append(arg)
        current_arg_length += len(arg) + 1
    if current_list:
        yield current_list

So the method body would look like this:

os_limit = 10  # placeholder value; replace with the real limit for your OS
for args in batch_args(files, os_limit):
    command = [third_party_path, "-z"]
    command.extend(args)
    output = subprocess.check_output(command)
    if "sought" in output.decode():
        return False
return True

Two things I'm not sure about:

  1. Does the path to the exe itself count towards the limit? If yes, add its length to each batch (or decrease arg_max by the length of the exe string).
  2. Does the space between args count? If not, remove both +1 occurrences.

Adjust arg_max to what is possible. There is probably some way of finding this out per OS. Here is some info about the max args size of some OSs. That site also states there is a 32k limit for Windows.
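
On POSIX systems the limit can be queried at runtime rather than hard-coded. What follows is a minimal sketch, not a definitive implementation: it assumes os.sysconf("SC_ARG_MAX") is available (it is on Linux and macOS), the helper name get_arg_max is made up for illustration, and the Windows fallback simply reuses the 32k figure mentioned above. The headroom parameter is also an assumption; it leaves space for the environment, which counts against the same limit on Linux.

```python
import os

WINDOWS_FALLBACK = 32767  # assumption: the 32k Windows limit mentioned above


def get_arg_max(headroom=2048):
    """Return a conservative size budget for one command line."""
    try:
        limit = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows; the name may be unknown elsewhere.
        limit = -1
    if limit <= 0:
        limit = WINDOWS_FALLBACK
    # Never return a non-positive budget, even with a large headroom.
    return max(limit - headroom, 1)
```

The result could then be passed as arg_max to batch_args.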

Maybe there is a better way to do it using the subprocess library, but I'm not sure.

Also, I'm not doing any exception handling (an arg in the list that is longer than the max size, etc.).
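
The missing check could be run before batching. A small sketch, in which find_oversized_args and require_args_fit are hypothetical helper names and one extra character per argument is assumed for the separator, as in batch_args:

```python
def find_oversized_args(args, arg_max):
    """Return the arguments that can never fit into a batch of size arg_max."""
    # +1 mirrors the separator accounting used in batch_args.
    return [arg for arg in args if len(arg) + 1 > arg_max]


def require_args_fit(args, arg_max):
    """Raise ValueError early instead of failing inside the subprocess call."""
    oversized = find_oversized_args(args, arg_max)
    if oversized:
        raise ValueError(
            "%d argument(s) exceed the limit of %d: %r"
            % (len(oversized), arg_max, oversized[:3])
        )
```

Calling require_args_fit(files, os_limit) before the loop turns an obscure OS error into a message that names the offending arguments.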

I solved this by using a temporary file on Windows. On Linux the command could be executed as is.

Method to build the full command for the different platforms:

import os
import platform
import tempfile

temporary_file = None
def make_full_command(base_command, files):
    command = list(base_command)

    if platform.system() == "Windows":
        global temporary_file
        temporary_file = tempfile.NamedTemporaryFile()
        posix_files = map((lambda f: f.replace(os.sep, '/')), files)
        temporary_file.write(str.encode(" ".join(posix_files)))
        temporary_file.flush()
        # The executable reads its file arguments from "@<path>".
        command.append("@" + temporary_file.name)
    else:
        command.extend(files)
    return command

Keeping the file in a global variable keeps it alive while the command runs; it is deleted once the file object is closed or garbage-collected.

This way I didn't have to find the max command length for different OSes.
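
Cleanup can also be made explicit instead of relying on a global. The following is only a sketch: response_file is a hypothetical helper, and it assumes, as the answer above does, that the third-party tool accepts an "@file" argument containing space-separated paths. It uses delete=False so the file can be closed before the tool reads it, and removes it afterwards.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def response_file(files):
    """Write the file names to a temporary file and yield its path."""
    handle = tempfile.NamedTemporaryFile(mode="w", suffix=".rsp", delete=False)
    try:
        with handle:
            # Forward slashes, as in make_full_command above.
            handle.write(" ".join(f.replace(os.sep, "/") for f in files))
        yield handle.name
    finally:
        # Runs whether or not the caller's block raised.
        os.remove(handle.name)
```

Usage would look like `with response_file(files) as path:` followed by `subprocess.check_output(list(base_command) + ["@" + path])` inside the block.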

If you don't want to reinvent an optimal solution, use a tool which already implements exactly this: xargs.

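import subprocess

def call_third_party_slow(third_party_path, files):
    # xargs reads NUL-separated file names from stdin and splits them
    # into as many invocations as the OS limit requires.
    result = subprocess.run(['xargs', '-r', '-0', third_party_path, '-z'],
        input='\0'.join(files) + '\0', stdout=subprocess.PIPE,
        check=True, universal_newlines=True)
    if "sought" in result.stdout:
        return False
    return True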

You'll notice I also switched to subprocess.run(), which is available in Python 3.5+.

If you do want to reimplement xargs, you will need to find the value of the kernel constant ARG_MAX and build a command-line list whose size never exceeds this limit. Then you could check after each iteration whether the output contains "sought", and quit immediately if it does.
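
A rough sketch of that reimplementation, under stated assumptions: the names batches_within and none_contain are invented for illustration, budget is a byte count the caller supplies (for example derived from ARG_MAX minus some headroom for the environment), and each argument is assumed to cost its encoded length plus one terminating NUL byte. It is not a full xargs replacement.

```python
import os
import subprocess


def batches_within(args, budget):
    """Yield lists of args whose encoded size stays within budget bytes."""
    batch, used = [], 0
    for arg in args:
        size = len(os.fsencode(arg)) + 1  # +1 for the terminating NUL byte
        if batch and used + size > budget:
            yield batch
            batch, used = [], 0
        batch.append(arg)
        used += size
    if batch:
        yield batch


def none_contain(base_command, files, budget, needle="sought"):
    """Run base_command over files in batches; stop at the first match."""
    # The fixed part of the command line is charged against every batch.
    fixed = sum(len(os.fsencode(part)) + 1 for part in base_command)
    for batch in batches_within(files, budget - fixed):
        output = subprocess.check_output(list(base_command) + batch)
        if needle in output.decode(errors="replace"):
            return False  # quit immediately, skipping the remaining batches
    return True
```

With the question's setup this would be called as none_contain([third_party_path, "-z"], files, budget). Unlike the xargs version, later batches are never started once a match is found.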
