How to use Python subprocess with bytes instead of files

Question

I can convert an mp4 to wav, using ffmpeg, by doing this:

ffmpeg -i test.mp4 -vn test.wav

I can also use subprocess to do the same, as long as my input and output are file paths.

But what if I wanted to use ffmpeg directly on bytes or a "file-like" object like io.BytesIO()?

Here's an attempt at it:

import subprocess
from io import BytesIO
b = BytesIO()

with open('test.mp4', 'rb') as stream:
    command = ['ffmpeg', '-i']
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=b)
    proc.communicate(input=stream.read())
    proc.wait()
    proc.stdin.close()
    proc.stdout.close()

This gives me:

---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
<ipython-input-84-0ddce839ebc9> in <module>
      5 with open('test.mp4', 'rb') as stream:
      6     command = ['ffmpeg', '-i']
----> 7     proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=b)
...
   1486                 # Assuming file-like object
-> 1487                 c2pwrite = stdout.fileno()
   1488 
   1489             if stderr is None:

UnsupportedOperation: fileno

Of course, I could use temp files to funnel my bytes, but I'd like to avoid writing to disk (because this step is just one link in a pipeline of transformations).

Answer 1

Based on @thorwhalen's answer, here's how it works from bytes to bytes. What you were probably missing, @thorwhalen, is the actual pipe-to-pipe way to send and receive data when interacting with a process. When sending bytes, stdin must be closed once everything has been written, so that the process sees end-of-input and can finish reading.

import shlex
import subprocess


def from_bytes_to_bytes(
        input_bytes: bytes,
        action: str = "-f wav -acodec pcm_s16le -ac 1 -ar 44100") -> bytes:
    command = f"ffmpeg -y -i /dev/stdin -f nut {action} -"
    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=False
    )
    # communicate() writes the bytes to the process's stdin, closes the pipe
    # (signalling end of input), and reads stdout until the process exits.
    # Writing and reading concurrently avoids a deadlock when the pipe
    # buffers fill up on large inputs.
    output, _ = ffmpeg_cmd.communicate(input_bytes)
    return output
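
The same pipe-to-pipe pattern can be tried out without ffmpeg installed. The following is only a minimal sketch, not part of the original answer: it uses a small Python child process as a stand-in for ffmpeg (an assumption made so the example runs anywhere), and `subprocess.run` with `input=` to write the bytes, close stdin, and collect stdout in one call. The helper name `pipe_bytes_through` is hypothetical.

```python
import subprocess
import sys
from typing import List


def pipe_bytes_through(command: List[str], input_bytes: bytes) -> bytes:
    """Send input_bytes to the command's stdin and return its stdout bytes."""
    completed = subprocess.run(
        command,
        input=input_bytes,       # written to stdin, which is then closed (EOF)
        stdout=subprocess.PIPE,  # captured in memory, nothing goes to disk
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


# Stand-in child process (assumption: used here instead of ffmpeg):
# read all of stdin, upper-case it, write the result to stdout.
CHILD = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]

result = pipe_bytes_through(CHILD, b"hello pipes")
print(result)
```

With ffmpeg, the command list would presumably look something like `['ffmpeg', '-i', 'pipe:0', '-f', 'wav', 'pipe:1']`, with the options adjusted to the formats involved.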

Answer 2

Here is a partial answer: three functions showing how this can be done from file to file (for completeness), from bytes to file, and from file to bytes. The bytes to bytes solution is fighting back though.

import shlex
import subprocess

def from_file_to_file(input_file: str, output_file: str, action="-f wav -acodec pcm_s16le -ac 1 -ar 44100"):
    command = f"ffmpeg -i {input_file} {action} -vn {output_file}"
    subprocess.call(shlex.split(command))


def from_file_to_bytes(input_file: str, action="-f wav -acodec pcm_s16le -ac 1 -ar 44100"):
    command = f"ffmpeg -i {input_file} {action} -"

    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdout=subprocess.PIPE,
        shell=False
    )
    # communicate() reads stdout until EOF and waits for ffmpeg to exit
    output, _ = ffmpeg_cmd.communicate()
    return output


def from_bytes_to_file(input_bytes, output_file, action="-f wav -acodec pcm_s16le -ac 1"):
    command = f"ffmpeg -i /dev/stdin {action} -vn {output_file}"
    ffmpeg_cmd = subprocess.Popen(
        shlex.split(command),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=False
    )
    ffmpeg_cmd.communicate(input_bytes)
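
One caveat with building the command as an f-string and passing it through `shlex.split` is that a path containing spaces gets split into several arguments. A minimal sketch of one way around this is to build the argument list directly and split only the `action` string, which holds ffmpeg options rather than paths. `build_ffmpeg_args` is a hypothetical helper, not something from the functions above.

```python
import shlex
from typing import List


def build_ffmpeg_args(
        input_file: str,
        output_file: str,
        action: str = "-f wav -acodec pcm_s16le -ac 1 -ar 44100") -> List[str]:
    """Return an argv list for ffmpeg in which each path is a single argument."""
    # Only the option string is split; the paths are passed through untouched.
    return ["ffmpeg", "-i", input_file, *shlex.split(action), "-vn", output_file]


args = build_ffmpeg_args("my clip.mp4", "my clip.wav")
print(args)
```

The resulting list can be handed to `subprocess.call(args)` in place of `shlex.split(command)`.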

Answer 3

This is the solution I came up with recently, although I had used AWS and GCP bucket objects as the input and output. I'm not an expert on Python by any means, but this got me the results I was after.

You need to install ffmpeg on your local machine and add it to your PATH environment variable to have access to ffmpeg.

If you're using the cloud, ffmpeg comes pre-installed on Google Cloud Functions, and for AWS there is a Lambda Layer in the repository library that you can leverage.

Hopefully someone gets use out of this. :)

import shlex
import subprocess

# tested against 'wav', 'mp3', 'flac', 'mp4'
desired_output = 'mp3'
track_input = 'C:\\Users\\.....\\track.wav'
track_output = f'C:\\Users\\......\\output_track.{desired_output}'

encoded_type = ''
format_for_conversion = desired_output

if desired_output == 'm4a':
    encoded_type = '-c:a aac'
    format_for_conversion = 'adts'

with open(track_input, 'rb') as in_track_file:
    input_track_data = in_track_file.read()

# pipe:0 refers to stdin, pipe:1 refers to stdout
ffmpeg_command = f'ffmpeg -i pipe:0 {encoded_type} -f {format_for_conversion} pipe:1'

# split into an argument list so the call also works on non-Windows systems
ffmpeg_process = subprocess.Popen(
    shlex.split(ffmpeg_command), stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# communicate() returns a (stdout, stderr) tuple
output_bytes, _ = ffmpeg_process.communicate(input_track_data)

# 'wb' truncates any existing file before writing
with open(track_output, 'wb') as f:
    f.write(output_bytes)
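
The format selection in the script can also be wrapped in a function that returns the argument list directly. This is only a rough sketch under the same assumptions as the script (stdin as `pipe:0`, stdout as `pipe:1`, and the `m4a` special case mapped to AAC in an ADTS stream); `ffmpeg_pipe_args` is a hypothetical name.

```python
from typing import List


def ffmpeg_pipe_args(desired_output: str) -> List[str]:
    """Build an ffmpeg argv that reads from stdin and writes to stdout."""
    args = ["ffmpeg", "-i", "pipe:0"]
    output_format = desired_output
    if desired_output == "m4a":
        # Same special case as in the script above: AAC audio, ADTS container
        args += ["-c:a", "aac"]
        output_format = "adts"
    args += ["-f", output_format, "pipe:1"]
    return args


print(ffmpeg_pipe_args("mp3"))
print(ffmpeg_pipe_args("m4a"))
```

The returned list can be passed straight to `subprocess.Popen` together with `stdin=subprocess.PIPE` and `stdout=subprocess.PIPE`.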

Answer 1: score 8, accepted, 2020-05-11 16:24:53

Answer 2: score 3, 2020-05-11 12:59:05

Answer 3: score 2, 2020-09-12 13:56:30
