Obtaining frames from IP Camera with low latency

I am currently using this command to get frames from my RTSP stream and read them from stdout:

ffmpeg -nostdin -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -

However, I would like to get the same latency as when I view the stream with ffplay:

ffplay -fflags nobuffer -flags low_delay -tune zerolatency -framedrop -rtsp_transport tcp <rtsp_stream>

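Or as when I play it in VLC via Media > Open Network Stream with :network_caching=300ms.

I would like to know which other parameters I can pass to my ffmpeg command to get a result equivalent to (or better than) the ffplay command.

I have consulted: How to dump raw RTSP stream to file?, Open CV RTSP camera buffer lag, How to pipe output from ffmpeg using python?, Bad ffmpeg performance compared to ffplay and VLC, How to minimize the delay in a live streaming with ffmpeg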


import subprocess
import numpy as np

FFMPEG_CMD = "ffmpeg -nostdin -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -".split(" ")
WIDTH = 2560
HEIGHT = 1440

process = subprocess.Popen(FFMPEG_CMD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)

while True:
    # Read exactly one bgr24 frame (3 bytes per pixel) from ffmpeg's stdout.
    raw_frame = process.stdout.read(WIDTH*HEIGHT*3)
    if len(raw_frame) != WIDTH*HEIGHT*3:
        break  # ffmpeg exited or the stream ended
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((HEIGHT, WIDTH, 3))

    # <do stuff with frame / show frame etc.>
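For reference, the read size used in the loop is the size of one raw bgr24 frame: width times height times 3 bytes per pixel. A small sketch of that arithmetic (the helper name raw_frame_bytes is made up for illustration):

```python
def raw_frame_bytes(width, height, bytes_per_pixel=3):
    """Size in bytes of one packed raw frame (bgr24 uses 3 bytes per pixel)."""
    return width * height * bytes_per_pixel


frame_bytes = raw_frame_bytes(2560, 1440)  # 11059200 bytes, about 10.5 MiB
```

At this resolution every frame pushes roughly 10.5 MiB through the pipe, so how the bytes are read on the Python side can matter.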


The ffmpeg command I am using now has a latency of less than 1 second.

ffmpeg -nostdin -flags low_delay -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -


import subprocess
import numpy as np

FFMPEG_CMD = "ffmpeg -nostdin -flags low_delay -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -".split(" ")
WIDTH = 2560
HEIGHT = 1440

process = subprocess.Popen(FFMPEG_CMD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)

raw_frame = np.empty((HEIGHT, WIDTH, 3), np.uint8) 
frame_bytes = memoryview(raw_frame).cast("B")

while process.poll() is None:
    process.stdout.readinto(frame_bytes)  # fill raw_frame in place with the next frame
    frame = raw_frame.reshape((HEIGHT, WIDTH, 3))

    # <do stuff with frame / show frame etc.>

According to another answer of mine, the relevant FFmpeg flags are -probesize 32 and -flags low_delay.


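Adding the -tune zerolatency argument reduces the encoder latency to a minimum, but the required bandwidth is much higher (and it is probably not relevant for streaming over the internet).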
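As a minimal sketch, the decoder-side flags could be assembled in Python like this. The helper name build_ffmpeg_cmd and the example URL are placeholders, not part of the original commands; -fflags nobuffer is included because the test command in this answer also uses it. The encoder-side -tune zerolatency is left out on purpose, since it only applies when you control the encoder.

```python
def build_ffmpeg_cmd(rtsp_url, low_latency=True):
    """Return an argv list that decodes rtsp_url to raw bgr24 frames on stdout."""
    cmd = ["ffmpeg", "-nostdin"]
    if low_latency:
        # Input (decoder-side) options: they must come before -i to apply to the input.
        cmd += ["-probesize", "32", "-flags", "low_delay", "-fflags", "nobuffer"]
    cmd += [
        "-rtsp_transport", "tcp",
        "-i", rtsp_url,
        "-an",                  # drop audio
        "-pix_fmt", "bgr24",    # OpenCV-style pixel layout
        "-vcodec", "rawvideo",
        "-f", "rawvideo",
        "pipe:",                # write frames to stdout
    ]
    return cmd


cmd = build_ffmpeg_cmd("rtsp://camera.example/stream")  # placeholder URL
```

Passing the list directly to subprocess.Popen avoids splitting a command string on spaces, which breaks as soon as the URL or a filter contains a space.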


To compare the latency difference between FFplay and FFmpeg (as decoders), I created a "self-contained" test sample.


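  • Execute an FFmpeg sub-process that streams two RTSP output streams in parallel.
    The RTSP IP address is 127.0.0.1 (localhost).
    (Note: we could probably use the tee muxer instead of encoding twice, but I have never tried it.)
  • Execute an FFplay sub-process to decode and display one video stream.
  • Execute an FFmpeg sub-process to decode the other video stream.
    OpenCV imshow is used to display the video.
  • The displayed video with the larger counter is the one with the lower latency.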


import cv2
import numpy as np
import subprocess as sp
import shlex

rtsp_stream0 = 'rtsp://'  # Use localhost for testing 
rtsp_stream1 = 'rtsp://'
width = 256  # Use low resolution (for testing).
height = 144
fps = 30

# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')

# FFplay command before updating the code (latency is still too high):  
# ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# Updated FFplay command - adding "-vf setpts=0" (fixing the latency issue):
# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -vf setpts=0 -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# Execute FFplay to be used as a reference
ffplay_process = sp.Popen(ffplay_cmd)

# Open sub-process that gets in_stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL

# The following FFmpeg sub-process streams RTSP video.
# The video is a synthetic video with a frame counter (that counts every frame) at 30fps.
# The arguments of the encoder are almost the default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
                                               '-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
                                               'split[v0][v1] '  # Split the input into [v0] and [v1]
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v0]" -an {rtsp_stream0} '
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v1]" -an {rtsp_stream1}'))

while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the byte read into a numpy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show frame for testing
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)

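    if key == 27:  # Esc key: exit the loop
        break

# Clean up: stop the sub-processes and close the OpenCV window.
process.terminate()
ffplay_process.terminate()
rtsp_streaming_process.terminate()
cv2.destroyAllWindows()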

Sample output before adding -vf setpts=0:


Before adding -vf setpts=0 to the FFplay command, the FFmpeg-OpenCV latency appears to be lower by about 6 frames.
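Because the test stream runs at 30 fps and every frame carries its own counter, a difference in counters converts directly into time. A tiny sketch of that conversion (the function name frames_to_ms is made up for illustration):

```python
def frames_to_ms(frame_count, fps):
    """Convert a lag measured in frames to milliseconds at the given frame rate."""
    return frame_count * 1000.0 / fps


lag_ms = frames_to_ms(6, 30)  # 200.0
```

So a 6-frame difference at 30 fps corresponds to about 200 ms.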



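Adding -vf setpts=0 solved the latency issue.

This may not be a good idea when an audio stream is present, but when the lowest video latency is required, it is the best solution I could find.

After adding -vf setpts=0, the latency of FFplay and OpenCV is about the same: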


Repeating the test with the mpv media player:


When all the mpv "latency hacks" from this page are applied, the latency of mpv and OpenCV is about the same:


There must be a solution with FFplay, but I can't find it...

Code sample (using mpv instead of FFplay):

import cv2
import numpy as np
import subprocess as sp
import shlex

rtsp_stream0 = 'rtsp://'  # Use localhost for testing 
rtsp_stream1 = 'rtsp://'
width = 256  # Use low resolution (for testing).
height = 144
fps = 30

# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')

# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
#ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# https://github.com/mpv-player/mpv/issues/4213
mpv_cmd = shlex.split(f'mpv --demuxer-lavf-o=rtsp_flags=listen --rtsp-transport=tcp --profile=low-latency --no-cache --untimed --no-demuxer-thread --vd-lavc-threads=1 {rtsp_stream1}')

# Execute FFplay to be used as a reference
#ffplay_process = sp.Popen(ffplay_cmd)

# Execute mpv media player (as reference)
mpv_process = sp.Popen(mpv_cmd)

# Open sub-process that gets in_stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL

# The following FFmpeg sub-process streams RTSP video.
# The video is a synthetic video with a frame counter (that counts every frame) at 30fps.
# The arguments of the encoder are almost the default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
                                               '-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
                                               'split[v0][v1] '  # Split the input into [v0] and [v1]
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v0]" -an {rtsp_stream0} '
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v1]" -an {rtsp_stream1}'))

while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the byte read into a numpy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show frame for testing
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)

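    if key == 27:  # Esc key: exit the loop
        break

# Clean up: stop the sub-processes and close the OpenCV window.
process.terminate()
mpv_process.terminate()
rtsp_streaming_process.terminate()
cv2.destroyAllWindows()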

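Assuming the bottleneck really is somewhere in your example code (and not in <do stuff with frame/ show frame etc.>), you can try updating the numpy array in place instead of creating a new one each time: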

frame = np.empty((HEIGHT, WIDTH, 3), np.uint8) 
frame_bytes = memoryview(frame).cast("b")
while True:
    process.stdout.readinto(frame_bytes) # fills the buffer of frame
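A slightly more defensive variant checks how many bytes were actually written, so that a truncated final frame is not mistaken for a valid one. This is only a sketch: read_exact is a hypothetical helper, io.BytesIO stands in for process.stdout, and a tiny 4x2 frame is used so the example runs without a camera.

```python
import io


def read_exact(stream, view):
    """Fill `view` completely from `stream`; return False if the stream ends first."""
    filled = 0
    while filled < len(view):
        n = stream.readinto(view[filled:])
        if not n:  # 0 or None: end of stream, or no data available
            return False
        filled += n
    return True


FRAME_BYTES = 4 * 2 * 3            # width * height * 3 bytes (bgr24) for a toy frame
buf = bytearray(FRAME_BYTES)       # preallocated once, reused for every frame
view = memoryview(buf)

# Stand-in for ffmpeg's stdout: one complete frame followed by a truncated one.
stream = io.BytesIO(bytes(range(FRAME_BYTES)) + b"\x00" * 5)

first_ok = read_exact(stream, view)    # complete frame
first_frame = bytes(buf)
second_ok = read_exact(stream, view)   # stream ends after 5 bytes
```

With NumPy, the same helper could be given memoryview(frame).cast("B") for a preallocated frame array, so the pixels land directly in the array without an intermediate copy.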


