Obtaining frames from IP Camera with low latency

I am currently using this command to get frames from my RTSP stream and read the frames from stdout:

ffmpeg -nostdin -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -

However, I would like to get the same latency as when I see it via ffplay:

ffplay -fflags nobuffer -flags low_delay -tune zerolatency -framedrop -rtsp_transport tcp <rtsp_stream>

or when I play it via VLC (Media > Open Network Stream) with network_caching=300ms.

I would like to know what other parameters I can use with my ffmpeg command to get an equivalent (or better) result compared to the ffplay command.

I have made references from: How to dump raw RTSP stream to file?, Open CV RTSP camera buffer lag, How to pipe output from ffmpeg using python?, bad ffmpeg performance compared to ffplay and VLC, and How to minimize the delay in a live streaming with ffmpeg.

My current implementation:

FFMPEG_CMD = "ffmpeg -nostdin -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -".split(" ")
WIDTH = 2560
HEIGHT = 1440

process = subprocess.Popen(FFMPEG_CMD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)

while True:
    raw_frame = process.stdout.read(WIDTH*HEIGHT*3)
    frame = np.frombuffer(raw_frame, np.uint8) 
    frame = frame.reshape((HEIGHT, WIDTH, 3))

    <do stuff with frame/ show frame etc.>

Thanks for reading.


The ffmpeg command I am now using for < 1s latency:

ffmpeg -nostdin -flags low_delay -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -

Implementation with suggestion(s) from Answers:

import subprocess
import numpy as np

FFMPEG_CMD = "ffmpeg -nostdin -flags low_delay -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -".split(" ")
WIDTH = 2560
HEIGHT = 1440

process = subprocess.Popen(FFMPEG_CMD, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)

raw_frame = np.empty((HEIGHT, WIDTH, 3), np.uint8) 
frame_bytes = memoryview(raw_frame).cast("B")

while process.poll() is None:
    process.stdout.readinto(frame_bytes)
    frame = raw_frame.reshape((HEIGHT, WIDTH, 3))

    # <do stuff with frame / show frame etc.>
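
A small caveat: readinto on a pipe may return fewer bytes than requested (for example when the stream ends). Below is a minimal sketch that reads exactly one frame per iteration, reusing the imports, WIDTH, HEIGHT and process from the snippet above (read_exact is a hypothetical helper added here only for illustration):

def read_exact(stream, buf):
    # Keep calling readinto() until the buffer is full or the stream ends.
    view = memoryview(buf)
    while len(view) > 0:
        n = stream.readinto(view)
        if not n:  # 0 (or None) means the pipe has closed
            return False
        view = view[n:]
    return True

raw_frame = np.empty((HEIGHT, WIDTH, 3), np.uint8)
frame_bytes = memoryview(raw_frame).cast("B")

while process.poll() is None:
    if not read_exact(process.stdout, frame_bytes):
        break  # incomplete frame - the stream has ended
    # <do stuff with raw_frame / show frame etc.>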

I did some research on reducing the video latency.
My answer below demonstrates that the relevant FFmpeg flags are -probesize 32 and -flags low_delay.

The above flags are relevant for the video decoder side (receiver side).
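For reference, adding these two flags to the command from the question gives something like this (a sketch only; <rtsp_stream> is a placeholder):

ffmpeg -nostdin -probesize 32 -flags low_delay -rtsp_transport tcp -i <rtsp_stream> -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo -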

The video encoding parameters on the "transmitter / encoder side" are more significant for determining the end-to-end latency.
Adding the argument -tune zerolatency reduces the encoder latency to a minimum, but the required bandwidth is much higher (and probably not suitable for streaming over the internet).
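For illustration only (the input and output below are hypothetical placeholders, not a tested command), an encoder-side invocation using that tune could look like:

ffmpeg -i <input> -vcodec libx264 -tune zerolatency -f rtsp <rtsp_output_url>
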
I am going to restrict my answer to decoding latency, because it seems more relevant to the topic of your question.

How others obtain video frames with low latency is a subject for a separate question (and I don't know the answer).


For comparing the latency differences between FFplay and FFmpeg (decoder), I created a "self-contained" test sample.

Main "principles":主要“原则”:

  • Execute an FFmpeg sub-process that streams two RTSP output streams in parallel.
    The streamed video is a synthetic pattern with a frame counter drawn as text over the video.
    The two output streams apply the same encoding parameters (only the port is different).
    The RTSP IP address is 127.0.0.1 (localhost).
    (Note: We may use the tee muxer instead of encoding twice, but I never tried it - see the sketch after this list.)
  • Execute an FFplay sub-process to decode and display one video stream.
  • Execute an FFmpeg sub-process to decode the other video stream.
    OpenCV imshow is used for displaying the video.
  • The displayed video with the larger counter is the one with the lower latency.
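
Regarding the note about the tee muxer: an untested sketch (as the note says, I never tried it) that encodes once and feeds both RTSP outputs could look roughly like this; the drawtext counter overlay and per-output RTSP options from the sample below are omitted:

ffmpeg -re -f lavfi -i testsrc=size=256x144:rate=30 -vcodec libx264 -pix_fmt yuv420p -g 30 -f tee -map 0:v "[f=rtsp]rtsp://127.0.0.1:21415/live.stream|[f=rtsp]rtsp://127.0.0.1:31415/live.stream"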

Code sample (updated):

import cv2
import numpy as np
import subprocess as sp
import shlex


rtsp_stream0 = 'rtsp://127.0.0.1:21415/live.stream'  # Use localhost for testing 
rtsp_stream1 = 'rtsp://127.0.0.1:31415/live.stream'
width = 256  # Use low resolution (for testing).
height = 144
fps = 30

# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')


# FFplay command before updating the code (latency is still too high):  
# ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# Updated FFplay command - adding "-vf setpts=0" (fixing the latency issue):
# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -vf setpts=0 -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# Execute FFplay to be used as a reference
ffplay_process = sp.Popen(ffplay_cmd)

# Open a sub-process that reads the RTSP stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL


# The following FFmpeg sub-process streams RTSP video.
# The video is synthetic video with frame counter (that counts every frame) at 30fps.
# The arguments of the encoder are almost default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
                                               '-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
                                               'split[v0][v1] '  # Split the input into [v0] and [v1]
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v0]" -an {rtsp_stream0} '
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v1]" -an {rtsp_stream1}'))


while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the byte read into a numpy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show frame for testing
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)

    if key == 27:
        break
  
process.stdout.close()
process.wait()
ffplay_process.kill()
rtsp_streaming_process.kill()
cv2.destroyAllWindows()

Sample output before adding -vf setpts=0:

Sample output (left side is OpenCV and right side is FFplay):
[screenshot]

It looks like the FFmpeg-OpenCV latency is lower by 6 frames before adding -vf setpts=0 to the FFplay command.
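(At 30 fps, a 6-frame difference corresponds to roughly 200 ms.)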

Note: It took me some time to find the solution, and I decided to keep the result of the original post to show the importance of adding the setpts filter.


Update:

Adding -vf setpts=0 solved the latency issue.

The latest answer from the following post suggests adding the setpts video filter, which resets all the video timestamps to zero.
It may not be a good idea in the presence of an audio stream, but when the lowest video latency is required, this is the best solution I could find.

After adding -vf setpts=0 the latency of FFplay and OpenCV is about the same:

[screenshot]


Repeating the test with the mpv media player:

(Note: It seemed more relevant before I found the FFplay solution).

When applying all the mpv "latency hacks" from this page, the latency of mpv and OpenCV is about the same:

[screenshot]

There must be a solution with FFplay, but I can't find it...


Code sample (using mpv instead of FFplay):

import cv2
import numpy as np
import subprocess as sp
import shlex

rtsp_stream0 = 'rtsp://127.0.0.1:21415/live.stream'  # Use localhost for testing 
rtsp_stream1 = 'rtsp://127.0.0.1:31415/live.stream'
width = 256  # Use low resolution (for testing).
height = 144
fps = 30

# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')

# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
#ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')

# https://github.com/mpv-player/mpv/issues/4213
mpv_cmd = shlex.split(f'mpv --demuxer-lavf-o=rtsp_flags=listen --rtsp-transport=tcp --profile=low-latency --no-cache --untimed --no-demuxer-thread --vd-lavc-threads=1 {rtsp_stream1}')

# Execute FFplay to be used as a reference
#ffplay_process = sp.Popen(ffplay_cmd)

# Execute mpv media player (as reference)
mpv_process = sp.Popen(mpv_cmd)

# Open a sub-process that reads the RTSP stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL


# The following FFmpeg sub-process streams RTSP video.
# The video is synthetic video with frame counter (that counts every frame) at 30fps.
# The arguments of the encoder are almost default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
                                               '-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
                                               'split[v0][v1] '  # Split the input into [v0] and [v1]
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v0]" -an {rtsp_stream0} '
                                               '-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
                                              f'-map "[v1]" -an {rtsp_stream1}'))


while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the byte read into a numpy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show frame for testing
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)

    if key == 27:
        break
  
process.stdout.close()
process.wait()
#ffplay_process.kill()
mpv_process.kill()
rtsp_streaming_process.kill()
cv2.destroyAllWindows()

Assuming that the bottleneck is indeed somewhere in your example code (and not in <do stuff with frame/ show frame etc.>), you can try updating a numpy array in place instead of creating a new one every time:

frame = np.empty((HEIGHT, WIDTH, 3), np.uint8) 
frame_bytes = memoryview(frame).cast("b")
while True:
    process.stdout.readinto(frame_bytes) # fills the buffer of frame
    ...
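
Note that because the buffer is reused, the contents of frame are overwritten on every iteration; if a frame has to be kept across iterations, copy it first (e.g. with frame.copy()).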
