
Read a .h264 file

I would appreciate your help with a problem I have.

Goal: read a .h264 file (I extracted the raw bitstream to a file using ffmpeg) in Python and store it in some data structure (probably a list, but I'm open to suggestions).

I want to read the data as hexadecimal. The original post showed a screenshot here of what the data looks like.

What I want is to put each byte (2 hexadecimal digits) into a list or some other data structure. Any step forward will help me, though.

My attempts: first, I tried reading the file the way I know:

with open(path, 'r') as fp:
    data = fp.read()

That didn't work; I got just an empty string.

After a lot of changes, I tried something else that I saw online:

with open(path, 'r') as fp:
    hex_list = ["{:02}".format(ord(c)) for c in fp.read()]

I still got an empty list.

I would appreciate your help. Thanks a lot.

EDIT: Thanks to the comment below, I tried opening the file with 'rb', but still no luck.

If you have an h264 mp4 file, you can open it and get a hexadecimal string representation using binascii.hexlify(), like this:

import binascii
with open('test.mp4', 'rb') as fin:
    hexa = binascii.hexlify(fin.read())
    print(hexa[0:1000])

hexa will be a Python bytes object, and you can easily get back the binary data by calling binascii.unhexlify(hexa). This is much more efficient, in both space and time, than storing the hex representation as strings in a list(). You can access the bytes object with indices and slices, so whatever you intended to do with the list will probably work fine with it (it will just be much faster and use a lot less memory).
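If a list with one entry per byte is still what you want, the sketch below shows one way to build it. The function name read_hex_bytes and the temporary sample file are hypothetical and exist only so the example runs on its own; substitute the path to your own .h264 file.

```python
import os
import tempfile


def read_hex_bytes(path):
    """Return the file's contents as a list of two-digit hex strings."""
    # 'rb' matters here: the file is raw binary data, not text.
    with open(path, 'rb') as fp:
        data = fp.read()
    # Iterating over a bytes object yields integers in the range 0..255.
    return ['{:02x}'.format(b) for b in data]


# Hypothetical sample bytes standing in for a real .h264 file.
sample = bytes([0x00, 0x00, 0x00, 0x01, 0x67, 0x64])
with tempfile.NamedTemporaryFile(suffix='.h264', delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

hex_list = read_hex_bytes(tmp_path)
os.remove(tmp_path)
print(hex_list)
```

For a large file this list uses far more memory than the bytes object itself, so it is probably only worth building when you really need separate strings.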

One thing to keep in mind, though: to get the first hexadecimal digit from a bytes object, you don't use hexa[0] but rather hexa[0:1]. To get the first pair of hexadecimal digits (one byte), use hexa[0:2]. The second byte is hexa[2:4], and so on. As explained in the docs for hex():

Since bytes objects are sequences of integers (akin to a tuple), for a bytes object b, b[0] will be an integer, while b[0:1] will be a bytes object of length 1. (This contrasts with text strings, where both indexing and slicing will produce a string of length 1.)
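The difference between indexing and slicing can be seen in a small sketch. The three sample bytes below are made up for illustration and are not taken from a real file.

```python
import binascii

raw = bytes([0x00, 0x01, 0x67])   # hypothetical sample bytes
hexa = binascii.hexlify(raw)      # two hex digits per input byte

first_int = hexa[0]        # indexing gives an int (the ASCII code of the digit)
first_digit = hexa[0:1]    # slicing gives a bytes object of length 1
first_byte = hexa[0:2]     # first pair of hex digits
second_byte = hexa[2:4]    # second pair of hex digits

# Split the whole hex string into two-digit pairs, one per original byte.
pairs = [hexa[i:i + 2] for i in range(0, len(hexa), 2)]

# unhexlify reverses hexlify and returns the original binary data.
restored = binascii.unhexlify(hexa)
print(first_int, first_digit, first_byte, second_byte, pairs)
```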
