
Reading HEIF/HEIC image XMP metadata byte by byte

I am trying to build a native byte parser that, given an HEIF image, returns its metadata (mainly the width and height of the image). I am struggling to find the right documentation and specs for parsing this information. I have to do this for both XMP and EXIF metadata, but let's focus only on XMP for now.

What I need is the exact byte structure: where to find what. According to the HEIF international standard document (here):

For image items, XMP metadata shall be stored as an item of item_type value 'mime' and content type 'application/rdf+xml'. The body of the item shall be a valid XMP document, in XML form.

Perfect. If I analyse a sample image, I can find that marker:

[Screenshot: the 'mime' marker in the sample image]

From here on, I can't find anywhere how to get the info I need. I would expect something saying "the first 2 bytes are the header, with marker 0xFF 0xCE (just an example), the next 2 bytes are the width, and the following 2 bytes are the height, etc." In my case I am going by intuition. My sample image has dimensions 8736x5856. If I search in the tool for the big-endian 2-byte integer 8736, I can find it:

[Screenshot: search result for the big-endian 2-byte integer 8736]

And hey, 2 bytes later there is the height, 5856, as well:

[Screenshot: the value 5856 two bytes after the width]

But again, I arrived here by luck and intuition. I need a proper schema that tells me where to find what, in a form that I can translate into code.

What I think you're seeing is a "mime" and an "ispe" MP4 box, since HEIF is based on ISOBMFF. I would recommend looking at the file with an MP4-capable tool such as mp4dump, HexFiend, or fq (note: my tool). The "ispe" (Image Spatial Extents) box is probably what you want to read.

fq does not support the ispe box yet, but you could read it like this:

$ fq 'grep_by(.type=="ispe").data | tobytes | [.[-8:-4], .[-4:] | tonumber]' file.heif
[
  8736,
  5856
]
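For reference, the same decoding can be sketched in Python. This assumes the usual ISOBMFF FullBox layout for "ispe": after the 8-byte box header (size and type) come 4 bytes of version and flags, then a 32-bit big-endian width and a 32-bit big-endian height. That would also explain why the 2-byte value 8736 is followed two bytes later by 5856: the two bytes in between are the zero high half of the 32-bit height. The function name and the sample bytes are made up for illustration.

```python
import struct
from typing import Tuple


def decode_ispe_payload(payload: bytes) -> Tuple[int, int]:
    """Decode (width, height) from the body of an 'ispe' box.

    `payload` is assumed to be everything after the 8-byte box header,
    i.e. version/flags (4 bytes), width (4 bytes), height (4 bytes).
    """
    if len(payload) < 12:
        raise ValueError("ispe payload must be at least 12 bytes")
    # Mirror the fq expression: take the last 8 bytes as two big-endian uint32.
    width, height = struct.unpack(">II", payload[-8:])
    return width, height


# Hand-built sample payload: version/flags = 0, width = 0x2220, height = 0x16E0.
sample = bytes.fromhex("00000000" "00002220" "000016e0")
print(decode_ispe_payload(sample))  # (8736, 5856)
```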

So what you need is probably a basic ISOBMFF reader that looks for the "ispe" box and decodes it. If you're only looking for the first occurrence of a specific box, you can probably ignore the fact that ISOBMFF is a tree structure.
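A minimal sketch of such a reader in Python might look like the following. It assumes the standard ISOBMFF box header (a 32-bit big-endian size followed by a 4-character type, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end of the enclosing data"), and that "ispe" sits under meta → iprp → ipco, with "meta" being a FullBox whose children start after 4 bytes of version and flags. All function names and the synthetic test file are hypothetical. Note that it returns the first "ispe" it finds, which in files with thumbnails or tiled images is not necessarily the one belonging to the primary image.

```python
import struct
from typing import Iterator, Optional, Tuple

# Boxes to descend into, mapped to the bytes to skip before their children
# ('meta' is a FullBox, so 4 bytes of version/flags come first).
CONTAINERS = {b"meta": 4, b"iprp": 0, b"ipco": 0}


def iter_boxes(buf: bytes, start: int, end: int) -> Iterator[Tuple[bytes, int, int]]:
    """Yield (type, body_start, body_end) for each box in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # 64-bit size stored right after the type
            if pos + 16 > end:
                return
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:  # box runs to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            return  # malformed or truncated; stop rather than guess
        yield box_type, pos + header, pos + size
        pos += size


def find_ispe(buf: bytes, start: int = 0, end: Optional[int] = None) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the first 'ispe' box found, or None."""
    if end is None:
        end = len(buf)
    for box_type, body_start, body_end in iter_boxes(buf, start, end):
        if box_type == b"ispe" and body_end - body_start >= 12:
            _version_flags, width, height = struct.unpack_from(">III", buf, body_start)
            return width, height
        if box_type in CONTAINERS:
            found = find_ispe(buf, body_start + CONTAINERS[box_type], body_end)
            if found is not None:
                return found
    return None


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Helper for building a synthetic box (illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic file: ftyp, then meta > iprp > ipco > ispe.
ispe = make_box(b"ispe", struct.pack(">III", 0, 8736, 5856))
meta = make_box(b"meta", b"\x00" * 4 + make_box(b"iprp", make_box(b"ipco", ispe)))
demo = make_box(b"ftyp", b"heic" + b"\x00" * 4 + b"mif1heic") + meta
print(find_ispe(demo))  # (8736, 5856)
```

For a real file, one would read it with `open(path, "rb").read()` and pass the bytes to `find_ispe`. Picking the right "ispe" for the primary item would additionally require reading the item property association data, which this sketch leaves out.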
