Hachoir - Retrieving data from groups

Question

I am trying to use Hachoir to retrieve metadata from a video file. It works reasonably well, except when I use 'get' or similar to return the width and height values.

I assumed it would be:

metadata.get('width')

But this throws an error (object does not have 'width' property).

When I run the following:

for data in sorted(metadata):
    if len(data.values) > 0:
        print data.key, data.values[0].value

All that is returned is the information from the "Common" group.

When I use:

metadata.exportPlaintext()

... the information from "Common", "Video stream" and "Audio stream" is returned. I could simply parse the resulting text and pull out the height and width values, but I would rather do it properly using metadata.get('width') or similar.

Looking at the source code, I thought I could use the following:

for key, metadata in metadata.__groups.iteritems():

to iterate through the .__groups in the metadata, but it then throws "'AsfMetadata' object has no attribute '__groups'", which I'm sure shouldn't be the case, as I thought 'AsfMetadata' was a subclass of MultipleMetadata(), which does have such a variable.

I am probably missing something quite obvious.

Answer 1

This seems less straightforward for a WMV file. I have turned the metadata for such videos into a defaultdict, which makes it more straightforward to get the image width:

from collections import defaultdict
from pprint import pprint

from hachoir_metadata import metadata
from hachoir_core.cmd_line import unicodeFilename
from hachoir_parser import createParser

# using this example http://archive.org/details/WorkToFishtestwmv
filename = './test_wmv.wmv' 
filename, realname = unicodeFilename(filename), filename
parser = createParser(filename)

# See what keys you can extract
for k,v in metadata.extractMetadata(parser)._Metadata__data.iteritems():
    if v.values:
        print v.key, v.values[0].value

# Turn the tags into a defaultdict
metalist = metadata.extractMetadata(parser).exportPlaintext()
meta = defaultdict(defaultdict)
for item in metalist:
    if item.endswith(':'):
        k = item[:-1]
    else:
        # Split on the first ': ' only, so values containing a colon stay intact
        tag, value = item.split(': ', 1)
        tag = tag[2:]
        meta[k][tag] = value

print meta['Video stream #1']['Image width'] # 320 pixels
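
The parsing loop above can be tried without a video file. The sketch below wraps the same logic in a function (plaintext_to_dict is a hypothetical name) and feeds it hand-written lines shaped like the output the loop expects: a header line ending in ':' followed by '- Tag: value' items. The sample values are made up for illustration and do not come from a real file.

```python
from collections import defaultdict


def plaintext_to_dict(lines):
    """Group 'Header:' / '- Tag: value' lines into a nested dict."""
    meta = defaultdict(dict)
    group = None
    for line in lines:
        if line.endswith(':'):
            # A group header such as "Video stream #1:"
            group = line[:-1]
        elif group is not None and ': ' in line:
            # An item such as "- Image width: 320 pixels";
            # split on the first ': ' only and drop the leading "- "
            tag, value = line.split(': ', 1)
            meta[group][tag[2:]] = value
    return meta


# Made-up sample lines in the assumed output format
sample = [
    'Common:',
    '- Duration: 10 sec',
    'Video stream #1:',
    '- Image width: 320 pixels',
    '- Image height: 240 pixels',
]
meta = plaintext_to_dict(sample)
print(meta['Video stream #1']['Image width'])
```

The values stay as display strings (for example '320 pixels'), so this route still needs extra parsing if numbers are required.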

Answer 2

To get width x height from the first top-level metadata group that has the size info in the media file, without accessing private attributes and without parsing the text output, you could use file_metadata.iterGroups():

#!/usr/bin/env python
import sys
from itertools import chain

# $ pip install hachoir-{core,parser,metadata}
from hachoir_core.cmd_line import unicodeFilename
from hachoir_metadata import extractMetadata
from hachoir_parser import createParser

file_metadata = extractMetadata(createParser(unicodeFilename(sys.argv[1])))
it = chain([file_metadata], file_metadata.iterGroups())
print("%sx%s" % next((metadata.get('width'), metadata.get('height'))
                     for metadata in it
                     if metadata.has('width') and metadata.has('height')))
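
As background on why the question's metadata.__groups lookup fails: Python rewrites a double-underscore attribute assigned inside a class body to _ClassName__attr, which is also why Answer 1 reaches for _Metadata__data. The stand-in classes below are hypothetical, not Hachoir's real ones, and the sketch assumes the attribute is written as self.__groups inside the base class.

```python
class MultipleMetadataDemo(object):
    """Hypothetical stand-in for a class with a private attribute."""

    def __init__(self):
        # Stored on the instance as _MultipleMetadataDemo__groups
        self.__groups = {'video[1]': 'Video stream #1'}


class AsfMetadataDemo(MultipleMetadataDemo):
    """Hypothetical subclass; it inherits the mangled attribute."""


demo = AsfMetadataDemo()

try:
    demo.__groups  # no mangling outside a class body, so this fails
except AttributeError as err:
    print("lookup failed: %s" % err)

# The mangled name uses the class that defined the attribute
print(demo._MultipleMetadataDemo__groups)
```

Going through a public method such as iterGroups(), as in the code above, avoids depending on this naming detail.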

To convert metadata into a dictionary (non-recursively, i.e., iterate over the groups manually if needed):

def metadata_as_dict(metadata):
    # One value -> scalar, several values -> list, items without values are skipped
    return {item.key: ([v.value for v in item.values]
                       if len(item.values) > 1
                       else item.values[0].value)
            for item in metadata if item.values}
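
To see what the helper returns without a media file, it can be run against stand-in objects. Item and Value below are hypothetical namedtuples that only mimic the key, values and value attributes the helper reads; they are not Hachoir classes, and the sample data is invented.

```python
from collections import namedtuple

# Hypothetical stand-ins exposing only the attributes the helper reads
Value = namedtuple('Value', 'value')
Item = namedtuple('Item', 'key values')


def metadata_as_dict(metadata):
    # One value -> scalar, several values -> list, no values -> skipped
    return {item.key: ([v.value for v in item.values]
                       if len(item.values) > 1
                       else item.values[0].value)
            for item in metadata if item.values}


fake_metadata = [
    Item('width', [Value(320)]),
    Item('language', [Value('en'), Value('fr')]),
    Item('comment', []),
]
result = metadata_as_dict(fake_metadata)
print(result)
```

Because a key maps to either a scalar or a list depending on how many values it has, callers that need a uniform shape may prefer to always build the list.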

Answer 1
4 accepted 2013-01-27 13:00:14

Answer 2
4 2014-10-13 22:51:27
