Python lxml: how to use XPath to post-process XPath results

Question

I have the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Unified Streaming Platform(version=1.7.8) -->
<MPD
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:mpeg:dash:schema:mpd:2011"
        xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
        type="static"
        mediaPresentationDuration="PT1H43M36.832S"
        maxSegmentDuration="PT3S"
        minBufferTime="PT10S"
        profiles="urn:mpeg:dash:profile:isoff-live:2011,urn:com:dashif:dash264">
    <Period>
        <BaseURL>dash/</BaseURL>
        <AdaptationSet group="1" contentType="audio" lang="tr" minBandwidth="157405" maxBandwidth="157405"
                       segmentAlignment="true" audioSamplingRate="48000" mimeType="audio/mp4" codecs="mp4a.40.2">
            <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2">
            </AudioChannelConfiguration>
            <Representation id="audio_tur=157405" bandwidth="157405">
            </Representation>
        </AdaptationSet>
        <AdaptationSet group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000"
                       minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true"
                       frameRate="25" mimeType="video/mp4" startWithSAP="1">
            <Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3"
                            codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3"
                            codecs="avc1.4D4028" scanType="progressive">
            </Representation>
            <Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028"
                            scanType="progressive">
            </Representation>
        </AdaptationSet>
        <AdaptationSet
                group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000"
                minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33"
                frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
            <Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
            </Representation>
        </AdaptationSet>
    </Period>
</MPD>

And I run the following Python code on it:

from lxml import etree

file = "Data.xml"
namespaces = {'ns':'urn:mpeg:dash:schema:mpd:2011'}

tree = etree.parse(file)
root = tree.getroot()

for r in root.xpath('//ns:AdaptationSet[@contentType="video"]', namespaces=namespaces):
    print(etree.tostring(r, encoding="unicode"))
    for bandwidth in r.xpath('//ns:Representation/@bandwidth', namespaces=namespaces):
        print(bandwidth)

My problem is that the second loop does not use the result of the previous XPath query but searches the complete tree again! That's why the results include the audio representations as well. In detail, the output looks like this:

<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000" minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true" frameRate="25" mimeType="video/mp4" startWithSAP="1">
            <Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E" scanType="progressive">
            </Representation>
            <Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E" scanType="progressive">
            </Representation>
            <Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3" codecs="avc1.4D4028" scanType="progressive">
            </Representation>
            <Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028" scanType="progressive">
            </Representation>
        </AdaptationSet>

157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000" minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33" frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
            <Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
            </Representation>
        </AdaptationSet>

157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000

So even though the right AdaptationSets are found, the complete tree is processed in both iterations. I know I could build a single XPath that just gets the bandwidth, but I need the AdaptationSet first and would like to use only the result of the first loop in the second. How can I do this?

Answer 1

You have to add a dot (.) at the beginning of your XPath to make it relative to the current context node, which in this case is referenced by the variable r:

r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces)

This behavior is described in the XPath 1.0 documentation as follows:

//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

.//para selects the para element descendants of the context node
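
To see the difference concretely, here is a minimal sketch, assuming a cut-down stand-in for the MPD from the question (the inline XML and the variable names video, absolute and relative are illustrative, not part of the original file or code):

```python
from lxml import etree

# Cut-down stand-in for the MPD in the question (illustrative data only).
XML = b"""<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation bandwidth="157405"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="501000"/>
      <Representation bandwidth="851000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

namespaces = {'ns': 'urn:mpeg:dash:schema:mpd:2011'}
root = etree.fromstring(XML)

# Pick the single video AdaptationSet as the context node.
video = root.xpath('//ns:AdaptationSet[@contentType="video"]',
                   namespaces=namespaces)[0]

# '//' is absolute: it searches from the document root,
# no matter which element xpath() is called on.
absolute = video.xpath('//ns:Representation/@bandwidth', namespaces=namespaces)

# './/' is relative: it searches only the descendants of the context node.
relative = video.xpath('.//ns:Representation/@bandwidth', namespaces=namespaces)

print(absolute)  # ['157405', '501000', '851000'] (audio included)
print(relative)  # ['501000', '851000'] (video only)
```

The absolute query still returns the audio bandwidth even though it is called on the video element, which is exactly the effect seen in the question.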

Answer 2

Try using a relative XPath:

for bandwidth in r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces):

The leading . makes the XPath start at the current element. If the . is not specified, as you observed, the XPath queries from the root node.
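
Putting the relative query back into the nested loop from the question might look like the sketch below. The helper name video_bandwidths, the constant NS and the inline sample data are assumptions made for illustration; with the real file you would parse Data.xml with etree.parse instead.

```python
from lxml import etree

NS = {'ns': 'urn:mpeg:dash:schema:mpd:2011'}

# Illustrative stand-in for Data.xml: one audio and two video AdaptationSets.
XML = b"""<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation bandwidth="157405"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="501000"/>
      <Representation bandwidth="851000"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation bandwidth="1901000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def video_bandwidths(root):
    """Return one list of integer bandwidths per video AdaptationSet."""
    result = []
    for adaptation_set in root.xpath(
            '//ns:AdaptationSet[@contentType="video"]', namespaces=NS):
        # The leading '.' keeps the search inside this AdaptationSet.
        values = adaptation_set.xpath(
            './/ns:Representation/@bandwidth', namespaces=NS)
        result.append([int(v) for v in values])
    return result


root = etree.fromstring(XML)
print(video_bandwidths(root))  # [[501000, 851000], [1901000]]
```

Each inner list now contains only the bandwidths of its own AdaptationSet, and the audio value no longer appears.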


Answer 1: 1 vote, accepted, 2015-07-29 08:08:30

Answer 2: 1 vote, 2015-07-29 08:08:40
