Python - Find specific XML element using lxml

Background: I understand this question has been asked before, but I'm really struggling to find the right XPath expression. I'm working with the iTunes XML file, and the way Apple formatted it is really annoying. Instead of giving each field its own tag with the ID as its text, they only use generic tags like <key>, <dict>, and <some_value>. This makes it REALLY confusing to find the right element.

I've been trying to follow this question: "Python ElementTree: find element by its child's text using XPath", but it's just not working for me. I've been reading the lxml tutorial ( https://lxml.de/tutorial.html ), but I'm not finding the answer I need. I'm fairly certain XPath is the way to go, but I'll take any better suggestions.

I did try some iTunes-specific libraries. They all seem to be out of date, or just don't work the way I need. I was initially using ElementTree, but the getnext() method from lxml is a lifesaver given how Apple formats the file.

Example XML file:

<plist version="1.0">
<dict>
    <key>Library Persistent ID</key><string>6948B4402F0EEFFF</string>
    <key>Tracks</key>
    <dict>
        <key>18051</key>
        <dict>
            <key>Track ID</key><integer>18051</integer>
            <key>Size</key><integer>7930116</integer>
            <key>Total Time</key><integer>196336</integer>
            <key>BPM</key><integer>86</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:05Z</date>
            <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>
        <key>18053</key>
        <dict>
            <key>Track ID</key><integer>18053</integer>
            <key>Size</key><integer>9780560</integer>
            <key>Total Time</key><integer>243513</integer>
            <key>Year</key><integer>2010</integer>
            <key>BPM</key><integer>74</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:09Z</date>
            <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>
        <key>18055</key>
        <dict>
            <key>Track ID</key><integer>18055</integer>
            <key>Size</key><integer>12995663</integer>
            <key>Total Time</key><integer>323604</integer>
            <key>Year</key><integer>2005</integer>
            <key>BPM</key><integer>76</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:14Z</date>
            <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>

The Approach: Let's say I need to find the element with the ID '18053'. Instead of iterating over everything recursively (which I've figured out how to do), it would be much more efficient to look up the <key> whose text is that ID, and then get the <dict> element that follows it.

I've tried the following:

root = etree.parse(xml_file_path)
key_found = root.xpath("//key[text()='18053']")
element_wanted = key_found.getnext()

But I get an error, because key_found is a list, not an element.

Using find should return a single element, but with the following:

key_found = root.find("//key[text()='18053']")

I get the error "bad predicate".

Any help is appreciated. I've been working on this one for a few days now. Blame Apple! Thanks!

The xpath method returns a list here, so

element_wanted = key_found.getnext()

should be

element_wanted = key_found[0].getnext()

provided you also check that the list is not empty.
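
To make the fix concrete, here is a minimal sketch of the lookup with the empty-list check included. The inline XML is a trimmed copy of the sample from the question, and the helper names find_track_dict and dict_to_mapping are made up for illustration. Passing the ID as an XPath variable ($tid) is an optional variation; the literal "//key[text()='18053']" from the question behaves the same way.

```python
from lxml import etree

# Trimmed copy of the sample from the question, inlined so the sketch runs standalone.
XML = b"""<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>18051</key>
        <dict>
            <key>Track ID</key><integer>18051</integer>
            <key>BPM</key><integer>86</integer>
        </dict>
        <key>18053</key>
        <dict>
            <key>Track ID</key><integer>18053</integer>
            <key>Year</key><integer>2010</integer>
            <key>BPM</key><integer>74</integer>
        </dict>
    </dict>
</dict>
</plist>"""


def find_track_dict(root, track_id):
    """Return the <dict> following <key>track_id</key>, or None (hypothetical helper)."""
    # xpath() returns a list of matching elements, possibly empty.
    matches = root.xpath("//key[text()=$tid]", tid=track_id)
    if not matches:
        return None
    # getnext() gives the next sibling element, here the track's <dict>.
    return matches[0].getnext()


def dict_to_mapping(dict_el):
    """Pair each direct <key> child with the text of the element after it."""
    return {key.text: key.getnext().text for key in dict_el.findall("key")}


root = etree.fromstring(XML)
track = find_track_dict(root, "18053")
if track is not None:
    print(dict_to_mapping(track))
```

With a real library file, etree.parse(xml_file_path) from the question can replace etree.fromstring(XML); the object returned by parse also has an xpath method.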
