Get list of attributes for a tag using xml.dom.minidom?

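I'm trying to parse an SVG file string with minidom and extract all the path tags. That part works without any problem. What I want to do now is get a list of all the attributes a path tag contains. I could easily write my own parser with regular expressions, but I would rather use something more reliable than my own spaghetti code. I tried calling path._get_attributes(), but that raises a KeyError. Here is my code so far.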

from xml.dom import minidom

svg_string = '''<?xml version='1.0' encoding='iso-8859-1'?>
<svg version='1.1' baseProfile='full'
              xmlns='http://www.w3.org/2000/svg'
                      xmlns:rdkit='http://www.rdkit.org/xml'
                      xmlns:xlink='http://www.w3.org/1999/xlink'
                  xml:space='preserve'
width='300px' height='300px' viewBox='0 0 300 300'>
<!-- END OF HEADER -->
<rect style='opacity:1.0;fill:#FFFFFF;stroke:none' width='300.0' height='300.0' x='0.0' y='0.0'> </rect>
<path class='bond-0 atom-0 atom-1' d='M 49.1,144.6 L 71.8,157.7' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='bond-0 atom-0 atom-1' d='M 71.8,157.7 L 94.5,170.8' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='bond-1 atom-1 atom-2' d='M 94.5,170.8 L 150.5,138.5' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='bond-2 atom-2 atom-3' d='M 150.5,138.5 L 206.4,170.8' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='bond-3 atom-3 atom-4' d='M 206.4,170.8 L 229.1,157.7' style='fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='bond-3 atom-3 atom-4' d='M 229.1,157.7 L 251.8,144.6' style='fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1' />
<path class='atom-0' d='M 13.6 129.4
L 16.1 129.4
L 16.1 137.2
L 25.5 137.2
L 25.5 129.4
L 28.0 129.4
L 28.0 147.7
L 25.5 147.7
L 25.5 139.3
L 16.1 139.3
L 16.1 147.7
L 13.6 147.7
L 13.6 129.4
' fill='#FF0000'/>
<path class='atom-0' d='M 30.2 138.5
Q 30.2 134.1, 32.3 131.7
Q 34.5 129.2, 38.5 129.2
Q 42.6 129.2, 44.8 131.7
Q 46.9 134.1, 46.9 138.5
Q 46.9 143.0, 44.8 145.5
Q 42.6 148.0, 38.5 148.0
Q 34.5 148.0, 32.3 145.5
Q 30.2 143.0, 30.2 138.5
M 38.5 145.9
Q 41.3 145.9, 42.8 144.1
Q 44.4 142.2, 44.4 138.5
Q 44.4 134.9, 42.8 133.1
Q 41.3 131.3, 38.5 131.3
Q 35.8 131.3, 34.2 133.1
Q 32.7 134.9, 32.7 138.5
Q 32.7 142.2, 34.2 144.1
Q 35.8 145.9, 38.5 145.9
' fill='#FF0000'/>
<path class='atom-4' d='M 254.0 138.5
Q 254.0 134.1, 256.1 131.7
Q 258.3 129.2, 262.4 129.2
Q 266.4 129.2, 268.6 131.7
Q 270.8 134.1, 270.8 138.5
Q 270.8 143.0, 268.6 145.5
Q 266.4 148.0, 262.4 148.0
Q 258.3 148.0, 256.1 145.5
Q 254.0 143.0, 254.0 138.5
M 262.4 145.9
Q 265.1 145.9, 266.6 144.1
Q 268.2 142.2, 268.2 138.5
Q 268.2 134.9, 266.6 133.1
Q 265.1 131.3, 262.4 131.3
Q 259.6 131.3, 258.0 133.1
Q 256.5 134.9, 256.5 138.5
Q 256.5 142.2, 258.0 144.1
Q 259.6 145.9, 262.4 145.9
' fill='#FF0000'/>
<path class='atom-4' d='M 272.0 129.4
L 274.5 129.4
L 274.5 137.2
L 283.9 137.2
L 283.9 129.4
L 286.4 129.4
L 286.4 147.7
L 283.9 147.7
L 283.9 139.3
L 274.5 139.3
L 274.5 147.7
L 272.0 147.7
L 272.0 129.4
' fill='#FF0000'/>
</svg>'''

def parse_svg(svg_string):

    '''Gets all the paths and their attributes from an svg string.'''

    doc = minidom.parseString(svg_string)  
    paths = [path for path in doc.getElementsByTagName('path')]
    # this is where I want to make a list comprehension to get a list of attributes 
    # for all the attributes a path contains.

    doc.unlink()


parse_svg(svg_string)

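You can access the attributes through the element's "attributes" property, which behaves like a map/dict, and call its "items()" method to get all the key/value pairs. So your code should probably look like this: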

def parse_svg(svg_string):
    '''Gets all the paths and their attributes from an svg string.'''

    doc = minidom.parseString(svg_string)
    paths = doc.getElementsByTagName('path')
    for path in paths:
        # path.attributes is a NamedNodeMap; items() yields (name, value) pairs
        attrs = dict(path.attributes.items())
        print(attrs)  # do whatever you want with the attrs; here I just print them

    doc.unlink()

The printed result looks like this:

{'class': 'bond-0 atom-0 atom-1', 'd': 'M 49.1,144.6 L 71.8,157.7', 'style': 'fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'bond-0 atom-0 atom-1', 'd': 'M 71.8,157.7 L 94.5,170.8', 'style': 'fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'bond-1 atom-1 atom-2', 'd': 'M 94.5,170.8 L 150.5,138.5', 'style': 'fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'bond-2 atom-2 atom-3', 'd': 'M 150.5,138.5 L 206.4,170.8', 'style': 'fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'bond-3 atom-3 atom-4', 'd': 'M 206.4,170.8 L 229.1,157.7', 'style': 'fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'bond-3 atom-3 atom-4', 'd': 'M 229.1,157.7 L 251.8,144.6', 'style': 'fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1'}
{'class': 'atom-0', 'd': 'M 13.6 129.4 L 16.1 129.4 L 16.1 137.2 L 25.5 137.2 L 25.5 129.4 L 28.0 129.4 L 28.0 147.7 L 25.5 147.7 L 25.5 139.3 L 16.1 139.3 L 16.1 147.7 L 13.6 147.7 L 13.6 129.4 ', 'fill': '#FF0000'}
{'class': 'atom-0', 'd': 'M 30.2 138.5 Q 30.2 134.1, 32.3 131.7 Q 34.5 129.2, 38.5 129.2 Q 42.6 129.2, 44.8 131.7 Q 46.9 134.1, 46.9 138.5 Q 46.9 143.0, 44.8 145.5 Q 42.6 148.0, 38.5 148.0 Q 34.5 148.0, 32.3 145.5 Q 30.2 143.0, 30.2 138.5 M 38.5 145.9 Q 41.3 145.9, 42.8 144.1 Q 44.4 142.2, 44.4 138.5 Q 44.4 134.9, 42.8 133.1 Q 41.3 131.3, 38.5 131.3 Q 35.8 131.3, 34.2 133.1 Q 32.7 134.9, 32.7 138.5 Q 32.7 142.2, 34.2 144.1 Q 35.8 145.9, 38.5 145.9 ', 'fill': '#FF0000'}
{'class': 'atom-4', 'd': 'M 254.0 138.5 Q 254.0 134.1, 256.1 131.7 Q 258.3 129.2, 262.4 129.2 Q 266.4 129.2, 268.6 131.7 Q 270.8 134.1, 270.8 138.5 Q 270.8 143.0, 268.6 145.5 Q 266.4 148.0, 262.4 148.0 Q 258.3 148.0, 256.1 145.5 Q 254.0 143.0, 254.0 138.5 M 262.4 145.9 Q 265.1 145.9, 266.6 144.1 Q 268.2 142.2, 268.2 138.5 Q 268.2 134.9, 266.6 133.1 Q 265.1 131.3, 262.4 131.3 Q 259.6 131.3, 258.0 133.1 Q 256.5 134.9, 256.5 138.5 Q 256.5 142.2, 258.0 144.1 Q 259.6 145.9, 262.4 145.9 ', 'fill': '#FF0000'}
{'class': 'atom-4', 'd': 'M 272.0 129.4 L 274.5 129.4 L 274.5 137.2 L 283.9 137.2 L 283.9 129.4 L 286.4 129.4 L 286.4 147.7 L 283.9 147.7 L 283.9 139.3 L 274.5 139.3 L 274.5 147.7 L 272.0 147.7 L 272.0 129.4 ', 'fill': '#FF0000'}
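
The question asked for a list comprehension rather than a loop that prints. As a minimal sketch of that variant, the same attributes.items() call can build one dict per path and return the whole list. The helper name get_path_attributes and the short sample string are made up for this illustration; they are not part of the original question or answer.

```python
from xml.dom import minidom


def get_path_attributes(svg_string):
    '''Return one {name: value} dict per <path> element (hypothetical helper).'''
    doc = minidom.parseString(svg_string)
    try:
        # Copy the attributes into plain dicts before unlink(), so the
        # returned list does not hold on to any DOM nodes.
        return [dict(path.attributes.items())
                for path in doc.getElementsByTagName('path')]
    finally:
        doc.unlink()


# A small stand-in document, not the SVG from the question.
sample = '''<svg xmlns='http://www.w3.org/2000/svg'>
<path class='atom-0' d='M 0 0 L 1 1' fill='#FF0000'/>
<path d='M 1 1 L 2 2'/>
</svg>'''

all_attrs = get_path_attributes(sample)

# If only the attribute names are needed, the dict keys are enough.
attr_names = [sorted(attrs) for attrs in all_attrs]

print(all_attrs)
print(attr_names)
```

Returning plain dicts instead of the minidom nodes means the caller can keep using the result after doc.unlink() has run.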

