Case-insensitive XML and Python

Question

I have this piece of code and I am trying to read the 'href' attribute of all 'ref' tags. I am not sure how to make this case insensitive, as some of my XML files have REF, Ref, or ref. Any suggestions?

    f = urllib.urlopen(url)
    tree = ET.parse(f)
    root = tree.getroot()

    for child in root.iter('ref'):
      t = child.get('href')
      if t not in self.href:
        self.href.append(t)
        print self.href[-1]

Answer 1

You can normalize tags and attributes by converting them to lowercase, using the functions below as a preprocessing step:

import urllib
import xml.etree.ElementTree as ET

f = urllib.urlopen(url)  # Python 2; on Python 3 use urllib.request.urlopen
tree = ET.parse(f)
root = tree.getroot()

def normalize_tags(root):
    root.tag = root.tag.lower()
    for child in root:
        normalize_tags(child)

def normalize_attr(root):
    # Iterate over a copy, because the attribute dict is modified inside the loop.
    for attr,value in list(root.attrib.items()):
        norm_attr = attr.lower()
        if norm_attr != attr:
            root.set(norm_attr,value)
            root.attrib.pop(attr)

    for child in root:
        normalize_attr(child)


normalize_tags(root)    
normalize_attr(root)
print(ET.tostring(root))
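
To see the effect of this kind of normalization on a small document, here is a self-contained sketch. It assumes Python 3, and the inline sample XML and the single `normalize` helper are made up for illustration rather than taken from the question:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with mixed-case tags and attributes.
SAMPLE = '<Root><REF HREF="a.html"/><Ref Href="b.html"/><ref href="c.html"/></Root>'

def normalize(elem):
    """Lowercase the tag and attribute names of elem and all its descendants, in place."""
    for node in elem.iter():
        node.tag = node.tag.lower()
        # Build a new dict first so the attribute mapping is not mutated while iterating.
        lowered = {name.lower(): value for name, value in node.attrib.items()}
        node.attrib.clear()
        node.attrib.update(lowered)

root = ET.fromstring(SAMPLE)
normalize(root)

# After normalization, the plain lowercase lookups from the question work.
hrefs = [ref.get('href') for ref in root.iter('ref')]
print(hrefs)  # ['a.html', 'b.html', 'c.html']
```

Note that lowercasing `tag` also lowercases any namespace URI embedded in it (ElementTree stores namespaced tags as `{uri}name`), which may or may not be what you want.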

Answer 2

The following should help:

f = urllib.urlopen(url)
tree = ET.parse(f)
root = tree.getroot()

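# root.iter() walks the whole tree, like root.iter('ref') in the question
for child in root.iter():
  if child.tag.lower() == 'ref':
    t = child.get('href')  # elements have .get() and .attrib, not .attribute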
    if t not in self.href:
      self.href.append(t)
      print self.href[-1]
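
The same comparison can be extended to attribute names, which may vary in case as well (HREF, Href). A minimal sketch, assuming Python 3 and a made-up sample document; `find_attr_values` is a hypothetical helper name, not part of ElementTree:

```python
import xml.etree.ElementTree as ET

def find_attr_values(root, tag, attr):
    """Collect unique values of `attr` on `tag` elements, ignoring case in both names."""
    tag, attr = tag.lower(), attr.lower()
    seen = []
    for elem in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(elem.tag, str) or elem.tag.lower() != tag:
            continue
        for name, value in elem.attrib.items():
            if name.lower() == attr and value not in seen:
                seen.append(value)
    return seen

# Hypothetical sample: mixed-case names, one nested level, one duplicate link.
sample = ET.fromstring(
    '<doc><REF HREF="a.html"/>'
    '<sec><Ref href="b.html"/><ref Href="a.html"/></sec>'
    '<other href="x.html"/></doc>'
)
links = find_attr_values(sample, 'ref', 'href')
print(links)  # ['a.html', 'b.html']
```

This leaves the tree untouched, which may be preferable when the document has to be written back out with its original casing.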

Answer 3

If you are using lxml, then one option is to use XPath with regular expressions through the EXSLT extensions ( https://stackoverflow.com/a/2756994/2997179 ):

root.xpath(".//*[re:test(local-name(), '(?i)^ref$')]",
    namespaces={"re": "http://exslt.org/regular-expressions"})

3 answers

Answer 1 | 3 | 2016-03-01 12:55:16
Answer 2 | 0 | 2016-03-01 11:54:16
Answer 3 | 0 | 2016-03-01 12:16:34
