Modifying XML using regular expressions

Question

I have to replace some file names inside an XML document with the full paths of those files. All the files are in the same directory, which simplifies things somewhat. I was trying to use BeautifulSoup4, but a bug kept crashing Python, so I'm trying to do the same thing with regexes.

The copasiML variable contains the XML as a string.

My code:

    copasiML=IA.read_copasiML_as_string(copasi_file)
    data_file_names=re.findall('<Parameter name="File Name" type="file" value="(.*)"/>',copasiML)
    for i in data_file_names:
        copasiML2=re.sub('<Parameter name="File Name" type="file" value="'+i+'"/>','<Parameter name="File Name" type="file" value="'+os.path.join(os.getcwd()+i)+'"/>',copasiML)
        os.remove(copasi_file)
        with open(copasi_file,'w') as f:
            f.write(str(copasiML2))

As it stands, my code runs but doesn't actually change anything. Would anybody happen to know how to fix it?

Many thanks

Answer 1

Every time you try to parse HTML/XML with regular expressions, you make baby Jesus cry.

If BeautifulSoup doesn't work for you, I suggest XPath, as illustrated in "python lxml - modify attributes".
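To make the parser-based suggestion concrete, here is a minimal sketch of the attribute rewrite. It is not the code from the linked lxml answer: it swaps in the standard library's xml.etree.ElementTree so that it runs without extra installs. The function name `absolutize_file_names`, the sample XML, and the `base/dir` directory are made up for illustration, and the real COPASI file may differ (for example, it may declare a namespace, which is why the sketch compares local tag names only).

```python
import os
import xml.etree.ElementTree as ET


def absolutize_file_names(xml_text, base_dir):
    """Return xml_text with every 'File Name' parameter value joined onto base_dir.

    Hypothetical helper for illustration; not part of COPASI or any library.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Compare the local name only, so the match works with or without a namespace.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if (local_name == "Parameter"
                and elem.get("name") == "File Name"
                and elem.get("type") == "file"):
            # os.path.join takes the parts as separate arguments.
            elem.set("value", os.path.join(base_dir, elem.get("value", "")))
    return ET.tostring(root, encoding="unicode")


# Made-up sample input; only the Parameter attributes mirror the question.
sample = (
    '<Task>'
    '<Parameter name="File Name" type="file" value="data1.txt"/>'
    '<Parameter name="File Name" type="file" value="data2.txt"/>'
    '<Parameter name="Weight" type="float" value="1.0"/>'
    '</Task>'
)
result = absolutize_file_names(sample, os.path.join("base", "dir"))
```

In the question's setting you would presumably pass `os.getcwd()` as `base_dir` and write the returned string back to the file once, after all parameters have been updated. Note that ElementTree does not preserve the XML declaration here and may rename namespace prefixes when serializing, so lxml might still be the better fit if the output has to stay close to the original.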

Fun (but instructive) readings:

RegEx match open tags except XHTML self-contained tags
Using regular expressions to parse HTML: why not?
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
https://news.ycombinator.com/item?id=2741780
