简体   繁体   English

使用正则表达式修改XML

[英]Using regex to modify XML

I have to replace some 'file names' from within an XML with the full path for those files. 我必须用这些文件的完整路径替换XML中的一些“文件名”。 All the files are in the same directory which simplifies things somewhat. 所有文件都在同一目录中,这在某种程度上简化了事情。 I was trying to use BeautifulSoup4 but there was a bug that kept breaking crashing python so I'm trying to do the same thing with regex's. 我试图使用BeautifulSoup4,但是有一个bug不断破坏崩溃的python,所以我试图对regex做同样的事情。

The copasiML variable contains an XML as a string. copasiML变量包含XML作为字符串。

My code: 我的代码:

    copasiML=IA.read_copasiML_as_string(copasi_file)
    data_file_names=re.findall('<Parameter name="File Name" type="file" value="(.*)"/>',copasiML)
    for i in data_file_names:
        copasiML2=re.sub('<Parameter name="File Name" type="file" value="'+i+'"/>','<Parameter name="File Name" type="file" value="'+os.path.join(os.getcwd()+i)+'"/>',copasiML)
        os.remove(copasi_file)
        with open(copasi_file,'w') as f:
            f.write(str(copasiML2))

As it stands, my code runs but doesn't actually do anything. 就目前而言,我的代码可以运行,但是实际上什么也没做。 Would anybody happen to know how to fix my code? 有人会知道如何解决我的代码吗?

Many thanks 非常感谢

Every time you try to parse HTML/XML with Regular Expressions, you make baby Jesus cry. 每次您尝试使用正则表达式解析HTML / XML时,您都会使耶稣耶稣哭泣。

I BeautifulSoup doesn't work for you, I suggest XPath as illustrated in python lxml - modify attributes . 我BeautifulSoup不适用于您,我建议使用python lxml中所示的XPath- Modify attributes

Fun (but instructive) readings: 有趣(但很有启发性)的读物:

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM