How to get a part of a string using a start character and an end character in Python
I have a string containing HTML, and I would like to parse it using xmlparser. The problem is that some tags in my string are not well-formed, especially the <img /> tags: they are missing the final /. I would like to retrieve all the img tags and add a / at the end. To do that, I need to find every <img in my text, read up to the next >, and replace that > with /> so that my string can be parsed.

Can anyone help me?

Thanks
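
The replacement described in the question can be sketched with the standard re module. This is only a minimal illustration: the helper name close_img_tags and the sample string are made up here, xml.etree.ElementTree stands in for the unspecified "xmlparser", and the pattern assumes that no attribute value contains a > character.

```python
import re
import xml.etree.ElementTree as ET

# "<img", then everything up to the next ">", whether or not the tag
# is already self-closed. Assumes no ">" inside attribute values.
IMG_TAG = re.compile(r"<img\b([^>]*?)\s*/?>", re.IGNORECASE)


def close_img_tags(html: str) -> str:
    """Rewrite every <img ...> tag as a self-closed <img ... /> tag."""
    return IMG_TAG.sub(lambda m: f"<img{m.group(1)} />", html)


# Hypothetical input: one unclosed img tag and one that is already closed.
html = '<p>Hi <img src="a.png"> and <img src="b.png" /></p>'
fixed = close_img_tags(html)
print(fixed)

# The result is now well-formed enough for an XML parser.
root = ET.fromstring(fixed)
```

This only repairs img tags. Other unclosed tags such as <br>, or wrongly nested elements, would still make the XML parser fail, which is why a regular expression is a fragile tool for this job.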
You are asking for all kinds of trouble. Try a library that is better suited to the task. It looks like BeautifulSoup may be what you want.
If you are dead set on using xmlparser, then you might want to use BeautifulSoup to clean up the HTML first. See: How do I fix wrongly nested / unclosed HTML tags?
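
A rough sketch of that clean-up step, assuming the third-party beautifulsoup4 package is installed and again using xml.etree.ElementTree as a stand-in for "xmlparser". The function name html_to_xml_text and the sample markup are hypothetical. BeautifulSoup parses the lenient HTML and, when converted back to a string, writes void elements such as img in self-closed form.

```python
import xml.etree.ElementTree as ET

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def html_to_xml_text(html: str) -> str:
    """Parse lenient HTML and serialize it back with closed tags."""
    soup = BeautifulSoup(html, "html.parser")
    return str(soup)


# Hypothetical input with unclosed <img> and <br> tags.
broken = '<div><p>Hello<img src="a.png"><br>world</p></div>'
cleaned = html_to_xml_text(broken)

# The cleaned markup has a single root element, so it can be parsed as XML.
root = ET.fromstring(cleaned)
print([img.get("src") for img in root.iter("img")])
```

Note that ET.fromstring expects exactly one root element, so a fragment with several top-level tags would need to be wrapped in a container element first. In many cases it may be simpler to query the BeautifulSoup tree directly and skip the XML parser altogether.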