
How to get a part of a string using a start character and an end character in Python

I have a string containing HTML, and I would like to parse it using xmlparser. The problem is that some tags in my string are not well formed, especially the <img> tags: they are missing the final / . I would like to find every img tag and add a / at the end. To do that, I need to find each <img in my text, scan to the next > , and replace that > with /> so that the string can be parsed.
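The substitution described above could be sketched with a regular expression. This is only an illustration under stated assumptions: the helper name close_img_tags and the sample HTML are made up, xml.etree.ElementTree stands in for "xmlparser", and the pattern assumes no > character appears inside an attribute value.

```python
import re
import xml.etree.ElementTree as ET

# "<img", then everything up to the next ">" (captured lazily), then an
# optional existing "/" so that already-closed tags are not given a second one.
# Assumption: attribute values never contain a literal ">".
IMG_TAG = re.compile(r"<img\b([^>]*?)\s*/?>", re.IGNORECASE)


def close_img_tags(html: str) -> str:
    """Rewrite every <img ...> or <img ... /> as a self-closed <img .../>."""
    return IMG_TAG.sub(r"<img\1/>", html)


# Hypothetical input: one unclosed img tag and one that is already closed.
sample = '<p>Logo: <img src="logo.png" alt="Logo"> and <img src="icon.png" /></p>'
fixed = close_img_tags(sample)
print(fixed)

# The repaired string now parses as XML.
root = ET.fromstring(fixed)
print([img.get("src") for img in root.findall("img")])
```

This only repairs img tags. Other unclosed tags such as <br> or <hr>, or wrongly nested markup, would still make the XML parser fail.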

Can anyone help me?

Thanks

You are asking for all kinds of trouble. Try a library that is better suited to the task. It looks like BeautifulSoup may be what you want.

If you are dead set on using xmlparser, then you might want to use BeautifulSoup to clean up the HTML first. See: How do I fix wrongly nested / unclosed HTML tags?
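A minimal sketch of that clean-up step follows. It assumes the third-party bs4 package is installed and, again, uses xml.etree.ElementTree in place of "xmlparser"; the helper name clean_html and the sample HTML are made up for illustration.

```python
import xml.etree.ElementTree as ET

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def clean_html(html: str) -> str:
    """Parse lenient HTML and serialize it back out with void tags closed."""
    # "html.parser" is the parser from the standard library that
    # BeautifulSoup can drive without extra dependencies.
    soup = BeautifulSoup(html, "html.parser")
    return str(soup)


# Hypothetical input with an unclosed img tag.
sample = '<div><p>Logo: <img src="logo.png" alt="Logo"></p></div>'
cleaned = clean_html(sample)

# The cleaned markup can now be handed to an XML parser.
root = ET.fromstring(cleaned)
img = root.find(".//img")
print(img.get("src"))
```

ET.fromstring expects a single root element, so an HTML fragment with several top-level tags would need to be wrapped in one container element first.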

