How can I extract the content (how are you) from the string:
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">how are you</string>.
Can I use a regex for this? If so, what is a suitable regex?
Note: I don't want to use the split function to extract the result. Also, can you suggest some links for learning regex as a beginner?
I am using Python 2.7.2.
You could use a regular expression for this (as Joey demonstrates).
However, if your XML document is any bigger than this one-liner, a regex will not be reliable, since XML is not a regular language. A parser such as BeautifulSoup handles it instead:
>>> from BeautifulSoup import BeautifulSoup
>>> xml_as_str = '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">how are you</string>. '
>>> soup = BeautifulSoup(xml_as_str)
>>> print soup.text
how are you.
Or...
>>> for string_tag in soup.findAll('string'):
...     print string_tag.text
...
how are you
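As an alternative to BeautifulSoup, the same parser-based idea can be sketched with the standard library's xml.etree.ElementTree. This is a swapped-in technique, not the one used above, and it assumes the input is a well-formed XML fragment, so the trailing "." after the closing tag has been dropped here:

```python
import xml.etree.ElementTree as ET

# Assumption: the input is exactly one well-formed element (no trailing text).
xml_as_str = '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">how are you</string>'

root = ET.fromstring(xml_as_str)  # parse the fragment into an Element
text = root.text                  # text content directly inside <string>
print(text)
```

Note that ElementTree keeps the namespace in the tag name, so root.tag is the namespace URI in braces followed by "string".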
(?<=<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">)[^<]+(?=</string>)
would match what you want, as a trivial example.
(?<=>)[^<]+
would, too. It all depends on how exactly your input is formatted.
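For reference, here is a minimal sketch of how the first, fully spelled-out pattern might be applied with Python's re module. The variable names are illustrative, and it relies on the lookbehind being fixed-width, which Python requires:

```python
import re

xml_as_str = '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">how are you</string>.'

# The lookbehind and lookahead anchor the match between the tags
# without including the tags themselves in the result.
pattern = re.compile(
    r'(?<=<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">)'
    r'[^<]+'
    r'(?=</string>)'
)

match = pattern.search(xml_as_str)
content = match.group(0) if match else None  # None if the tag is absent
print(content)
```

Because the lookarounds consume nothing, group(0) is the content itself and no capture group is needed.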
Try the following regular expression:
/<[^>]*>(.*?)</
This matches the content of any tag. To match one specific tag, use the following (replace "string" with the tag you want to match):
/<string[^<]*>(.*?)<\/string>/i
(i = case-insensitive)
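The surrounding slashes and the trailing i are delimiter and flag syntax from other languages. In Python the pattern is passed as a plain string and the flag becomes re.IGNORECASE. A rough sketch, with illustrative variable names:

```python
import re

xml_as_str = '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">how are you</string>.'

# Generic form: capture whatever sits between the first tag and the next "</".
generic = re.search(r'<[^>]*>(.*?)</', xml_as_str)
generic_content = generic.group(1) if generic else None

# Tag-specific form; re.IGNORECASE plays the role of the /i flag.
specific = re.search(r'<string[^<]*>(.*?)</string>', xml_as_str, re.IGNORECASE)
specific_content = specific.group(1) if specific else None

print(generic_content)
print(specific_content)
```

Here the wanted text is in capture group 1, since group 0 also contains the tags.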