Given an ordered or unordered list, I have to check whether special HTML arrows are being used (and replace them with LaTeX arrows). Using lxml.html is a requirement.
I was tinkering around, but I couldn't get past the following:
import lxml.html

my_string = "<li>I have a dream &#8594; Hello!</li>"
elem = lxml.html.fromstring(my_string)

if "&#8594;" in my_string:  # True
    print("foo")
if "&#8594;" in elem.text:  # False
    print("bar")
I don't understand why the second condition evaluates to False. How can I check whether &#8594; ("→") occurs in elem.text?
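lxml decodes the character reference &#8594; while parsing, so elem.text contains the actual character rather than the reference. You need to search for the Unicode representation of → (U+2192):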
>>> s = u"→"
>>> s
u'\u2192'
>>> import lxml.html
>>>
>>> my_string = "<li>I have a dream &#8594; Hello!</li>"
>>> elem = lxml.html.fromstring(my_string)
>>>
>>> if u'\u2192' in elem.text:
...     print("bar")
...
bar
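Why the original check fails can be seen without lxml. The following is a minimal sketch, assuming Python 3, that uses the standard library's html.unescape as a stand-in for the decoding an HTML parser performs on character references; it is not what lxml calls internally.

```python
import html

raw = "<li>I have a dream &#8594; Hello!</li>"

# Decode character references, as an HTML parser does for text nodes.
decoded = html.unescape(raw)

print("&#8594;" in raw)       # True: the reference is in the raw markup
print("&#8594;" in decoded)   # False: it no longer exists after decoding
print("\u2192" in decoded)    # True: the decoded text holds U+2192 instead
```

So any test against parsed text has to look for the character itself, not for the reference that appeared in the source.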
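...and if you want to replace the character, use re.sub on the parsed text. For example, to substitute the LaTeX command \rightarrow (the backslash must be doubled in the replacement string, because re.sub interprets escapes there):
import re
elem.text = re.sub(u'\u2192', r'\\rightarrow', elem.text)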
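Since the question mentions several kinds of arrows and whole lists, the replacement can be generalised. This is only a sketch, assuming Python 3: the ARROWS table and the names arrows_to_latex and replace_arrows_in_tree are hypothetical, and the mapping covers just four arrows. The tree helper relies only on .iter(), .text and .tail, which lxml elements provide.

```python
import re

# Hypothetical mapping from arrow characters to LaTeX commands; extend as needed.
ARROWS = {
    "\u2192": r"\rightarrow",
    "\u2190": r"\leftarrow",
    "\u21d2": r"\Rightarrow",
    "\u21d0": r"\Leftarrow",
}
_ARROW_RE = re.compile("[" + "".join(map(re.escape, ARROWS)) + "]")


def arrows_to_latex(text):
    """Return text with known arrow characters replaced by LaTeX commands."""
    if text is None:  # elements may have no text or tail
        return None
    # A function as the replacement avoids escape processing of the backslash.
    return _ARROW_RE.sub(lambda m: ARROWS[m.group(0)], text)


def replace_arrows_in_tree(root):
    """Rewrite .text and .tail of every node under root, in place."""
    for node in root.iter():
        node.text = arrows_to_latex(node.text)
        node.tail = arrows_to_latex(node.tail)


print(arrows_to_latex("I have a dream \u2192 Hello!"))
```

With lxml, one would call replace_arrows_in_tree(elem) on the element returned by lxml.html.fromstring. Handling .tail matters because text that follows a nested tag inside an li is stored on that child's tail, not on the li's text. The LaTeX commands still need to be placed in math mode in the final document.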