I couldn't apply the solutions to similar questions I found here. I am using the following in Visual Studio Code to scrape a web page with Python and lxml:
[...]
tree = html.fromstring(browser.page_source)
data = tree.xpath('//tr[@title="something"]/td[2]/text()')
If I print(data), I get the list below. Is data a list?
['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']
My goal is to extract only the number from each string. I have read about a regex function, but I am not sure whether it is the solution:
replace($MyString, '[^0-9]', '')
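For comparison, a Python counterpart of that regex idea might look like the sketch below. It assumes the printed list from above as input and uses re.sub from the standard library. Note that the pattern '[^0-9]' would also delete the decimal point, so the sketch keeps '.' in the character class.

```python
import re

# Sample input copied from the printed output above (an assumption:
# the real list comes from tree.xpath(...)).
data = ['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']

# Remove every character that is neither a digit nor a dot.
cleaned = [re.sub(r'[^0-9.]', '', d) for d in data]
print(cleaned)

# With the original pattern, the dot is removed as well.
digits_only = [re.sub(r'[^0-9]', '', d) for d in data]
print(digits_only)
```

The first list keeps the values intact as strings, while the second turns '1.27' into '127', which is probably not what is wanted here.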
An easy method would be to use strip(). You can scrub the list by doing something like:
clean_data = [d.strip() for d in data]
which will give you:
['1.27', '1.81', '4.90', '2.07', '2.12']
If you want these as actual numbers, use float(d.strip()) instead; int() would fail on values that contain a decimal point.
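A minimal sketch of the numeric conversion, with the question's printed list as assumed input. It uses float(), since int('1.27') raises a ValueError:

```python
# Sample input copied from the question's printed output (an assumption:
# the real list comes from tree.xpath(...)).
data = ['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']

# strip() removes the surrounding whitespace and newlines,
# then float() parses the remaining decimal string.
numbers = [float(d.strip()) for d in data]
print(numbers)  # [1.27, 1.81, 4.9, 2.07, 2.12]
```

Converting to float drops trailing zeros when printed (4.90 becomes 4.9); keep the stripped strings if the exact text matters.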
Let's imagine that your output is stored in the variable x:
>>> print("\n".join([y.strip() for y in x]))
1.27
1.81
4.90
2.07
2.12
Would this help? Or do you need a list, in which case:
>>> print([y.strip() for y in x])
['1.27', '1.81', '4.90', '2.07', '2.12']
[UPDATE]
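As for the "Is data a list?" question: yes, the square brackets in the printed output show that data is a Python list of strings.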
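One way to check this directly is with type() and isinstance(). The sketch below uses a literal copy of the printed output as a stand-in for the real xpath() result, which is an assumption made to keep the example self-contained:

```python
# Stand-in for the value returned by tree.xpath(...) in the question.
data = ['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']

# Check the container type and the type of each element.
is_list = isinstance(data, list)
all_strings = all(isinstance(d, str) for d in data)

print(type(data).__name__)  # list
print(is_list, all_strings)  # True True
```

Running the same two checks on the real data variable shows whether list operations such as a list comprehension apply to it.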