Extracting numbers from a string with XPath and Python 3.6
I couldn't apply the solutions from similar questions I found here. I am using the following in Visual Studio Code to scrape a web page with Python and lxml:
[...]
tree = html.fromstring(browser.page_source)
data = tree.xpath('//tr[@title="something"]/td[2]/text()')
If I print(data), I get the list below. Is data a list?
['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']
My goal is to extract only the number from each string. I have read about a regex function, but I'm not sure whether it is the solution:
replace($MyString, '[^0-9]', '')
An easy method would be to use strip(). You can clean up the list by doing something like:
clean_data = [d.strip() for d in data]
which will give you:
['1.27', '1.81', '4.90', '2.07', '2.12']
If you want these as actual numbers, use float(d.strip()) instead. Note that int(d.strip()) would raise a ValueError here, because strings such as '1.27' contain a decimal point.
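Because the values contain decimal points, converting them to numbers calls for float rather than int. A minimal sketch, assuming data holds the strings shown in the question (the sample list and the name numbers are illustrative, not part of the original script):

```python
# Sample input copied from the question; in the real script this
# would be the result of tree.xpath(...).
data = ['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']

# strip() removes the surrounding whitespace and newlines,
# float() then parses the remaining decimal text.
numbers = [float(d.strip()) for d in data]

print(numbers)  # [1.27, 1.81, 4.9, 2.07, 2.12]
```

The trailing zero in 4.90 disappears once the value is a float; keep the stripped strings if the exact formatting matters.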
Let's imagine that your output is stored in the variable x:
>>> print("\n".join([y.strip() for y in x]))
1.27
1.81
4.90
2.07
2.12
Would this help? Or, if you need a list:
>>> print([y.strip() for y in x])
['1.27', '1.81', '4.90', '2.07', '2.12']
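On the regex idea from the question: a pattern like '[^0-9]' removes every non-digit, including the decimal point, so 1.27 would become 127. If the cells might contain extra text around the number, a regex that matches the number itself may be a better fit. The following is only a sketch using Python's re module; the pattern, the sample list, and the helper name extract_number are assumptions made for illustration:

```python
import re

# Sample input copied from the question.
x = ['\n 1.27\n ', '\n 1.81\n ', '\n 4.90\n ', '\n 2.07\n ', '\n 2.12\n ']

# Assumed number format: optional minus sign, digits, optional fractional part.
NUMBER = re.compile(r'-?\d+(?:\.\d+)?')


def extract_number(text):
    """Return the first number found in text as a float, or None if there is none."""
    match = NUMBER.search(text)
    return float(match.group()) if match else None


numbers = [extract_number(y) for y in x]
print(numbers)  # [1.27, 1.81, 4.9, 2.07, 2.12]
```

For input as clean as the list above, strip() is simpler; the regex version only pays off when the strings carry other characters besides whitespace.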
[UPDATE]
As for the question "Is data a list?"