How do you extract the floats from the elements in a Python list?
I am using BeautifulSoup4 to build a script that does financial calculations. I have successfully extracted data to a list, but I only need the float numbers from the elements.
For example:
Volume = soup.find_all('td', {'class':'text-success'})
print (Volume)
This gives me the following list output:
[<td class="text-success">+1.3 LTC</td>, <td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>, <td class="text-success">+1.3 LTC</td>]
I want it to become:
[1.3, 5.49, 1.3]
How can I do this?
Thank you so much for reading my post. I greatly appreciate any help I can get.
You can find the first text node inside every td, split it by space, take the first item, and convert it to a float via float(). The leading + is handled automatically:
from bs4 import BeautifulSoup
data = """
<table>
<tr>
<td class="text-success">+1.3 LTC</td>
<td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>
<td class="text-success">+1.3 LTC</td>
</tr>
</table>"""
soup = BeautifulSoup(data, "html.parser")
print([
    float(td.find(text=True).split(" ", 1)[0])
    for td in soup.find_all('td', {'class': 'text-success'})
])
This prints [1.3, 5.49, 1.3].
Note how find(text=True) helps to avoid extracting the 340788 in the second td: it returns only the first text node, +5.49, which comes before the nested span.
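The split-and-convert step can also be tried on its own, without BeautifulSoup. This is only a sketch: first_text_nodes is a hand-written stand-in for the first text node of each td, and parse_amount is an illustrative helper name, not part of any library. It also strips surrounding whitespace first, which the one-liner above does not do.

```python
def parse_amount(text):
    """Return the leading number of a string such as '+1.3 LTC' as a float."""
    # Drop surrounding whitespace, then keep the part before the first space.
    token = text.strip().split(" ", 1)[0]
    # float() accepts an explicit sign, so '+1.3' converts to 1.3.
    return float(token)


# Hypothetical stand-ins for the first text node found in each td.
first_text_nodes = ["+1.3 LTC", "+5.49", "+1.3 LTC"]

amounts = [parse_amount(text) for text in first_text_nodes]
print(amounts)  # [1.3, 5.49, 1.3]
```

If a cell does not start with a number, float() raises ValueError, so you may want to catch that when the table can contain other content.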
You can use a regular expression:
>>> import re
>>> re.findall(r"\d+\.\d+", yourString)
['1.3', '5.49', '1.3']
>>>
Then, to convert to floats:
>>> [float(x) for x in re.findall(r"\d+\.\d+", yourString)]
[1.3, 5.49, 1.3]
>>>
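The pattern above drops the sign, which matters if some amounts can be negative. Here is a minimal sketch of a signed variant, assuming the markup is available as a single string. The your_string value is made up for illustration, with the last amount changed to a negative one.

```python
import re

# Hypothetical input: the question's markup as one string, with the last
# amount changed to a negative value for illustration.
your_string = (
    '<td class="text-success">+1.3 LTC</td>'
    '<td class="text-success">+5.49<span class="muteds">340788</span> LTC</td>'
    '<td class="text-success">-0.75 LTC</td>'
)

# Optional sign, then digits, a dot, and more digits.
SIGNED_DECIMAL = re.compile(r"[+-]?\d+\.\d+")

signed = [float(match) for match in SIGNED_DECIMAL.findall(your_string)]
print(signed)  # [1.3, 5.49, -0.75]
```

Like the original pattern, this skips 340788 only because it has no decimal point. For the same reason, a whole-number amount such as "+2 LTC" would be missed.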