
Python: scraping Font Awesome icons (fa icons) in a page

I'm trying to extract data from a website that has some Font Awesome icons like this one: <i class="fa fa-check-square green-icon font-095"></i>

There are mostly two types of icons, meaning "correct" or "wrong". I want to extract them as 1 and 0 (1 if correct, else 0).

Are there any suggestions on how I can extract this type of data?

To extract this data you can use the BeautifulSoup and Requests libraries. It would look something like this...

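import requests
from bs4 import BeautifulSoup

# requests needs the scheme (https://) in the URL
r = requests.get("https://www.website-you-want.com")
soup = BeautifulSoup(r.text, 'lxml')  # the 'lxml' parser must be installed
# collect every <i> tag on the page
rows = soup.find_all('i')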

This should get you every occurrence of the i tag on the page. If you want to be more specific, you could do something along the lines of...

rows = soup.find_all('i', {'class': 'green-icon'})

This should get you every occurrence of the i tag with the green-icon class.
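
To get the 1/0 values the question asks for, one option is to check each icon's class list. The snippet below is a minimal sketch, not a definitive solution. It runs on an inline HTML sample and assumes the "correct" icon always carries the fa-check-square class shown in the question. The fa-times-circle red-icon markup for the "wrong" icon and the helper name icon_to_flag are made up for illustration, so check the real page's markup first.

```python
from bs4 import BeautifulSoup

# Sample markup standing in for the real page. The first icon is copied from
# the question; the "wrong" icon classes are an assumption for illustration.
SAMPLE_HTML = """
<table>
  <tr><td>Feature A</td><td><i class="fa fa-check-square green-icon font-095"></i></td></tr>
  <tr><td>Feature B</td><td><i class="fa fa-times-circle red-icon font-095"></i></td></tr>
  <tr><td>Feature C</td><td><i class="fa fa-check-square green-icon font-095"></i></td></tr>
</table>
"""

def icon_to_flag(icon):
    """Return 1 if the <i> tag has the 'correct' class, otherwise 0."""
    # BeautifulSoup exposes the class attribute as a list of class names.
    classes = icon.get("class") or []
    return 1 if "fa-check-square" in classes else 0

# html.parser ships with Python; 'lxml' also works if it is installed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# class_="fa" keeps only Font Awesome icons and skips unrelated <i> tags.
flags = [icon_to_flag(icon) for icon in soup.find_all("i", class_="fa")]
print(flags)  # prints [1, 0, 1]
```

Treating anything that is not the "correct" icon as 0 keeps the check simple, but it also maps unknown icons to 0. If the page uses more than two icon types, an explicit lookup per class would be safer.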

NOTE: If the website loads its content dynamically, you will have to use Selenium together with BeautifulSoup. Let me know if that's the case and I can try to help with that.
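
For the dynamic case, one way to combine the two libraries is to let Selenium render the page and then hand the resulting HTML to BeautifulSoup. This is a rough sketch under assumptions: it needs Selenium and a local Chrome install, the function names and the URL are placeholders, and the green-icon check is taken from the question. Pages that load slowly may also need an explicit wait before reading page_source.

```python
from bs4 import BeautifulSoup

def fetch_rendered_html(url):
    """Load url in headless Chrome and return the HTML after scripts have run."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors

def count_green_icons(html):
    """Count <i> tags carrying the green-icon class in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all("i", class_="green-icon"))

# Usage (placeholder URL, requires Chrome):
# html = fetch_rendered_html("https://www.website-you-want.com")
# print(count_green_icons(html))
```

Keeping the browser step separate from the parsing step means the parsing logic can be tried on a saved HTML file before involving a browser at all.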
