Get a specific value with BeautifulSoup (parsing)
I'm trying to extract information from a website using Python (BeautifulSoup).
I want to extract the following data (just the figures):
EPS (Basic)
from: https://www.marketwatch.com/investing/stock/aapl/financials/income/quarter
From the XML:
I wrote this code:
import pandas as pd
from bs4 import BeautifulSoup
import urllib.request as ur

url_is = 'https://www.marketwatch.com/investing/stock/aapl/financials/income/quarter'
read_data = ur.urlopen(url_is).read()
soup_is = BeautifulSoup(read_data, 'lxml')
# Each line item of the income statement is a <tr class="mainRow">
cells = soup_is.find_all('tr', {'class': 'mainRow'})
for cell in cells:
    # Prints the text of every row, not only EPS (Basic)
    print(cell.text)
But I'm not able to extract the figures for EPS (Basic).
Is there a way to extract just the data, sorted by column?
Try the following CSS selector, which checks whether the td tag contains the text EPS (Basic).
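from bs4 import BeautifulSoup
import urllib.request as ur

url_is = 'https://www.marketwatch.com/investing/stock/aapl/financials/income/quarter'
read_data = ur.urlopen(url_is).read()
soup_is = BeautifulSoup(read_data, 'lxml')
# Match the title cell whose text contains "EPS (Basic)"
# (newer soupsieve releases spell this pseudo-class :-soup-contains)
row = soup_is.select_one('tr.mainRow>td.rowTitle:contains("EPS (Basic)")')
# row is the <td>; its parent <tr> holds the figures
print([cell.text for cell in row.parent.select('td') if cell.text!=''])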
Output:
[' EPS (Basic)', '2.47', '2.20', '3.05', '5.04', '2.58']
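If you want to check the row lookup without hitting the site, here is a minimal offline sketch. The SAMPLE_HTML snippet and the row_values helper are made up for illustration: the markup only assumes the tr.mainRow / td.rowTitle classes used in the selector above, and the Sales/Revenue figures are placeholders. It filters the rows in plain Python instead of relying on the :contains pseudo-class, so it should not depend on the installed soupsieve version.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure assumed by the selector above.
SAMPLE_HTML = """
<table>
  <tr class="mainRow">
    <td class="rowTitle"> Sales/Revenue</td>
    <td>10</td><td>20</td><td></td>
  </tr>
  <tr class="mainRow">
    <td class="rowTitle"> EPS (Basic)</td>
    <td>2.47</td><td>2.20</td><td>3.05</td><td>5.04</td><td>2.58</td><td></td>
  </tr>
</table>
"""


def row_values(html, label):
    """Return the non-empty cell texts of the first mainRow whose title contains label."""
    soup = BeautifulSoup(html, "html.parser")
    for tr in soup.select("tr.mainRow"):
        title = tr.select_one("td.rowTitle")
        if title is not None and label in title.get_text():
            cells = [td.get_text(strip=True) for td in tr.select("td")]
            return [c for c in cells if c]  # drop empty padding cells
    return None  # no matching row


eps_row = row_values(SAMPLE_HTML, "EPS (Basic)")
print(eps_row)
```

Returning None when nothing matches avoids the AttributeError you would get from calling .parent on a failed select_one, which is what happens if the site changes its class names.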
To print it as a DataFrame:
import pandas as pd
from bs4 import BeautifulSoup
import urllib.request as ur
url_is = 'https://www.marketwatch.com/investing/stock/aapl/financials/income/quarter'
read_data = ur.urlopen(url_is).read()
soup_is=BeautifulSoup(read_data, 'lxml')
row = soup_is.select_one('tr.mainRow>td.rowTitle:contains("EPS (Basic)")')
data=[cell.text for cell in row.parent.select('td') if cell.text!='']
df=pd.DataFrame(data)
print(df.T)
Output:
              0     1     2     3     4     5
0   EPS (Basic)  2.47  2.20  3.05  5.04  2.58
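Note that the frame above still holds strings. If you want one numeric column per period, something along these lines might help. This is only a sketch: to_frame is a hypothetical helper, and the period_1 ... period_5 column names are placeholders, since the real quarter headers would have to be scraped from the table's header row.

```python
import pandas as pd


def to_frame(cells, periods):
    """Turn ['label', 'v1', 'v2', ...] into a one-row DataFrame of floats."""
    label = cells[0].strip()                 # ' EPS (Basic)' -> 'EPS (Basic)'
    values = [float(v) for v in cells[1:]]   # assumes plain decimal strings
    return pd.DataFrame([values], index=[label], columns=periods)


# The list printed by the selector code above.
data = [' EPS (Basic)', '2.47', '2.20', '3.05', '5.04', '2.58']
# Placeholder column names, not the real quarter labels.
periods = ['period_1', 'period_2', 'period_3', 'period_4', 'period_5']

df = to_frame(data, periods)
print(df)
```

With float values, each figure can be addressed by label, for example df.loc['EPS (Basic)', 'period_3'], and further rows can be appended with the same columns.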