New to web scraping in Python and can't figure out my error
import requests
from bs4 import BeautifulSoup
import pandas as pd

ticker = "TSLA"
url = "https://financialmodelingprep.com/financial-summary/" + ticker
request = requests.get(url)
print(request.text)

parser = BeautifulSoup(request.text, "html.parser")
news_html = parser.find_all('a', {'class': 'article-item'})
print(news_html[0])

sentiments = []
for i in range(0, len(news_html)):
    sentiments.append(
        {
            'ticker': ticker,
            'date': news_html[i].find('h5', {'class': 'article-date'}).text,
            'title': news_html[i].find('h4', {'class': 'article-title'}).text,
            'text': news_html[i].find('p', {'class': 'article-text'}).text
        }
    )

df = pd.DataFrame(sentiments)
df = df.set_index('date')
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Input In [6], in <cell line: 12>()
10 parser = BeautifulSoup(request.text, "html.parser")
11 news_html = parser.find_all('a', {'class': 'article-item'})
---> 12 print(news_html[10])
14 sentiments = []
15 for i in range(0, len(news_html)):
IndexError: list index out of range
I am trying to scrape data for sentiment analysis. There is a second part of the code that is supposed to calculate a sentiment score, but I cannot get past this error.
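An `IndexError: list index out of range` on `news_html[...]` means the list returned by `find_all` has fewer elements than the index asked for, and in the common case it is simply empty because nothing on the page matched the selector. The sketch below illustrates that behaviour on a small, hypothetical HTML snippet (the real markup of the site is an assumption here, not something taken from the page), and shows a guard before indexing.

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML standing in for the downloaded page;
# the class names are assumptions made for illustration only.
html = """
<div class="articles">
  <a class="article"><h5 class="article__date">2022-07-01</h5></a>
</div>
"""

parser = BeautifulSoup(html, "html.parser")

# Nothing carries the class "article-item", so find_all returns an empty list.
missing = parser.find_all("a", {"class": "article-item"})
print(len(missing))

# Indexing an empty result is what raises IndexError, so check first.
if missing:
    print(missing[0])
else:
    print("no elements matched; check the class names in the page source")

# A selector that does match the markup returns a non-empty list.
present = parser.find_all("a", {"class": "article"})
print(len(present))
```

Printing `len(news_html)` right after the `find_all` call is usually the quickest way to tell whether the selector is the problem.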
import requests
from bs4 import BeautifulSoup
import pandas as pd

ticker = "TSLA"
url = "https://financialmodelingprep.com/financial-summary/" + ticker
request = requests.get(url)

parser = BeautifulSoup(request.text, "html.parser")
# Locate the container that holds the articles first, then search inside it.
news_html = parser.find('div', {'class': 'articles'})

sentiments = []
for div in news_html.find_all('a', {'class': 'article'}):
    try:
        sentiments.append(
            {
                'ticker': ticker,
                'date': div.find('h5', {'class': 'article__date'}).text,
                'title': div.find('div', {'class': 'article__title'}).text,
                'text': div.find('p', {'class': 'article__text'}).text
            })
    except AttributeError:
        # find() returned None for a missing element; skip that article.
        pass
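As a variation on the loop above, missing elements can be handled explicitly instead of skipping the whole article on any exception. This is only a sketch: `text_or_none` is a hypothetical helper, and the HTML is a made-up sample that reuses the class names from the code above so it can run without a network request.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the second article deliberately lacks a date and text.
html = """
<div class="articles">
  <a class="article">
    <h5 class="article__date">2022-07-01</h5>
    <div class="article__title">First headline</div>
    <p class="article__text">Body text.</p>
  </a>
  <a class="article">
    <div class="article__title">No date or text here</div>
  </a>
</div>
"""


def text_or_none(node, tag, css_class):
    """Return the stripped text of the first matching child, or None."""
    found = node.find(tag, {"class": css_class})
    return found.get_text(strip=True) if found is not None else None


parser = BeautifulSoup(html, "html.parser")
container = parser.find("div", {"class": "articles"})

rows = []
# find() returns None when the container is absent, so guard before iterating.
if container is not None:
    for article in container.find_all("a", {"class": "article"}):
        rows.append({
            "date": text_or_none(article, "h5", "article__date"),
            "title": text_or_none(article, "div", "article__title"),
            "text": text_or_none(article, "p", "article__text"),
        })

print(rows)
```

With this approach an article that lacks one field still ends up in the result with `None` in that field, which can be filtered or filled later when building the DataFrame.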