New to web scraping in Python, can't figure out my error

Question

import requests
from bs4 import BeautifulSoup 
import pandas as pd

ticker = "TSLA"
url = "https://financialmodelingprep.com/financial-summary/" + ticker
request = requests.get(url)
print(request.text)

parser = BeautifulSoup(request.text, "html.parser")
news_html = parser.find_all('a', {'class': 'article-item'})
print(news_html[0])

sentiments = []
for i in range(0, len(news_html)):
    sentiments.append(
            {
                'ticker': ticker,
                'date': news_html[i].find('h5', {'class': 'article-date'}).text,
                'title': news_html[i].find('h4', {'class': 'article-title'}).text,
                'text': news_html[i].find('p', {'class': 'article-text'}).text
            }
        )

df = pd.DataFrame(sentiments)
df = df.set_index('date')

---------------------------------------------------------------------------

    IndexError                                Traceback (most recent call last)
    Input In [6], in <cell line: 12>()
         10 parser = BeautifulSoup(request.text, "html.parser")
         11 news_html = parser.find_all('a', {'class': 'article-item'})
    ---> 12 print(news_html[10])
         14 sentiments = []
         15 for i in range(0, len(news_html)):
    
    IndexError: list index out of range

I am trying to scrape data for sentiment analysis. There is a second part of the code that is supposed to calculate a sentiment score, but I cannot get past the error.
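
The traceback points at the indexing line, not at the request itself: `find_all` returns an empty list when no element matches the selector, and indexing into that list raises `IndexError`. A minimal sketch of that behaviour follows. The markup in `SAMPLE_HTML` is made up for illustration and is only an assumption about what the page might look like, so no network access is needed to try it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="articles">
  <a class="article"><h5 class="article__date">2022-09-09</h5></a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A selector that matches nothing yields an empty result, not an error.
missing = soup.find_all("a", {"class": "article-item"})
print(len(missing))

# Guard before indexing so the failure mode is explicit.
if missing:
    print(missing[0])
else:
    print("No elements matched; compare the class name with the page source.")

# A selector that does match returns the tags as expected.
found = soup.find_all("a", {"class": "article"})
print(len(found))
```

Printing `len(news_html)` before indexing, as above, is usually the quickest way to tell a wrong selector apart from a failed request.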

Answer 1

import requests
from bs4 import BeautifulSoup
import pandas as pd

ticker = "TSLA"
url = "https://financialmodelingprep.com/financial-summary/" + ticker
request = requests.get(url)

parser = BeautifulSoup(request.text, "html.parser")
# Selectors used by this answer: the articles are looked up inside
# a <div class="articles"> container.
news_html = parser.find('div', {'class': 'articles'})

sentiments = []
# Each article is matched as <a class="article">, not 'article-item'.
for div in news_html.find_all('a', {'class': 'article'}):
    try:
        sentiments.append(
                {
                    'ticker': ticker,
                    'date': div.find('h5', {'class': 'article__date'}).text,
                    'title': div.find('div', {'class': 'article__title'}).text,
                    'text': div.find('p', {'class': 'article__text'}).text
                })
    except AttributeError:
        # find() returned None for a missing field; skip this article.
        pass
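
The `try`/`except` in the answer silently drops any article that lacks one of the three fields. A sketch of a more explicit variant is below. The helper names `child_text` and `extract_articles` are hypothetical, and `SAMPLE_HTML` is invented markup that only mirrors the class names used in the answer; it is not a copy of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup reusing the answer's class names; the real page may differ.
SAMPLE_HTML = """
<div class="articles">
  <a class="article">
    <h5 class="article__date">2022-09-09</h5>
    <div class="article__title">Title A</div>
    <p class="article__text">Body A</p>
  </a>
  <a class="article">
    <div class="article__title">Title B (no date or text)</div>
  </a>
</div>
"""


def child_text(tag, name, css_class):
    """Return the stripped text of the first matching child, or None."""
    child = tag.find(name, {"class": css_class})
    return child.get_text(strip=True) if child is not None else None


def extract_articles(html, ticker):
    """Collect one dict per article that has a date, a title and a text."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"class": "articles"})
    if container is None:
        # The container is absent, so there is nothing to iterate over.
        return []
    rows = []
    for link in container.find_all("a", {"class": "article"}):
        row = {
            "ticker": ticker,
            "date": child_text(link, "h5", "article__date"),
            "title": child_text(link, "div", "article__title"),
            "text": child_text(link, "p", "article__text"),
        }
        # Keep only complete rows instead of swallowing exceptions.
        if all(value is not None for value in row.values()):
            rows.append(row)
    return rows


rows = extract_articles(SAMPLE_HTML, "TSLA")
print(rows)
```

The resulting list of dicts can be passed to `pd.DataFrame(rows)` in the same way as `sentiments` in the question.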

Answer 1
2 · accepted · 2022-09-10 06:21:23
