
My news web scraper raises "IndexError: list index out of range" and I can't resolve it

import requests
from bs4 import BeautifulSoup
import pandas as pd

headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
}

if __name__ == '__main__':
    # Access the page
    r = requests.get('https://tribune.com.pk/kse-100-index/')

    if r.status_code == 200:
        html = r.text
        soup = BeautifulSoup(html, 'lxml')

    news_date = soup.find_all("span", {"class": "excerpt"})

    news_title = soup.find_all("h2", {"class": "title"})
    # news_title = soup.find_all("div",{"class":"story  cat-0 group-0 position-14 sub-story clearfix"})

    news_description = soup.find_all("p", {"class": "excerpt"})
    date_list = []
    title_list = []
    description_list = []
    # print("Hello world")
    for date in news_date:
        date_list.append(date.text)
    for title in news_title:
        title_list.append(title.text)
    for description in news_description:
        description_list.append(description.text)
    for i in range(30):
        print("Date: ", date_list[i])
        print("Title: ", title_list[i])
        print("Description: ", description_list[i])
        print()

    s1 = pd.Series(date_list[0:30], name='News Date')
    s2 = pd.Series(title_list[0:30], name='News Heading')
    s3 = pd.Series(description_list[0:30], name='News Description')
    df = pd.concat([s2, s3], axis=1)
    df = df.fillna("None")
    # d = {'Phone name': phone_name_list, 'Price per month': price_per_month_list,'Interest List': interest_list, 'Total Price List': total_price_list,'Unlimited Offer List': unlimited_offer_list}
    # df = pd.DataFrame(data=d)
    print(df)
    df.to_csv('tribune1.csv', encoding='utf-8')

It's an Express Tribune web scraper. I want to print the date, headline and description of each article. The code previously ran without errors, but the error appeared after I added the date section.

I would be thankful to you if you could help me resolve the error.

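The IndexError occurs because range(30) assumes every list holds at least 30 items. If any of the three lists is shorter (most likely date_list, since the error appeared after you added it), indexing past its end fails. Limit the loop to the length of the shortest list:

for date in news_date:
    date_list.append(date.text)
for title in news_title:
    title_list.append(title.text)
for description in news_description:
    description_list.append(description.text)

# Print at most 30 items, and never more than the shortest list holds
count = min(len(date_list), len(title_list), len(description_list), 30)
if count > 0:
    for i in range(count):
        print("Date: ", date_list[i])
        print("Title: ", title_list[i])
        print("Description: ", description_list[i])
        print()
else:
    print("Not found")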

Make sure the indentation matches the surrounding code: this block belongs inside the if __name__ == '__main__': block.
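
If you would rather keep rows where one field is missing instead of stopping at the shortest list, one option is to pair the lists with itertools.zip_longest and pad the gaps. This is only a sketch: the helper name align_rows, its parameters, and the sample data are made up for illustration, and the "None" filler simply mirrors the fillna("None") call in the question.

```python
from itertools import islice, zip_longest


def align_rows(dates, titles, descriptions, limit=30, fill="None"):
    """Return up to `limit` (date, title, description) tuples.

    Lists of different lengths are padded with `fill`, so no index
    can ever run past the end of a shorter list.
    """
    rows = zip_longest(dates, titles, descriptions, fillvalue=fill)
    return list(islice(rows, limit))


# Made-up sample data standing in for the scraped lists
date_list = ["Jan 1"]
title_list = ["Title A", "Title B"]
description_list = ["Description A", "Description B"]

for date, title, description in align_rows(date_list, title_list, description_list):
    print("Date: ", date)
    print("Title: ", title)
    print("Description: ", description)
    print()
```

Note that padding only hides the mismatch. If the lists differ in length, the date at position i may not belong to the title at position i, so it is worth printing the three lengths once to check whether the selectors match the same set of articles.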

