KeyError when scraping with Python

Question

I am scraping flight data from a website using the script below:

import requests
import bs4
import csv

root_url = 'http://www.flightradar24.com/data/flights/3k601/'

response = requests.get(root_url)
# Name the parser explicitly so the result does not depend on what is installed
soup = bs4.BeautifulSoup(response.text, 'html.parser')

fieldnames = ['flight', 'From', 'To', 'Date', 'Aircraft', 'STD', 'ATD', 'STA', 'Status']

try:
    table = soup.find('table')
    # Open the file once and keep it open while the rows are written
    with open('flight_data.csv', 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        for tr in table.select('tbody tr'):
            cells = tr.select('td')
            flight_data = {}
            # Raises KeyError when the row has no data-flight-number attribute
            flight_data['flight'] = tr['data-flight-number']
            flight_data['From'] = " ".join(cells[1].get_text().split())
            flight_data['To'] = " ".join(cells[2].get_text().split())
            flight_data['Date'] = tr['data-date']
            flight_data['Aircraft'] = " ".join(cells[3].get_text().split())
            flight_data['STD'] = cells[4].get_text()
            flight_data['ATD'] = cells[5].get_text()
            flight_data['STA'] = cells[6].get_text()
            flight_data['Status'] = " ".join(cells[7].get_text().split())
            print(flight_data)
            writer.writerow(flight_data)

except AttributeError as e:
    # soup.find returned None, so there is no table to read
    raise ValueError("No valid table found") from e

This works if there is data in the table. For flight pages like this, where there is none, I tried to do a conditional check for the key:

if tr['data-flight-number'] is None:

but tr['data-flight-number'] does not exist, so how do I test whether the table is empty before I proceed to scrape the data?

Answer 1

Use get, which returns a default value instead of raising KeyError when the attribute is missing:

if tr.get('somekey', None) is None:
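
As a rough sketch of how that check could be applied to the question's table, the snippet below keeps only the rows that carry a data-flight-number attribute. The sample HTML strings and the flight_rows helper are made up for illustration and are not the real page markup; an empty result would mean there is nothing to scrape.

```python
import bs4

# Hypothetical sample markup, not the real page: one table with a data row,
# and one table whose only row is a placeholder without data attributes.
HTML_WITH_DATA = """
<table><tbody>
<tr data-flight-number="3K601" data-date="2015-04-17"><td>row</td></tr>
</tbody></table>
"""
HTML_EMPTY = """
<table><tbody>
<tr><td>No flights found</td></tr>
</tbody></table>
"""


def flight_rows(html):
    """Return the <tr> tags that have a data-flight-number attribute."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    table = soup.find('table')
    if table is None:
        # No table at all, so there are no rows to return
        return []
    # Tag.get returns None for a missing attribute instead of raising KeyError
    return [tr for tr in table.find_all('tr')
            if tr.get('data-flight-number') is not None]


if not flight_rows(HTML_EMPTY):
    print("Table is empty, nothing to scrape")

for tr in flight_rows(HTML_WITH_DATA):
    print(tr['data-flight-number'], tr['data-date'])
```

Filtering the rows first means the indexing with tr['data-flight-number'] only ever runs on rows where the attribute is known to exist.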
