
Python web scraping error - reading from JSON - IndexError: list index out of range - how do I ignore it?

I am performing web scraping via Python, Selenium, and the headless Chrome driver. I am reading the results from JSON. Here is my code:

CustId=500
while (CustId<=510):
  
  print(CustId)

  # Part 1: Customer REST call:
  urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
  driver.get(urlg)

  soup = BeautifulSoup(driver.page_source,"lxml")

  dict_from_json = json.loads(soup.find("body").text)
  # print(dict_from_json)

  #try:
 
  CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])

  # Addr = (dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName'])

  writefunction()

  CustId = CustId+1

The issue is that 'addressDisplayName' is sometimes present in the result set and sometimes not. If it's not, the code fails with:

IndexError: list index out of range

Which makes sense, as it doesn't exist. How do I ignore this, though, so that if 'addressDisplayName' doesn't exist the loop just continues? I've tried using a try block, but the code still stops executing.

If you get an IndexError (with an index of 0), it means that your list is empty. So the problem is one step earlier in the path: 'customerAddressShowCommandSet' is an empty list. If 'addressDisplayName' were missing from the dict instead, you'd get a KeyError.
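To make the distinction concrete, here is a small sketch that runs the same lookup against three hand-made payloads. The helper name `probe` and the sample dicts are hypothetical; only the key names come from the question.

```python
def probe(payload):
    """Return the looked-up value, or the name of the exception the lookup raised."""
    try:
        return payload['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
    except (IndexError, KeyError) as exc:
        return type(exc).__name__

# Hypothetical payloads covering the three cases.
empty_list = {'customerShowCommand': {'customerAddressShowCommandSet': []}}
missing_key = {'customerShowCommand': {'customerAddressShowCommandSet': [{}]}}
present = {'customerShowCommand': {'customerAddressShowCommandSet': [{'addressDisplayName': 'Home'}]}}

print(probe(empty_list))   # IndexError: the list has no element 0
print(probe(missing_key))  # KeyError: element 0 exists but lacks the key
print(probe(present))      # Home
```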

You can check if the list has elements:

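if dict_from_json['customerShowCommand']['customerAddressShowCommandSet']:
    # The list is non-empty, so index 0 is safe
    Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']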
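A slightly more defensive variant of this check, as a sketch only: using `dict.get` with fallbacks also covers the case where an intermediate key is absent or null. The helper name `first_address_name` and the sample payloads are hypothetical, not part of the original code.

```python
def first_address_name(payload, default=None):
    """Return addressDisplayName of the first address, or `default` if unavailable."""
    show_cmd = payload.get('customerShowCommand') or {}
    addresses = show_cmd.get('customerAddressShowCommandSet') or []
    if addresses:  # an empty list is falsy, so index 0 is only touched when it exists
        return addresses[0].get('addressDisplayName', default)
    return default

# Hypothetical payloads for illustration.
with_address = {'customerShowCommand': {'customerAddressShowCommandSet': [{'addressDisplayName': 'Home'}]}}
without_address = {'customerShowCommand': {'customerAddressShowCommandSet': []}}

print(first_address_name(with_address))              # Home
print(first_address_name(without_address, 'NaN'))    # NaN
print(first_address_name({}))                        # None
```

This avoids exceptions entirely, at the cost of being a little more verbose than the plain truthiness check.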

Otherwise you can indeed use try..except:

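try:
    Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
except (IndexError, KeyError):  # a tuple is required to catch both in Python 3
    Addr = None  # handle missing data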

A try..except block should resolve your issue.

CustId=500
while (CustId<=510):
  
  print(CustId)

  # Part 1: Customer REST call:
  urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
  driver.get(urlg)

  soup = BeautifulSoup(driver.page_source,"lxml")

  dict_from_json = json.loads(soup.find("body").text)
  # print(dict_from_json)

  
 
  CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])
  try:
      Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
  except (IndexError, KeyError):
      # The address list is empty or the key is missing
      Addr = "NaN"

  CustId = CustId+1 
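One thing to watch with a `while` loop and a manual counter: if you skip a customer with `continue` before reaching `CustId = CustId+1`, the counter never advances and the loop runs forever. A `for` loop over `range` avoids that. The sketch below assumes a hypothetical `fetch_customer` function and sample data standing in for the `driver.get` and `json.loads` steps, so it can run without Selenium.

```python
# Hypothetical canned responses standing in for the REST call results.
SAMPLE_RESPONSES = {
    500: {'customerShowCommand': {'customerAddressShowCommandSet': [{'addressDisplayName': 'Head office'}]}},
    501: {'customerShowCommand': {'customerAddressShowCommandSet': []}},
    502: {'customerShowCommand': {'customerAddressShowCommandSet': [{'addressDisplayName': 'Warehouse'}]}},
}

def fetch_customer(cust_id):
    """Hypothetical stand-in for driver.get(...) followed by json.loads(...)."""
    return SAMPLE_RESPONSES[cust_id]

def collect_addresses(first_id, last_id):
    """Map each customer id to its first address name, skipping customers without one."""
    results = {}
    for cust_id in range(first_id, last_id + 1):  # range advances the id even when we skip
        data = fetch_customer(cust_id)
        try:
            addr = data['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
        except (IndexError, KeyError):
            continue  # no address for this customer; move on to the next id
        results[cust_id] = addr
    return results

print(collect_addresses(500, 502))  # {500: 'Head office', 502: 'Warehouse'}
```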

