Python web scraping error - reading from JSON - IndexError: list index out of range - how do I ignore it?

Question

I am doing web scraping with Python, Selenium, and a headless Chrome driver. I am reading the results from JSON. Here is my code:

CustId=500
while (CustId<=510):
  
  print(CustId)

  # Part 1: Customer REST call:
  urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
  driver.get(urlg)

  soup = BeautifulSoup(driver.page_source,"lxml")

  dict_from_json = json.loads(soup.find("body").text)
  # print(dict_from_json)

  #try:
 
  CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])

  # Addr = (dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName'])

  writefunction()

  CustId = CustId+1

The problem is that "addressDisplayName" is sometimes present in the result set and sometimes not. When it is missing, I get the following error:

IndexError: list index out of range

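That makes sense, because it does not exist. But how can I ignore this, so that the loop simply continues when "addressDisplayName" is missing? I tried using try, but the code still stops executing.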

Answer 1

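If you get an IndexError (with index "0"), your list is empty. So the failure happens one step earlier in the path: if "addressDisplayName" were missing from the dictionary, you would get a KeyError instead.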
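To make the distinction concrete, here is a minimal sketch. The two payloads are hypothetical, shaped like the response in the question, and `classify` is a helper name invented for this illustration.

```python
# Hypothetical payloads shaped like the response in the question.
empty_set = {"customerShowCommand": {"customerAddressShowCommandSet": []}}
missing_key = {"customerShowCommand": {"customerAddressShowCommandSet": [{}]}}


def classify(payload):
    """Return the looked-up value, or the name of the exception the lookup raised."""
    try:
        return payload["customerShowCommand"]["customerAddressShowCommandSet"][0]["addressDisplayName"]
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


print(classify(empty_set))    # IndexError: the list has no element 0
print(classify(missing_key))  # KeyError: element 0 exists but lacks the key
```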

You can check whether the list contains any elements:

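if dict_from_json['customerShowCommand']['customerAddressShowCommandSet']:
    # the list is non-empty, so index 0 is safe to read
    Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']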

Otherwise, you can indeed use try..except:

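try:
    Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
except (IndexError, KeyError):  # a tuple is required to catch both in Python 3
    Addr = None  # handle missing data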

Answer 2

A try..except block should solve your problem.

CustId=500
while (CustId<=510):
  
  print(CustId)

  # Part 1: Customer REST call:
  urlg = f'https://mywebsite/customerRest/show/?id={CustId}'
  driver.get(urlg)

  soup = BeautifulSoup(driver.page_source,"lxml")

  dict_from_json = json.loads(soup.find("body").text)
  # print(dict_from_json)

  
 
  CustID = (dict_from_json['customerAddressCreateCommand']['customerId'])
  try:
      Addr = dict_from_json['customerShowCommand']['customerAddressShowCommandSet'][0]['addressDisplayName']
  except (IndexError, KeyError):
      # empty address list or missing key: fall back to a placeholder
      Addr = "NaN"

  CustId = CustId+1
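If the goal is to skip a customer entirely rather than store a placeholder, `continue` inside the except clause moves straight to the next iteration. The sketch below is not the original scraper: `FAKE_RESPONSES` is a hypothetical stand-in for the text that Selenium and BeautifulSoup would return, and `collect_addresses` is an invented name.

```python
import json

# Hypothetical stand-in for the page bodies fetched per customer id.
FAKE_RESPONSES = {
    500: '{"customerShowCommand": {"customerAddressShowCommandSet": [{"addressDisplayName": "Main office"}]}}',
    501: '{"customerShowCommand": {"customerAddressShowCommandSet": []}}',
    502: '{"customerShowCommand": {"customerAddressShowCommandSet": [{}]}}',
}


def collect_addresses(responses):
    """Map customer id to address name, skipping customers without one."""
    found = {}
    for cust_id in sorted(responses):
        data = json.loads(responses[cust_id])
        try:
            addr = data["customerShowCommand"]["customerAddressShowCommandSet"][0]["addressDisplayName"]
        except (IndexError, KeyError):
            continue  # nothing to record for this customer; go to the next id
        found[cust_id] = addr
    return found


print(collect_addresses(FAKE_RESPONSES))  # {500: 'Main office'}
```

Using a `for` loop over the ids also removes the risk of skipping the manual increment when `continue` fires, which would loop forever in a `while` version.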

2 answers

Answer 1: 1, 2022-04-21 11:33:50
Answer 2: 1, accepted, 2022-04-21 11:38:09
