
Ignoring errors while parsing list of lists to dictionary to pandas DataFrame

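How can I tell pandas to ignore malformed dictionary items in a list?

To keep it simple, suppose I have the situation from the first version of the solution to my previous question: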

import pandas as pd

L = [['Manufacturer: Hyundai',
      'Model: Tucson',
      'Mileage: 258000 km',
      'Registered: 07/2019'],
     ['Manufacturer: Mazda',
      'Model: 6',
      'Year: 2014',
      'Registered: 07/2019',
      'Comfort',
      'Safety']]

# Split each 'key: value' string and build one dict per inner list
df = pd.DataFrame([dict(y.split(':') for y in x) for x in L])
print(df)

The last two items in the second list ('Comfort' and 'Safety') have no value and no ':' separator, so building the dictionary fails with:

ValueError: dictionary update sequence element #5 has length 1; 2 is required
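
The error is raised by the built-in dict() call rather than by pandas itself: dict() expects every element to be a (key, value) pair, and splitting a string that contains no ':' yields a list of length one. A minimal sketch that reproduces this in isolation (the variable names here are illustrative only):

```python
# 'Model: 6' contains the separator, 'Comfort' does not
pair_ok = 'Model: 6'.split(':')
pair_bad = 'Comfort'.split(':')

print(pair_ok)   # ['Model', ' 6']
print(pair_bad)  # ['Comfort']

try:
    # dict() needs every element to have exactly two items
    dict([pair_ok, pair_bad])
except ValueError as exc:
    message = str(exc)
    print(message)
```

The element number in the message is the zero-based position of the first offending item within the sequence passed to dict().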

How can I tell pandas to ignore these kinds of errors and continue parsing the list?

Just add an if condition to the comprehension.

pd.DataFrame([
    dict(y.split(':') for y in x if ':' in y) for x in L])

  Manufacturer     Mileage    Model Registered   Year
0      Hyundai   258000 km   Tucson    07/2019    NaN
1        Mazda         NaN        6    07/2019   2014
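
Note that a plain split(':') keeps the space after the colon in each value and would produce more than two parts if a value itself contained a ':'. A possible variant, assuming you want trimmed keys and values and a split on the first colon only, is sketched below; to_pair is a hypothetical helper, not part of pandas:

```python
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]

def to_pair(item):
    """Hypothetical helper: split on the first ':' and trim whitespace."""
    key, value = item.split(':', 1)
    return key.strip(), value.strip()

# Items without ':' are skipped before to_pair is ever called
df = pd.DataFrame([dict(to_pair(y) for y in x if ':' in y) for x in L])
print(df)
```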

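If you want to include those items as keys with NaN values, change the if filter to an if-else expression inside the comprehension.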

import numpy as np

pd.DataFrame([
    dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])


   Comfort Manufacturer     Mileage    Model Registered  Safety   Year
0      NaN      Hyundai   258000 km   Tucson    07/2019     NaN    NaN
1      NaN        Mazda         NaN        6    07/2019     NaN   2014

If items without a ':' can be used as keys, add an if-else:

df = pd.DataFrame([dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])
print (df)
   Comfort Manufacturer     Mileage    Model Registered  Safety   Year
0      NaN      Hyundai   258000 km   Tucson    07/2019     NaN    NaN
1      NaN        Mazda         NaN        6    07/2019     NaN   2014
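
If the one-liner becomes hard to read, the same idea can be written as a small function that literally "ignores the error" with try/except. This is only a sketch under the assumption that valueless items should become NaN-filled keys; parse_items and its fill parameter are hypothetical names introduced here:

```python
import numpy as np
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]

def parse_items(items, fill=np.nan):
    """Hypothetical helper: turn 'key: value' strings into one dict."""
    record = {}
    for item in items:
        try:
            # Unpacking fails with ValueError when there is no ':'
            key, value = item.split(':', 1)
        except ValueError:
            # Keep the bare item as a key with a placeholder value
            record[item.strip()] = fill
        else:
            record[key.strip()] = value.strip()
    return record

df = pd.DataFrame([parse_items(x) for x in L])
print(df)
```

Changing the except branch to `continue` would drop the valueless items instead, which matches the first variant with the plain if filter.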
