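Ignoring errors while parsing list of lists to dictionary to pandas DataFrame
How can I tell pandas to ignore malformed dictionary items in a list?
To keep things simple, suppose I am using the first version of the solution from my previous question: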
import pandas as pd

L = [['Manufacturer: Hyundai',
      'Model: Tucson',
      'Mileage: 258000 km',
      'Registered: 07/2019'],
     ['Manufacturer: Mazda',
      'Model: 6',
      'Year: 2014',
      'Registered: 07/2019',
      'Comfort',
      'Safety']]
# Each inner list becomes one dict, and each dict becomes one row
df = pd.DataFrame([dict(y.split(':') for y in x) for x in L])
print(df)
The last two items of the second list ("Comfort" and "Safety") have no value, and they also have no ':', so building the DataFrame fails with:
ValueError: dictionary update sequence element #5 has length 1; 2 is required
How can I tell pandas to ignore errors of this kind and carry on parsing the list?
Just add an if condition to the comprehension, so that items without a ':' are skipped.
pd.DataFrame([
dict(y.split(':') for y in x if ':' in y) for x in L])
  Manufacturer    Mileage    Model Registered   Year
0      Hyundai  258000 km   Tucson    07/2019    NaN
1        Mazda        NaN        6    07/2019   2014
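The one-liner above relies on every kept item containing exactly one ':'. An item with two colons (a time such as 12:30, say) would split into three parts and raise the same ValueError, and the values keep the space that follows the colon. A minimal sketch that avoids both, assuming a hypothetical helper named parse_items (not part of the original answer) built on str.partition:

```python
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]


def parse_items(items):
    """Hypothetical helper: build a dict from 'key: value' strings.

    Items without a ':' are skipped. Only the first ':' is treated as
    the separator, so values that contain colons stay intact.
    """
    record = {}
    for item in items:
        key, sep, value = item.partition(':')
        if not sep:  # no ':' found, so ignore this item
            continue
        record[key.strip()] = value.strip()
    return record


df = pd.DataFrame([parse_items(x) for x in L])
print(df)
```

Keys that are missing from a row still come out as NaN, just as in the one-liner.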
If you want to keep those items, with NaN as their value, change the if filter into an if-else expression inside the comprehension.
import numpy as np

pd.DataFrame([
dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])
   Comfort Manufacturer    Mileage    Model Registered  Safety   Year
0      NaN      Hyundai  258000 km   Tucson    07/2019     NaN    NaN
1      NaN        Mazda        NaN        6    07/2019     NaN   2014
If an item without a ':' can be treated as a key with a missing value, add an if-else:
df = pd.DataFrame([dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])
print(df)
   Comfort Manufacturer    Mileage    Model Registered  Safety   Year
0      NaN      Hyundai  258000 km   Tucson    07/2019     NaN    NaN
1      NaN        Mazda        NaN        6    07/2019     NaN   2014
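In the output above, the Comfort and Safety columns are NaN in both rows, so the frame no longer shows which row actually listed them. One possible variation, which goes beyond the original answers, is to store True for items that have no ':'. The sketch below assumes a hypothetical helper named parse_with_flags:

```python
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]


def parse_with_flags(items):
    """Hypothetical helper: 'key: value' items become key/value pairs,
    and items without a ':' become flags stored as True."""
    record = {}
    for item in items:
        key, sep, value = item.partition(':')
        # sep is '' when the item has no ':', so treat it as a flag
        record[key.strip()] = value.strip() if sep else True
    return record


df = pd.DataFrame([parse_with_flags(x) for x in L])
print(df[['Manufacturer', 'Comfort', 'Safety']])
```

Rows that never mention a flag still get NaN in that column, so a call such as df['Comfort'].fillna(False) may be wanted if a strict True/False column is needed.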