Ignoring errors while parsing list of lists to dictionary to pandas DataFrame
How can I tell pandas to ignore malformed items when building dictionaries from a list of lists?
For simplicity, take the first solution from my previous question:
import pandas as pd

L = [['Manufacturer: Hyundai',
      'Model: Tucson',
      'Mileage: 258000 km',
      'Registered: 07/2019'],
     ['Manufacturer: Mazda',
      'Model: 6',
      'Year: 2014',
      'Registered: 07/2019',
      'Comfort',
      'Safety']]
# Each 'key: value' string is split into a pair; this fails for the second list
df = pd.DataFrame([dict(y.split(':') for y in x) for x in L])
print(df)
The second inner list ends with two items ('Comfort' and 'Safety') that have no value and no ":" separator, so the dict() call raises:
ValueError: dictionary update sequence element #5 has length 1; 2 is required
How can I tell pandas to ignore this kind of error and continue parsing the list?
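For context, the exception appears to come from Python's built-in dict() rather than from pandas itself: splitting a string that has no ":" yields a one-element list, which cannot be unpacked into a (key, value) pair. A minimal reproduction without pandas, using a shortened two-item list that is an assumption for illustration only:

```python
# 'Comfort' contains no ':' so split() returns a one-element list,
# which dict() cannot turn into a (key, value) pair.
items = ['Year: 2014', 'Comfort']
pairs = [item.split(':') for item in items]
print(pairs)

try:
    dict(pairs)
    error_message = None
except ValueError as exc:
    # The message names the index of the first offending element
    error_message = str(exc)
    print(error_message)
```

Because the failure happens before the DataFrame constructor is ever called, the fix has to be applied while building the dictionaries.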
Just add a small if condition to the comprehension, so that items without a ":" are skipped.
pd.DataFrame([
    dict(y.split(':') for y in x if ':' in y) for x in L])
  Manufacturer    Mileage   Model Registered  Year
0      Hyundai  258000 km  Tucson    07/2019   NaN
1        Mazda        NaN       6    07/2019  2014
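One detail worth noting: a plain split(':') keeps the space after the colon in each value, and it would produce more than two parts if a value itself contained a colon. A slightly more defensive sketch, assuming that only the first colon separates key from value, is shown here. The helper name parse_items is hypothetical and not part of pandas.

```python
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]


def parse_items(items):
    """Hypothetical helper: build a dict from 'key: value' strings,
    skipping any item that has no ':' separator."""
    record = {}
    for item in items:
        if ':' not in item:
            continue  # ignore malformed items such as 'Comfort'
        key, value = item.split(':', 1)  # split on the first colon only
        record[key.strip()] = value.strip()
    return record


df = pd.DataFrame([parse_items(x) for x in L])
print(df)
```

With this variant the values come out without leading whitespace, which makes later comparisons and type conversions less error-prone.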
If you want to keep those items as columns filled with NaN, change the if to an if-else inside the comprehension.
import numpy as np

pd.DataFrame([
    dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])
   Comfort Manufacturer    Mileage   Model Registered  Safety  Year
0      NaN      Hyundai  258000 km  Tucson    07/2019     NaN   NaN
1      NaN        Mazda        NaN       6    07/2019     NaN  2014
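The same idea can also be written with str.partition, which always returns three parts and so avoids the conditional split. This is only a sketch under the assumption that an item without ":" should become a key with a NaN value. The helper name to_pair is hypothetical.

```python
import numpy as np
import pandas as pd

L = [['Manufacturer: Hyundai', 'Model: Tucson',
      'Mileage: 258000 km', 'Registered: 07/2019'],
     ['Manufacturer: Mazda', 'Model: 6', 'Year: 2014',
      'Registered: 07/2019', 'Comfort', 'Safety']]


def to_pair(item):
    """Hypothetical helper: return (key, value), with NaN as the value
    when the item has no ':' separator."""
    key, sep, value = item.partition(':')
    # sep is an empty string when ':' was not found in the item
    return key.strip(), (value.strip() if sep else np.nan)


df = pd.DataFrame([dict(to_pair(y) for y in x) for x in L])
print(df)
```

Here 'Comfort' and 'Safety' appear as columns that hold only NaN, matching the behaviour of the if-else comprehension.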
If the items without a ":" should be kept as keys, add an if-else:
df = pd.DataFrame([dict(y.split(':') if ':' in y else (y, np.nan) for y in x) for x in L])
print (df)
   Comfort Manufacturer    Mileage   Model Registered  Safety  Year
0      NaN      Hyundai  258000 km  Tucson    07/2019     NaN   NaN
1      NaN        Mazda        NaN       6    07/2019     NaN  2014