[英]List of lists to dictionary to pandas DataFrame
我正在嘗試擬合以下數據:
[['Manufacturer: Hyundai',
'Model: Tucson',
'Mileage: 258000 km',
'Registered: 07/2019'],
['Manufacturer: Mazda',
'Model: 6',
'Year: 2014',
'Registered: 07/2019']]
到熊貓DataFrame。
並非所有標簽都出現在每個記錄中,例如,某些記錄具有“里程”,而有些則沒有。 我一共有26個功能,而幾乎所有功能都很少。
我想構造將在列中包含要素的pandas DataFrame,如果要素不存在,則內容應為“ NaN”。
我有
colnames=['Manufacturer', 'Model', 'Mileage', 'Registered', 'Year'...(all 26 features here)]
df = pd.read_csv("./data/output.csv", sep=",", names=colnames, header=None)
很少有先決條件列能提供預期的輸出,但是在涉及可選功能時,缺少數據會導致之后的功能在錯誤的列下出現。 僅當所有功能均存在時,記錄才能正確映射。
我忘了提及某些缺少價值的功能,這些功能也沒有“:”但出現在列表中。 因此,在這2種情況下:
兩種情況的分配均應為“ NaN”。
使用嵌套列表DataFrame
字典列表,如果缺少相同的鍵,則傳遞給DataFrame
構造函數NaN
:
L = [['Manufacturer: Hyundai',
'Model: Tucson',
'Mileage: 258000 km',
'Registered: 07/2019'],
['Manufacturer: Mazda',
'Model: 6',
'Year: 2014',
'Registered: 07/2019']]
df = pd.DataFrame([dict(y.split(':') for y in x) for x in L])
print (df)
Manufacturer Mileage Model Registered Year
0 Hyundai 258000 km Tucson 07/2019 NaN
1 Mazda NaN 6 07/2019 2014
編輯:您可以使用.split(maxsplit=1)
來按第一個空格進行分割:
L = [['Manufacturer Hyundai',
'Model Tucson',
'Mileage 258000 km',
'Registered 07/2019'],
['Manufacturer Mazda',
'Model 6',
'Year 2014',
'Registered 07/2019']]
df = pd.DataFrame([dict(y.split(maxsplit=1) for y in x) for x in L])
print (df)
Manufacturer Mileage Model Registered Year
0 Hyundai 258000 km Tucson 07/2019 NaN
1 Mazda NaN 6 07/2019 2014
編輯:
L = [['Manufacturer Hyundai',
'Model Tucson',
'Mileage 258000 km',
'Registered 07/2019'],
['Manufacturer Mazda',
'Model 6',
'Year 2014',
'Registered 07/2019',
'Additional equipment aaa']]
words2 = ['Additional equipment']
L1 = []
for x in L:
di = {}
for y in x:
for word in words2:
if set(word.split(maxsplit=2)[:2]) < set(y.split()):
i, j, k = y.split(maxsplit=2)
di['_'.join([i, j])] = k
else:
i, j = y.split(maxsplit=1)
di[i] = j
L1.append(di)
df = pd.DataFrame(L1)
print (df)
Additional_equipment Manufacturer Mileage Model Registered Year
0 NaN Hyundai 258000 km Tucson 07/2019 NaN
1 aaa Mazda NaN 6 07/2019 2014
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.