Pandas: Create new column in DataFrame based on other column in DataFrame
I have a DataFrame with 2 columns:
'String' -> a numpy array such as [47, 0, 49, 12, 46]
'Is Isogram' -> 1 or 0
                String  Is Isogram
0  [47, 0, 49, 12, 46]           1
1  [43, 50, 22, 1, 13]           1
2  [10, 1, 24, 22, 16]           1
3   [2, 24, 3, 24, 51]           0
4   [40, 1, 41, 18, 3]           1
I want to create another column in which the 'Is Isogram' value is appended to the 'String' array, like this:
                String  Is Isogram               IsoString
0  [47, 0, 49, 12, 46]           1  [47, 0, 49, 12, 46, 1]
1  [43, 50, 22, 1, 13]           1  [43, 50, 22, 1, 13, 1]
2  [10, 1, 24, 22, 16]           1  [10, 1, 24, 22, 16, 1]
3   [2, 24, 3, 24, 51]           0   [2, 24, 3, 24, 51, 0]
4   [40, 1, 41, 18, 3]           1   [40, 1, 41, 18, 3, 1]
I tried using the apply function with a lambda:
df['IsoString'] = df.apply(lambda x: np.append(x['String'].values, x['Is Isogram'].values, axis=1))
But it throws a KeyError that I don't really understand:
KeyError: ('String', 'occurred at index String')
How can I fix this?
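The problem is that axis=1 was passed to np.append instead of to .apply. Without axis=1, .apply hands each column (not each row) to the lambda, so x['String'] is looked up in the row index and raises the KeyError. Inside a row-wise lambda each cell is already a plain value, so .values is not needed either: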
df['IsoString'] = df.apply(lambda x: np.append(x['String'], x['Is Isogram']), axis=1)
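To make the difference concrete, here is a minimal, self-contained sketch. The two-row frame is sample data made up for illustration (taken from rows 0 and 3 of the question), and the variable names are my own. It first reproduces the KeyError with the default column-wise apply, then applies the row-wise fix:

```python
import numpy as np
import pandas as pd

# Illustrative sample data (rows 0 and 3 of the question's frame)
df = pd.DataFrame({
    "String": [np.array([47, 0, 49, 12, 46]), np.array([2, 24, 3, 24, 51])],
    "Is Isogram": [1, 0],
})

# Default axis=0: the lambda receives each *column* as a Series whose
# index is the row labels (0, 1), so the label 'String' is not found.
try:
    df.apply(lambda x: np.append(x["String"], x["Is Isogram"]))
    raised = False
except KeyError:
    raised = True

# axis=1: the lambda receives each *row* as a Series indexed by column
# name, so row["String"] is the array and row["Is Isogram"] the scalar.
df["IsoString"] = df.apply(
    lambda row: np.append(row["String"], row["Is Isogram"]), axis=1
)
print(df["IsoString"].tolist())
```

Note that each cell of the new column holds a numpy array here; call .tolist() on the result of np.append if plain lists are preferred.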
If every list in String has the same length, numpy.hstack is a better and faster option:
arr = np.hstack((np.array(df['String'].tolist()), df['Is Isogram'].values[:, None]))
print(arr)
[[47  0 49 12 46  1]
 [43 50 22  1 13  1]
 [10  1 24 22 16  1]
 [ 2 24  3 24 51  0]
 [40  1 41 18  3  1]]
df['IsoString'] = arr.tolist()
print(df)
                String  Is Isogram               IsoString
0  [47, 0, 49, 12, 46]           1  [47, 0, 49, 12, 46, 1]
1  [43, 50, 22, 1, 13]           1  [43, 50, 22, 1, 13, 1]
2  [10, 1, 24, 22, 16]           1  [10, 1, 24, 22, 16, 1]
3   [2, 24, 3, 24, 51]           0   [2, 24, 3, 24, 51, 0]
4   [40, 1, 41, 18, 3]           1   [40, 1, 41, 18, 3, 1]
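The hstack approach only works when the rows can be stacked into a rectangular 2-D array. As a rough sketch (the length check and the list-comprehension fallback are my own additions, not part of the answer above, and the frame is made-up sample data), one could guard for that case like this:

```python
import numpy as np
import pandas as pd

# Illustrative sample data (rows 0 and 3 of the question's frame)
df = pd.DataFrame({
    "String": [[47, 0, 49, 12, 46], [2, 24, 3, 24, 51]],
    "Is Isogram": [1, 0],
})

if df["String"].map(len).nunique() == 1:
    # Equal lengths: build an (n_rows, n_items) array, turn the flag
    # column into shape (n_rows, 1), and join them side by side.
    base = np.array(df["String"].tolist())
    flags = df["Is Isogram"].to_numpy()[:, None]
    df["IsoString"] = np.hstack((base, flags)).tolist()
else:
    # Assumed fallback for ragged rows: plain Python concatenation.
    df["IsoString"] = [
        list(s) + [flag] for s, flag in zip(df["String"], df["Is Isogram"])
    ]

print(df["IsoString"].tolist())
```

The vectorized branch avoids calling a Python function once per row, which is where the speed-up over apply comes from; the fallback gives the same result for rows of differing length, at per-row Python cost.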