Python - add a numpy array as column to a pandas dataframe with different length
I have a pandas DataFrame df with several columns. One of them, Col1, contains float values or NaN:
df
+----+------+-----+
| No | Col1 | ... |
+----+------+-----+
| 12 | 10 | ... |
| 23 | NaN | ... |
| 34 | 5 | ... |
| 45 | NaN | ... |
| 54 | 22 | ... |
+----+------+-----+
I ran a function on Col1 that excludes the missing values (NaN), like this:
StandardScaler().fit_transform(df.loc[pd.notnull(df['Col1']), ['Col1']])
Imagine the result is a numpy.ndarray like this:
+-----+
| Ref |
+-----+
| 2 |
| 5 |
| 1 |
+-----+
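Note that the length of this array differs from the length of the original column Col1.
I need a way to add Ref to df as a new column. For every row where Col1 is NaN, the new column Ref should also be NaN. The desired output looks like this: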
+----+------+-----+-----+
| No | Col1 | ... | Ref |
+----+------+-----+-----+
| 12 | 10 | ... | 2 |
| 23 | NaN | ... | NaN |
| 34 | 5 | ... | 5 |
| 45 | NaN | ... | NaN |
| 54 | 22 | ... | 1 |
+----+------+-----+-----+
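I think you can assign to a new column filtered by the same boolean mask:
from sklearn.preprocessing import StandardScaler
# Boolean mask selecting the rows where Col1 is not NaN
mask = df['Col1'].notnull()
# Write the scaled values only into the masked rows; the rest of Ref stays NaN
df.loc[mask, 'Ref'] = StandardScaler().fit_transform(df.loc[mask, ['Col1']])
print(df)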
   No  Col1       Ref
0  12  10.0 -0.327089
1  23   NaN       NaN
2  34   5.0 -1.027992
3  45   NaN       NaN
4  54  22.0  1.355081
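For anyone who wants to reproduce this without the original data, here is a minimal self-contained sketch. It assumes a frame with only the No and Col1 columns from the question, and it replaces StandardScaler with the equivalent NumPy arithmetic (zero mean, unit population standard deviation) so that scikit-learn is not required; that substitution is my assumption, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question's data (the "..." columns are omitted).
df = pd.DataFrame({"No": [12, 23, 34, 45, 54],
                   "Col1": [10, np.nan, 5, np.nan, 22]})

# True for rows where Col1 holds a real value.
mask = df["Col1"].notnull()

# Stand-in for StandardScaler: zero mean, unit population std (ddof=0).
valid = df.loc[mask, "Col1"].to_numpy()
scaled = (valid - valid.mean()) / valid.std()

# Rows outside the mask are never written, so they stay NaN in the new column.
df.loc[mask, "Ref"] = scaled
print(df)
```

The key point is that the same mask is used on both sides of the assignment, so the shorter array always matches the number of rows being written.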
Details:
print(StandardScaler().fit_transform(df.loc[mask, ['Col1']]))
[[-0.32708852]
 [-1.02799249]
 [ 1.35508101]]
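Since the transformed array has one column and only as many rows as there are non-null values, another possible approach is to flatten it and wrap it in a Series labelled with the index of those rows, then let pandas align on the index. This is only a sketch: the `result` array below is a hypothetical stand-in that reuses the Ref values from the question rather than real scaler output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"No": [12, 23, 34, 45, 54],
                   "Col1": [10, np.nan, 5, np.nan, 22]})
mask = df["Col1"].notnull()

# Hypothetical stand-in for the (3, 1) array returned by fit_transform.
result = np.array([[2.0], [5.0], [1.0]])

# Flatten to 1-D and label each value with the index of its source row.
ref = pd.Series(result.ravel(), index=df.index[mask])

# Column assignment aligns on the index; rows without a match become NaN.
df["Ref"] = ref
print(df)
```

This gives the same layout as the mask assignment, and it also works when the result has to be passed around before being attached to df, because the row labels travel with the values.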