Python: fill a row of a pandas dataframe with a column of a numpy array
I have a pandas dataframe (1413 rows) and a numpy array (1412 rows).
type(df1)
Out[193]: pandas.core.frame.DataFrame
df1.shape
Out[194]: (1413, 15)
type(arr1)
Out[195]: numpy.ndarray
arr1.shape
Out[196]: (1412, 3)
I would like to fill a column in df1 with a column of arr1 preceded by a nan, but it does not work:
df1['aaa'] = np.vstack((np.nan, arr1[:,0]))
Could anyone let me know how to do it?
Use numpy.hstack to prepend one value to a 1d array:
df1 = pd.DataFrame({'a': range(6)})
arr1 = np.arange(15).reshape(5,3)
print (arr1)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
df1['aaa'] = np.hstack((np.nan, arr1[:,0]))
print (df1)
a aaa
0 0 NaN
1 1 0.0
2 2 3.0
3 3 6.0
4 4 9.0
5 5 12.0
Another idea, in case the DataFrame has a non-default index, is to use the Series constructor indexed by a slice of df1.index:
df1 = pd.DataFrame({'a': range(6)}, index=list('abcdef'))
arr1 = np.arange(15).reshape(5,3)
print (arr1)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
dif = df1.shape[0] - arr1.shape[0]
df1['aaa'] = pd.Series(arr1[:,0], index=df1.index[dif:])
print (df1)
a aaa
a 0 NaN
b 1 0.0
c 2 3.0
d 3 6.0
e 4 9.0
f 5 12.0
To put the NaN in the last position instead:
dif = df1.shape[0] - arr1.shape[0]
df1['aaa'] = pd.Series(arr1[:,0], index=df1.index[:-dif])
print (df1)
a aaa
a 0 0.0
b 1 3.0
c 2 6.0
d 3 9.0
e 4 12.0
f 5 NaN
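The two variants above differ only in which end receives the missing value. As a minimal sketch, the padding can also be done on the array itself for any length difference. The helper name pad_with_nan and its parameters are hypothetical and not part of pandas or NumPy. It also covers the case where both lengths are equal, in which df1.index[:-dif] would select nothing.

```python
import numpy as np
import pandas as pd


def pad_with_nan(values, length, at_start=True):
    """Pad a 1-D array with NaN up to `length` (hypothetical helper)."""
    values = np.asarray(values, dtype=float)  # NaN needs a float dtype
    missing = length - values.shape[0]
    if missing < 0:
        raise ValueError("values is longer than the target length")
    padding = np.full(missing, np.nan)
    parts = (padding, values) if at_start else (values, padding)
    return np.concatenate(parts)


df1 = pd.DataFrame({'a': range(6)}, index=list('abcdef'))
arr1 = np.arange(15).reshape(5, 3)

# Assigning a plain ndarray is positional, so the index labels do not matter.
df1['first_nan'] = pad_with_nan(arr1[:, 0], len(df1))
df1['last_nan'] = pad_with_nan(arr1[:, 0], len(df1), at_start=False)
print(df1)
```

Because the result is a plain array, this works the same for default and non-default indexes.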
EDIT:
arr1 = np.arange(15).reshape(5,3)
df1 = pd.DataFrame({'a': range(6)})
If you select with 0, you get a 1d array, so numpy.hstack is necessary; the stacked result has shape (6,):
a = np.hstack((np.nan, arr1[:,0]))
print (a)
[nan 0. 3. 6. 9. 12.]
print (a.shape)
(6,)
df1['aaa'] = a
If you select with [0], you get a 2d array of shape (5,1), so numpy.vstack can be used; the stacked result has shape (6,1):
a1 = np.vstack((np.nan, arr1[:,[0]]))
print (a1)
[[nan]
[ 0.]
[ 3.]
[ 6.]
[ 9.]
[12.]]
print (a1.shape)
(6, 1)
df1['aaa1'] = a1
print (df1)
a aaa aaa1
0 0 NaN NaN
1 1 0.0 0.0
2 2 3.0 3.0
3 3 6.0 6.0
4 4 9.0 9.0
5 5 12.0 12.0
You can do it like this. You add the column, and the first row is NaN:
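df['aaa'] = pd.Series(ar1[:,0])
# np.empty(...).fill(...) returns None, so build the NaN row with np.full
ea = np.full(df.shape[1], np.nan)
df.loc[-1] = ea
df.index = df.index + 1
# na_position='first' moves the NaN row to the top; the other rows are sorted by 'aaa'
df = df.reset_index(drop=True).sort_values(by=['aaa'], na_position='first')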
Here is your DataFrame:
c1 c2 c3
0 1 2 3
1 10 20 30
Here is the array:
[[ 5 55]
[ 50 550]]
And the result is this:
c1 c2 c3 aaa
2 NaN NaN NaN NaN
0 1.0 2.0 3.0 5.0
1 10.0 20.0 30.0 50.0
You can use np.append:
df1['aaa'] = np.append(np.nan, arr1[:,0])
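For 1d inputs, np.append, np.hstack and np.concatenate give the same result here. A small check, assuming the 5-row toy arr1 from the earlier answer as a stand-in for the asker's 1412-row array:

```python
import numpy as np

arr1 = np.arange(15).reshape(5, 3)  # toy stand-in, not the asker's data
col = arr1[:, 0]                    # 1-D, shape (5,)

a = np.append(np.nan, col)           # with axis=None, inputs are flattened first
b = np.hstack((np.nan, col))
c = np.concatenate(([np.nan], col))  # concatenate needs at least 1-D inputs

print(a.shape, a.dtype)
print(np.array_equal(a, b, equal_nan=True))
print(np.array_equal(a, c, equal_nan=True))
```

The integer column is upcast to float in all three cases, because nan has no integer representation.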
While I can see several other answers, none of them have really addressed the problem at hand. Intuitively, your approach is okay; you're stacking nan vertically on a column array.
df1['aaa'] = np.vstack((np.nan, arr1[:,0]))
It should work, but it doesn't. The small problem here is that vstack expects a column dimension. arr1[:,0] has the shape (1412,); it doesn't have a second dimension, so vstack treats it as a single row of shape (1, 1412), which cannot be stacked under the single nan. Simply reshaping it to (1412,1) will make vstack work just fine.
df1['aaa'] = np.vstack((np.nan, arr1[:,0].reshape(-1,1)))
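The shape mismatch is easy to see on a small array. The sketch below assumes a 5-row toy array as a stand-in for the (1412, 3) arr1 and shows the failing call next to the reshaped one:

```python
import numpy as np
import pandas as pd

arr1 = np.arange(15).reshape(5, 3)  # toy stand-in for the (1412, 3) array
df1 = pd.DataFrame({'a': range(6)})

col = arr1[:, 0]
print(col.shape)                 # 1-D
print(np.atleast_2d(col).shape)  # what vstack sees: a single row

try:
    np.vstack((np.nan, col))
    failed = False
except ValueError as exc:
    failed = True
    print("vstack failed:", exc)

stacked = np.vstack((np.nan, col.reshape(-1, 1)))  # column on top of column
print(stacked.shape)

# ravel() flattens the 2-D result back to 1-D before assignment
df1['aaa'] = stacked.ravel()
print(df1)
```

Flattening with ravel() before the assignment is optional here; it only makes it explicit that a single column is being stored.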