Python: fill a column of a pandas DataFrame with a column of a numpy array

Question

I have a pandas DataFrame (1413 rows) and a numpy array (1412 rows).

type(df1)
Out[193]: pandas.core.frame.DataFrame

df1.shape
Out[194]: (1413, 15)

type(arr1)
Out[195]: numpy.ndarray

arr1.shape
Out[196]: (1412, 3)

I would like to fill a column in df1 with a column of arr1, with a NaN prepended so the lengths match, but it does not work:

df1['aaa'] = np.vstack((np.nan, arr1[:,0]))

Could anyone let me know how to do it?

Answer 2

Use numpy.hstack to prepend one value to a 1d array:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': range(6)})

arr1 = np.arange(15).reshape(5,3)
print (arr1)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]

df1['aaa'] = np.hstack((np.nan, arr1[:,0]))
print (df1)
   a   aaa
0  0   NaN
1  1   0.0
2  2   3.0
3  3   6.0
4  4   9.0
5  5  12.0
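
When the two lengths differ by more than one row, the same idea extends to padding with several NaN values. The following is a minimal sketch, not part of the original answer; `pad_front` is a hypothetical helper name, and it assumes the array should be right-aligned against the DataFrame.

```python
import numpy as np
import pandas as pd

def pad_front(values, length, fill=np.nan):
    """Hypothetical helper: left-pad a 1d array with `fill` up to `length`."""
    values = np.asarray(values, dtype=float)
    missing = length - values.shape[0]
    if missing < 0:
        raise ValueError("values is longer than the target length")
    # np.full builds the padding block, np.concatenate joins it to the data
    return np.concatenate((np.full(missing, fill), values))

df1 = pd.DataFrame({'a': range(6)})
arr1 = np.arange(15).reshape(5, 3)

# One NaN is prepended here because len(df1) - len(arr1) == 1
df1['aaa'] = pad_front(arr1[:, 0], len(df1))
print(df1)
```

For a difference of exactly one row this should give the same column as the np.hstack call above.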

Another idea, in case the DataFrame has a non-default index, is to use the Series constructor with a slice of df1.index as its index:

df1 = pd.DataFrame({'a': range(6)}, index=list('abcdef'))

arr1 = np.arange(15).reshape(5,3)
print (arr1)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]

dif = df1.shape[0] - arr1.shape[0]
df1['aaa'] = pd.Series(arr1[:,0], index=df1.index[dif:])
print (df1)
   a   aaa
a  0   NaN
b  1   0.0
c  2   3.0
d  3   6.0
e  4   9.0
f  5  12.0

To put the NaN in the last position instead:

dif = df1.shape[0] - arr1.shape[0]
df1['aaa'] = pd.Series(arr1[:,0], index=df1.index[:-dif])
print (df1)
   a   aaa
a  0   0.0
b  1   3.0
c  2   6.0
d  3   9.0
e  4  12.0
f  5   NaN
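
The Series-based variants work because pandas aligns a Series on index labels when it is assigned as a column, and labels missing from the Series become NaN. The sketch below is an illustration under that assumption, reusing the example data from this answer; the column name `bbb` is made up for the comparison.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': range(6)}, index=list('abcdef'))
arr1 = np.arange(15).reshape(5, 3)

dif = len(df1) - len(arr1)
s = pd.Series(arr1[:, 0], index=df1.index[dif:])

# Assignment aligns on labels, so even a reversed Series lands in the right rows
df1['aaa'] = s.iloc[::-1]

# Explicit equivalent: reindex to the full index, missing labels become NaN
df1['bbb'] = s.reindex(df1.index)
print(df1)
```

One caveat: if the lengths already match, `dif` is 0 and `df1.index[:-dif]` selects nothing, so slicing with `df1.index[:len(arr1)]` may be the safer way to fill from the top.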

EDIT:

arr1 = np.arange(15).reshape(5,3)
df1 = pd.DataFrame({'a': range(6)})

Selecting the column with 0 (arr1[:,0]) returns a 1d array, so numpy.hstack is needed; the stacked result has shape (6,):

a = np.hstack((np.nan, arr1[:,0]))
print (a)
[nan  0.  3.  6.  9. 12.]

print (a.shape)
(6,)

df1['aaa'] = a

Selecting with [0] (arr1[:,[0]]) returns a 2d column array, so numpy.vstack can be used; the stacked result has shape (6,1):

a1 = np.vstack((np.nan, arr1[:,[0]]))
print (a1)
[[nan]
 [ 0.]
 [ 3.]
 [ 6.]
 [ 9.]
 [12.]]

print (a1.shape)
(6, 1)


df1['aaa1'] = a1
print (df1)
   a   aaa  aaa1
0  0   NaN   NaN
1  1   0.0   0.0
2  2   3.0   3.0
3  3   6.0   6.0
4  4   9.0   9.0
5  5  12.0  12.0

Answer 3

You can do it like this. The column is added, and the first row is NaN:

df['aaa'] = pd.Series(ar1[:,0])
ea = np.full(df.shape[1], np.nan)  # ndarray.fill() returns None, so build the NaN row directly
df.loc[-1] = ea
df.index = df.index + 1
df = df.reset_index(drop=True).sort_values(by=['aaa'], na_position='first')

Here is the example DataFrame:

   c1  c2  c3
0   1   2   3
1  10  20  30

Here is the array:

[[  5  55]
 [ 50 550]]

And this is the result:

     c1    c2    c3   aaa
2   NaN   NaN   NaN   NaN
0   1.0   2.0   3.0   5.0
1  10.0  20.0  30.0  50.0

Answer 4

You can use np.append:

df1['aaa'] = np.append(np.nan, arr1[:,0])
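
As a small sketch of why this works where vstack does not: when no axis is given, np.append flattens both arguments before joining them, so a scalar and a 1d column combine into a single 1d array. The example data below is assumed for illustration, not taken from the question.

```python
import numpy as np

arr1 = np.arange(15).reshape(5, 3)

# Scalar nan plus a 1d column gives a 1d float array of length 6
col = np.append(np.nan, arr1[:, 0])
print(col.shape, col.dtype)

# Caveat: with no axis argument a 2d slice is flattened as well
flat = np.append(np.nan, arr1[:, [0]])
print(flat.shape)
```

The integer values are upcast to float because NaN cannot be stored in an integer array.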

Answer 5

While I can see several other answers, none of them really addresses the problem at hand. Intuitively, your approach is okay: you're stacking nan vertically on top of a column array.

df1['aaa'] = np.vstack((np.nan, arr1[:,0]))

It should work, but it doesn't. The small problem is that vstack needs a column dimension. arr1[:,0] has the shape (1412,); it has no second dimension, so vstack treats it as a single row of shape (1, 1412), which cannot be stacked under the (1, 1) nan. Simply reshaping it to (1412,1) makes vstack work just fine.

df1['aaa'] = np.vstack((np.nan, arr1[:,0].reshape(-1,1)))
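
To see the shape mismatch directly, here is a minimal sketch on a small stand-in array (5 rows instead of 1412, an assumption made for readability). It also flattens the stacked result with ravel(), which is an optional extra step so that a plain 1d array is assigned to the column.

```python
import numpy as np

arr1 = np.arange(15).reshape(5, 3)
col = arr1[:, 0]  # shape (5,)

# vstack promotes 1d inputs to rows: nan -> (1, 1), col -> (1, 5)
print(np.atleast_2d(np.nan).shape, np.atleast_2d(col).shape)

try:
    np.vstack((np.nan, col))
except ValueError as exc:
    print("vstack failed:", exc)

# Reshaping to a column makes the widths agree: (1, 1) on top of (5, 1)
stacked = np.vstack((np.nan, col.reshape(-1, 1)))
flat = stacked.ravel()  # shape (6,), a plain 1d array for one DataFrame column
print(stacked.shape, flat.shape)
```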

Answer 1  0  2020-03-24 08:41:59

Answer 2  0  2020-03-24 08:47:54

Answer 3  0  2020-03-24 08:48:58

Answer 4  0  2020-03-24 09:06:12

Answer 5  0  2020-03-24 11:08:55
