Joining Shorter Length Numpy Array to Pandas Dataframe

I have a pandas dataframe with 506 rows. I have a numpy array with 501 elements that are calculated from the dataframe.

I would like to join the numpy array to the dataframe, keeping the index of the dataframe and starting the index of the numpy array with the first index value of the dataframe.

The problem is that because the numpy array has a different length and no notion of row indexes, the join operation fails.
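To make the failure concrete, here is a minimal sketch using made-up data of the same sizes (the column names `x` and `y` are placeholders, not from my real data). Assigning the shorter array directly raises a `ValueError`, because pandas has no labels to align a bare array on:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real data: 506 rows vs. 501 values
df = pd.DataFrame({"x": np.arange(506.0)})
values = np.arange(501.0)

try:
    # A plain ndarray carries no index, so pandas requires matching lengths
    df["y"] = values
    failed = False
except ValueError as exc:
    failed = True
    print(f"assignment failed: {exc}")
```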

Is there a clever way to solve this?

I'd construct a Series from the NumPy array, then construct a new Series from it while passing the target df's index. This effectively reindexes the existing Series, introducing NaN values where there are no row values, so that it aligns correctly against the target df:

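In [51]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
s = pd.Series(np.arange(3))          # shorter than df; default index 0..2
s1 = pd.Series(s, index=df.index)    # reindex to df's index; missing rows become NaN
s1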

Out[51]:
0    0.0
1    1.0
2    2.0
3    NaN
4    NaN
dtype: float64

In [53]:
df['d'] = s1
df

Out[53]:
          a         b         c    d
0  0.990381  2.583867  0.018435  0.0
1  0.867695 -0.958220 -0.351783  1.0
2  0.476210 -1.015887  1.285303  2.0
3 -0.198863 -2.514740  1.228772  NaN
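
Note that the example above lines up because both `s` and `df` use the default 0-based index, so the labels happen to match. If the dataframe's index does not start at 0, one possible variation (a sketch, with made-up data and placeholder column names `x` and `y`) is to label the array with the first `len(values)` index labels of the dataframe and let assignment align on those labels:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 506 rows with a non-default index starting at 100
df = pd.DataFrame({"x": np.arange(506.0)}, index=np.arange(100, 606))
values = np.arange(501.0)

# Give the array the first len(values) labels of df's index
s = pd.Series(values, index=df.index[:len(values)])

# Column assignment aligns on index labels; the 5 unmatched rows become NaN
df["y"] = s
```

This keeps the dataframe's index untouched and starts the array at the dataframe's first index value; the trailing rows without a value are filled with NaN.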
