
Joining a Shorter NumPy Array to a Pandas DataFrame

I have a pandas dataframe with 506 rows. I have a numpy array with 501 elements that are calculated from the dataframe.

I would like to join the numpy array to the dataframe, keeping the index of the dataframe and starting the index of the numpy array with the first index value of the dataframe.

The problem is that because the numpy array has a different length and no notion of row indexes, the join operation fails.

Is there a clever way to solve this?
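For concreteness, here is a minimal sketch of the situation. The column name, the array calculation (a simple moving-average via convolution), and the lengths are illustrative stand-ins for whatever the real dataframe and derived array look like:

```python
import numpy as np
import pandas as pd

# Hypothetical 506-row dataframe and a 501-element array derived from it
# (here a 6-point moving average as a stand-in calculation).
df = pd.DataFrame({'price': np.random.randn(506)})
arr = np.convolve(df['price'], np.ones(6) / 6, mode='valid')  # length 506 - 6 + 1 = 501

# A naive assignment fails because the lengths differ:
# df['smoothed'] = arr  # ValueError: length of values does not match length of index
```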

I'd construct a Series from the np array, then construct a new Series from it but pass the target df's index. This effectively reindexes the existing Series, introducing NaN values where there are no row values, so it aligns correctly against the target df:

In [51]:    
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
s = pd.Series(np.arange(3))
s1 = pd.Series(s, index=df.index)
s1

Out[51]:
0    0.0
1    1.0
2    2.0
3    NaN
4    NaN
dtype: float64

In [53]:
df['d'] = s1
df

Out[53]:
          a         b         c    d
0  0.990381  2.583867  0.018435  0.0
1  0.867695 -0.958220 -0.351783  1.0
2  0.476210 -1.015887  1.285303  2.0
3 -0.198863 -2.514740  1.228772  NaN
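Applied to the original 506-row/501-element case, the same idea works with the array directly. A sketch, assuming the array's values should line up with the first 501 row labels of the dataframe (names continue the hypothetical example above):

```python
# Align arr with the first len(arr) labels of df.index; the remaining
# rows receive NaN, mirroring the reindexing behaviour shown above.
df['smoothed'] = pd.Series(arr, index=df.index[:len(arr)])
```

This also works when the dataframe has a non-default index, since the assignment aligns on labels rather than on position.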
