I have a series of numpy arrays and would like to create a dataframe column from them. Specifically, I have a dataframe that looks like this:
In [298]: df = pd.DataFrame({'name': ['A','A','B','B'], 'value': [1,2,3,4]})
In [299]: df
Out[299]:
  name  value
0    A      1
1    A      2
2    B      3
3    B      4
I now calculate the cumulative integral per 'name' (with scipy.integrate imported as integrate) like this:
In [300]: g = df.groupby('name')
In [301]: r = g.apply(lambda x: np.insert(integrate.cumtrapz(x.value), 0, [0]))
In [302]: r
Out[302]:
name
A    [0.0, 1.5]
B    [0.0, 3.5]
dtype: object
The type of r and elements of r are:
In [303]: type(r)
Out[303]: pandas.core.series.Series
In [304]: type(r[0])
Out[304]: numpy.ndarray
I would like to add this result to the original dataframe, achieving:
In [308]: df['cumint'] = np.append(r[0], r[1])
In [309]: df
Out[309]:
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
What is the best way of achieving this result?
Your series contains numpy arrays, so you can concatenate its elements into one long numpy array and assign that array to the new column:
df['cumint'] = np.concatenate(r, axis=0)
Result:
>>> print(df)
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
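For reference, here is a self-contained sketch of this approach end to end. It assumes a recent SciPy, where cumtrapz is named cumulative_trapezoid; its initial=0 argument supplies the leading zero, replacing the np.insert step:

```python
import numpy as np
import pandas as pd
from scipy.integrate import cumulative_trapezoid  # cumtrapz in older SciPy

df = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1, 2, 3, 4]})

# one ndarray per group; initial=0 prepends the leading zero
r = df.groupby('name')['value'].apply(
    lambda x: cumulative_trapezoid(x, initial=0))

# r is an object Series holding ndarrays -- flatten them into one column
df['cumint'] = np.concatenate(r.to_numpy(), axis=0)
print(df)
```

Note that this relies on the groups coming back in the same order as the original rows, which holds here because each group's rows are contiguous and groupby sorts keys by default.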
You can use transform instead of apply here to get the result back as a series aligned with the original dataframe:
df['cumint'] = (df.groupby('name')['value']
                  .transform(lambda x: np.insert(integrate.cumtrapz(x), 0, [0])))
# or, reusing the existing groupby object:
# df['cumint'] = g['value'].transform(lambda x: np.insert(integrate.cumtrapz(x), 0, [0]))
print(df)
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
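One caveat: cumtrapz was deprecated in SciPy 1.6 and has since been removed (as of SciPy 1.14). With its successor, cumulative_trapezoid, the initial=0 argument prepends the leading zero for you, so the np.insert wrapper is no longer needed:

```python
import pandas as pd
from scipy.integrate import cumulative_trapezoid

df = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1, 2, 3, 4]})

# initial=0 replaces the np.insert(..., 0, [0]) step
df['cumint'] = (df.groupby('name')['value']
                  .transform(lambda x: cumulative_trapezoid(x, initial=0)))
```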