Goal: I have two pandas DataFrames. On each I want to apply a function that gives me some summarizing statistic for each column (like sum, count and so on). All this is embedded in a for-each loop. E.g.:
DataFrame1
Id V1 V2
0 3 2
1 2 1
DataFrame2
Id T1 T2
0 4 2
1 5 2
The result (on a count task) is supposed to be:
DataFrameGoal
Id V1 V2 T1 T2
0 2 2 2 2
My code works fine so far, but the result I get is:
DataFrameGoal
Id V1 V2 T1 T2
0 2 2 NaN NaN
1 NaN NaN 2 2
My code:
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'a': np.random.randn(6),
                    'b': np.random.randn(6),
                    'c': np.random.randn(6)})
df2 = pd.DataFrame({'d': np.random.randn(6),
                    'e': np.random.randn(6),
                    'f': np.random.randn(6)})

def mysum(col):
    return col.count()

lst = []
lst.append(df1)
lst.append(df2)

myDf = pd.DataFrame()
for el in lst:
    test = el.apply(lambda cols: mysum(cols))
    myDf = myDf.append(test, ignore_index=True)
print(myDf)
Can anyone help me get the result I am aiming for? I also tried .assign, but this did not solve my problem either. PS: I know that simple things like count or sum can be accomplished quite easily, but I have a more complicated task and this is just an easy example.
Try this
pd.concat([df1,df2], axis=1)
And then apply whatever function you want to.
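A minimal sketch of this suggestion, assuming the column names of the two frames do not overlap and reusing the mysum function from the question. The seeded generator and the names combined, stats and result are illustrative choices, not part of the original post:

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape as in the question (seed is arbitrary).
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((6, 3)), columns=["a", "b", "c"])
df2 = pd.DataFrame(rng.standard_normal((6, 3)), columns=["d", "e", "f"])

def mysum(col):
    # Stand-in for any function that reduces a column to one value.
    return col.count()

# Put the columns of both frames side by side, then reduce each column once.
combined = pd.concat([df1, df2], axis=1)
stats = combined.apply(mysum)   # Series indexed by column name

# Turn the Series into a single-row DataFrame.
result = stats.to_frame().T
print(result)
```

Because the function is applied only once to the combined frame, every statistic ends up in the same row instead of one row per input frame.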
It's hard to say whether the problem comes from concatenating the DataFrames or from mysum(). But you can try:
myDf = (pd.concat([el.apply(lambda cols: mysum(cols))
                   for el in [df1, df2]])
        .to_frame().T)
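The same idea as a self-contained sketch, under the assumption that each frame is reduced separately and the per-frame results are joined afterwards. The frames deliberately have different row counts here to show that this variant does not require aligned indexes; the names per_frame and my_df are illustrative. It also avoids DataFrame.append, which was removed in pandas 2.0:

```python
import numpy as np
import pandas as pd

# Illustrative data; df2 is shorter than df1 on purpose.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((6, 3)), columns=["a", "b", "c"])
df2 = pd.DataFrame(rng.standard_normal((4, 3)), columns=["d", "e", "f"])

def mysum(col):
    # Stand-in for any function that reduces a column to one value.
    return col.count()

# Reduce each frame on its own: one Series of statistics per frame.
per_frame = [frame.apply(mysum) for frame in (df1, df2)]

# Join the Series end to end, then transpose into a single row.
my_df = pd.concat(per_frame).to_frame().T
print(my_df)
```

Concatenating the Series along the default axis stacks their labels into one long index, so the transpose yields one row with a column per original column rather than the NaN-padded rows from the loop in the question.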