I want to apply a function to each row of a DataFrame and get a Series back. It works when the function returns a number, but not when it returns a dict.
In [31]: d
Out[31]:
                a         b
bar one -0.185677 -0.554356
    two -0.457943 -1.094836
baz one -0.731338 -0.027821
    two -1.061098  0.258291
foo one -1.392160  2.287989
    two  2.010208 -1.350581
qux one -0.792229 -0.323397
    two -1.063265  0.048641
In [32]: d.apply(lambda x: x.a + x.b, axis=1)
Out[32]:
bar  one   -0.740034
     two   -1.552779
baz  one   -0.759159
     two   -0.802806
foo  one    0.895829
     two    0.659626
qux  one   -1.115627
     two   -1.014624
dtype: float64
In [33]: d.apply(lambda x: {"boo": x.a}, axis=1)  # I want a Series of dicts
Out[33]:
          a   b
bar one NaN NaN
    two NaN NaN
baz one NaN NaN
    two NaN NaN
foo one NaN NaN
    two NaN NaN
qux one NaN NaN
    two NaN NaN
It does not seem to matter whether the reduce argument of apply is None, True, or False. Pandas seems to be too smart here: it treats the returned dict like a row and looks up the column labels "a" and "b" as keys in {"boo": x.a}; since neither key exists, every cell comes back as NaN.
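If a newer pandas is available (0.23+), apply's result_type parameter replaces the old reduce argument, and result_type="reduce" tells pandas to keep each returned object as a single value instead of expanding it into columns. A minimal sketch with random data (the DataFrame here is an illustration, not the original d):

```python
import numpy as np
import pandas as pd

# stand-in for the DataFrame in the question
df = pd.DataFrame(np.random.randn(4, 2), columns=["a", "b"])

# result_type="reduce" asks apply to return a Series even when the
# function's return value is dict-like (pandas >= 0.23)
result = df.apply(lambda row: {"boo": row.a}, axis=1, result_type="reduce")
print(result)
```

Each element of result is a dict such as {'boo': ...}, and the Series dtype is object.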
One answer is to not try to perform the calculation and the data-type translation in one step: calculate a Series of plain values first, then restructure each value into a dictionary.
import pandas
import random
mi1 = ['bar','baz']
mi2 = ['one', 'two']
data = [[s,t, random.random(), random.random()] for s in mi1 for t in mi2]
df = pandas.DataFrame(data, columns=['i1', 'i2', 'a', 'b'])
df.set_index(['i1', 'i2'], inplace=True)
print(df)
Out
               a         b
i1  i2
bar one 0.438596 0.734058
    two 0.183522 0.272922
baz one 0.581694 0.522173
    two 0.776081 0.941120
In
# calculate the series
# the data type is still floats
sr = df.apply(lambda row: row.a + row.b, axis=1)
# construct the series of objects
result = sr.apply(lambda x: {"boo": x})
print(result)
Out
i1   i2
bar  one    {'boo': 1.17265404527}
     two    {'boo': 0.456443829892}
baz  one    {'boo': 1.10386719117}
     two    {'boo': 1.71720149706}
dtype: object
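If apply is not required at all, the same result can be built directly: compute the dicts in an ordinary comprehension and attach the DataFrame's index so everything still lines up. A sketch reusing the same random setup as above:

```python
import random
import pandas as pd

mi1 = ["bar", "baz"]
mi2 = ["one", "two"]
data = [[s, t, random.random(), random.random()] for s in mi1 for t in mi2]
df = pd.DataFrame(data, columns=["i1", "i2", "a", "b"]).set_index(["i1", "i2"])

# build the dicts with a plain comprehension, then hand the original
# MultiIndex to the Series constructor so the result aligns with df
result = pd.Series([{"boo": a + b} for a, b in zip(df["a"], df["b"])],
                   index=df.index)
print(result)
```

This skips the intermediate float Series entirely, at the cost of losing apply's row-wise access to other columns by name.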