
Pandas agg function behaves differently depending on how it is called

I don't understand the behaviour of agg. The four calls below give two different results, and neither is the result I expect.

pd.DataFrame({'d': [{'a': 1}, {'b': 2}]}).agg(list)
Out[372]: 
          d
0  {'a': 1}
1  {'b': 2}
pd.DataFrame({'d': [{'a': 1}, {'b': 2}]}).agg(lambda col: list(col))
Out[373]: 
          d
0  {'a': 1}
1  {'b': 2}
pd.DataFrame({'d': [{'a': 1}, {'b': 2}]}).agg({'d': list})
Out[374]: 
     d
0  [a]
1  [b]
pd.DataFrame({'d': [{'a': 1}, {'b': 2}]}).agg({'d': lambda col: list(col)})
Out[375]: 
     d
0  [a]
1  [b]

Expected result is:

pd.DataFrame({'d': [list(pd.DataFrame({'d': [{'a': 1}, {'b': 2}]}).d)]})
Out[379]: 
                      d
0  [{'a': 1}, {'b': 2}]

You might need to build another DataFrame:

>>> df = pd.DataFrame({'d': [{'a': 1}, {'b': 2}]})
>>> pd.DataFrame([df.values], columns=df.columns)
                          d
0  [[{'a': 1}], [{'b': 2}]]

agg can't do that: it applies operation X to column Y, but it doesn't collect a column's values into a single cell.
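One possible reading of the [a] / [b] output in the question is that the function ended up being called on each cell rather than on the whole column, and list applied to a dict returns only its keys. Whether agg takes that path may depend on the pandas version, so the following is only a plain-Python sketch of the difference, with no pandas involved; the names cells, per_cell and whole_column are made up for illustration:

```python
# Each cell of column 'd' is a dict; iterating a dict yields its keys.
cells = [{'a': 1}, {'b': 2}]

# Calling list() on each cell separately keeps only the keys.
per_cell = [list(cell) for cell in cells]

# Calling list() once on the whole sequence keeps the dicts intact.
whole_column = list(cells)

print(per_cell)      # [['a'], ['b']]
print(whole_column)  # [{'a': 1}, {'b': 2}]
```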

For your other example, I would do:

>>> pd.DataFrame(df.apply(lambda x: [df[x.name].values])).T.apply(lambda x: x.str[0])
                      d                     e
0  [{'a': 1}, {'b': 2}]  [{'a': 1}, {'b': 2}]
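If the goal is just one row whose cells hold each column's values as a list, a more direct route might be to skip agg and apply altogether and build the frame from a dict. This is a minimal sketch rather than the answer's method; it assumes the single-column df from the question, and the name collapsed is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'d': [{'a': 1}, {'b': 2}]})

# Wrap each column's values in a one-element list so that pandas
# builds a single row whose cell holds the whole list.
collapsed = pd.DataFrame({col: [df[col].tolist()] for col in df.columns})

print(collapsed.shape)        # (1, 1)
print(collapsed.loc[0, 'd'])  # [{'a': 1}, {'b': 2}]
```

The comprehension runs over df.columns, so the same line should carry over to a frame with more columns, such as the d and e columns in the output above.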
