
Groupby multiple columns of different dtypes and aggregate to list

The input df is as shown

Name  num1    num2   key  filter1  filter2  value
TOM    1.1     2.1    a    True     False   1.0
TOM    1.1     2.1    b    True     False   2.0
TOM    1.1     2.1    c    True     False   3.0
TOM    1.1     2.1    d    True     False   4.0
SAM    1.2     2.1    a    False    True    5.0
SAM    1.2     2.1    b    False    True    6.0

The corresponding dtypes of the df

Name       object
num1      float64
num2      float64
key        object
filter1      bool
filter2      bool
value     float64
dtype: object

I ran the following groupby to aggregate the remaining columns into lists, and it raises the exception shown below.

df2 = df.groupby(['Name','num1','num2'],as_index=False)['key','filter1','filter2','value'].agg(list)

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 4036, in aggregate
    return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 3468, in aggregate
    result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\base.py", line 634, in _aggregate
    _axis=_axis), None
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\base.py", line 691, in _aggregate_multiple_funcs
    raise ValueError("no results")
ValueError: no results

Desired output:

   Name  num1  num2           key                   filter1                       filter2                 value
0   TOM   1.1   2.1  [a, b, c, d]  [True, True, True, True]  [False, False, False, False]  [1.0, 2.0, 3.0, 4.0]
1   SAM   1.2   2.1        [a, b]            [False, False]                  [True, True]            [5.0, 6.0]

I have also tried the following, which fails with "Function does not reduce":

df3 = df.groupby(['Name','num1','num2'], as_index=False)['key','filter1','filter2','value'].agg(lambda x: list(x))

Please let me know what I am doing wrong and how to fix it.

Hope this helps.

df3 = df.groupby(['Name','num1','num2'], as_index=False).agg(pd.Series.tolist)
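
For reference, here is a minimal self-contained sketch of the same approach, rebuilt from the sample data in the question. It assumes a reasonably recent pandas version. The names group_cols, value_cols and result are only illustrative. Two things differ from the one-liner above and are my own assumptions rather than part of the original answer: the value columns are selected with a list (double brackets), since the tuple-style selection ['key','filter1',...] used in the question is not accepted by newer pandas versions, and sort=False is passed so that the groups stay in order of first appearance (TOM before SAM), as in the desired output.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["TOM", "TOM", "TOM", "TOM", "SAM", "SAM"],
    "num1": [1.1, 1.1, 1.1, 1.1, 1.2, 1.2],
    "num2": [2.1, 2.1, 2.1, 2.1, 2.1, 2.1],
    "key": ["a", "b", "c", "d", "a", "b"],
    "filter1": [True, True, True, True, False, False],
    "filter2": [False, False, False, False, True, True],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Illustrative names (not from the original post)
group_cols = ["Name", "num1", "num2"]
value_cols = ["key", "filter1", "filter2", "value"]

# Select the value columns with a list, then collect each group's values
# into a Python list. sort=False keeps groups in order of first appearance.
result = (
    df.groupby(group_cols, as_index=False, sort=False)[value_cols]
      .agg(pd.Series.tolist)
)

print(result)
```

On recent pandas versions, .agg(list) should behave the same way here; pd.Series.tolist is kept only to match the answer above.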
