Sum DataFrame Columns without Dropping Columns

Question

I have this dataframe:

>>> import pandas as pd
>>> d = pd.DataFrame(
...  { "a": [1,1]
...  , "b": [2,2]
...  , "c": [4,5]
...  , "d": [pd.Timedelta(hours=6),pd.Timedelta(hours=7)]
...  , "e": [12.1,13.3]
...  })
>>> d = d.set_index(["a","b","c"])
>>> d
                    d     e
a b c
1 2 4 0 days 06:00:00  12.1
    5 0 days 07:00:00  13.3
>>> d.dtypes
d    timedelta64[ns]
e            float64
dtype: object

I want the sum of each column within each (a, b) group, and I need one version with skipna=True and one with skipna=False. I expect this,

>>> d.sum(level=["a","b"])
                  d     e
a b
1 2 0 days 13:00:00  25.4

but I get this.

>>> d.sum(level=["a","b"])
        e
a b
1 2  25.4

Column d, the timedelta column, has been dropped.

More info:

>>> pd.__version__
'1.2.3'
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=8, micro=8, releaselevel='final', serial=0)

Answer 1

Workaround #1: `groupby` / `agg`

d.groupby(level=['a', 'b']).agg({'d': 'sum', 'e': 'sum'})

                  d     e
a b                      
1 2 0 days 13:00:00  25.4

Workaround #2: `apply`

d.apply(pd.Series.sum, level=['a', 'b'])

                  d     e
a b                      
1 2 0 days 13:00:00  25.4

Note that you can pass other parameters, such as skipna, as well:

d.apply(pd.Series.sum, level=['a', 'b'], skipna=True)

                  d     e
a b                      
1 2 0 days 13:00:00  25.4
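Since the question asks for both a skipna=True and a skipna=False result, here is a minimal sketch of one way to get both through groupby. The helper name grouped_sum and the sample frame with_gap are made up for illustration, and aggregating column by column with a lambda is an assumption about a convenient approach, not something taken from the workarounds above.

```python
import numpy as np
import pandas as pd


def grouped_sum(frame, levels, skipna=True):
    """Sum every column per group of index levels (hypothetical helper)."""
    grouped = frame.groupby(level=levels)
    # Aggregate column by column so each lambda receives a plain Series
    # and Series.sum can honour the skipna flag.
    return pd.DataFrame(
        {
            col: grouped[col].agg(lambda s: s.sum(skipna=skipna))
            for col in frame.columns
        }
    )


# Illustrative data: same shape as the question, but with one missing float.
with_gap = pd.DataFrame(
    {
        "a": [1, 1],
        "b": [2, 2],
        "c": [4, 5],
        "d": [pd.Timedelta(hours=6), pd.Timedelta(hours=7)],
        "e": [12.1, np.nan],
    }
).set_index(["a", "b", "c"])

lenient = grouped_sum(with_gap, ["a", "b"], skipna=True)   # missing value ignored
strict = grouped_sum(with_gap, ["a", "b"], skipna=False)   # missing value propagates

print(lenient)
print(strict)
```

With skipna=False the missing value in e should propagate to the group total, while the timedelta column d is summed in both versions.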

Workaround #3: `groupby` / `numeric_only=False`

Per @QuanhHoang

d.groupby(['a', 'b']).sum(numeric_only=False)

                  d     e
a b                      
1 2 0 days 13:00:00  25.4
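For anyone who wants to try this without retyping the question, here is a self-contained sketch of the same groupby call. It groups with level=["a", "b"] rather than by name, which is a small substitution of mine. Worth knowing: the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0, so the groupby form is the one that carries over to newer versions.

```python
import pandas as pd

# Rebuild the frame from the question.
d = pd.DataFrame(
    {
        "a": [1, 1],
        "b": [2, 2],
        "c": [4, 5],
        "d": [pd.Timedelta(hours=6), pd.Timedelta(hours=7)],
        "e": [12.1, 13.3],
    }
).set_index(["a", "b", "c"])

# Group on the two outer index levels; numeric_only=False asks pandas
# to keep columns it does not treat as numeric, such as timedelta64.
result = d.groupby(level=["a", "b"]).sum(numeric_only=False)
print(result)
```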

Unfortunately, d.sum(level=['a', 'b'], numeric_only=False) still doesn't work.

I find that strange.

My guess is that pandas decides the timedelta column is not a numeric type and therefore leaves it out of the sum.

However, NumPy does count the dtype as numeric:

np.issubdtype(d.dtypes.d, np.number)

True
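One possible piece of the puzzle, offered as a guess rather than a diagnosis: pandas has its own notion of "numeric" that differs from NumPy's. The sketch below compares the NumPy check with pandas.api.types.is_numeric_dtype, which reports False for timedelta64. Whether that is the check behind the dropped column in 1.2.3 is an assumption I have not verified against the pandas source.

```python
import numpy as np
from pandas.api.types import is_numeric_dtype

td_dtype = np.dtype("timedelta64[ns]")

# NumPy's type hierarchy places timedelta64 under np.number.
numpy_says_numeric = np.issubdtype(td_dtype, np.number)

# pandas' own helper does not count timedelta64 as numeric.
pandas_says_numeric = is_numeric_dtype(td_dtype)

print("numpy:", numpy_says_numeric, "pandas:", pandas_says_numeric)
```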

So I don't know what is going on, and I haven't looked any deeper.

solution1
2 ACCEPTED 2021-03-24 03:26:17
