
Stop Pandas from rotating results from groupby-apply when there is one group

I have some code that first selects data based on a certain criterion and then does a groupby-apply on a Pandas DataFrame. Occasionally, only one group matches the criterion. In that case, Pandas returns a row vector rather than a column vector. Example below:

In [50]: x = pd.DataFrame([(round(i/2, 0), i, i) for i in range(0, 10)], columns=['a', 'b', 'c'])

In [51]: x
Out[51]:
     a  b  c
0  0.0  0  0
1  0.0  1  1
2  1.0  2  2
3  2.0  3  3
4  2.0  4  4
5  2.0  5  5
6  3.0  6  6
7  4.0  7  7
8  4.0  8  8
9  4.0  9  9

In [52]: y = x.loc[x.a == 0.0].groupby('a').apply(lambda x: x.b / x.c)

In [53]: y
Out[53]:
      0    1
a
0.0 NaN  1.0

In the example above, y is a row vector of type pandas.DataFrame. If the .loc selection contains two or more groups, the result is a column vector instead (a pandas.Series with a MultiIndex):

In [54]: y = x.loc[x.a <= 1.0].groupby('a').apply(lambda x: x.b / x.c)

In [55]: y
Out[55]:
a
0.0  0    NaN
     1    1.0
1.0  2    1.0
dtype: float64

Any idea how I can make the two behaviours consistent? Ultimately, the column vector is what I want.

Thanks

There's no way to do this in one step, unfortunately. You can, however, do it in two steps by checking ngroups and reshaping the result accordingly.

g = x.loc[...].groupby('a')
y = g.apply(lambda x: x.b / x.c)

if g.ngroups == 1:
    y = y.T
