How can I use pandas groupby.count() for a condition

I have a dataframe df with two columns, Ticker and Trade Results.

I want to create a new dataframe with three columns: Ticker, Number of Trades, and Profitable Trades.

I used groupby and count to get the Number of Trades column, and this works fine.

My problem is the third column, Profitable Trades, which should count the trades where Trade Results is > 0. I have not found a way to apply this condition.

Creating DF (works fine)

import pandas as pd

df = pd.DataFrame(
    {'Ticker': ['BTC','ETH','LTC','BTC','ETH',
              'LTC','BTC','ETH','LTC'],
     'Trade Results': [5,10,5,-5,-10,-5,5,10,5]}
)

Ticker  Trade Results
BTC                 5
ETH                10
LTC                 5
BTC                -5
ETH               -10
LTC                -5
BTC                 5
ETH                10
LTC                 5

Grouping Tickers and Getting Count (works fine)

df_Grouped = df.groupby(['Ticker']).count()

Ticker  Count
BTC         3
ETH         3
LTC         3

Conditional Column (my problem)

This is the part I haven't been able to figure out. My latest attempt is below, but it returns NaN for the profitable column.

df_Grouped['Profitable'] = df.groupby(['Trade Result'] > 0).count()

Desired Output

Ticker  Count  Profitable
BTC         3           2
ETH         3           2
LTC         3           2

You can do it like this:

df_Grouped = df.groupby(['Ticker']).agg({'Trade Results': [('Count', 'count'), ('Profitable', lambda x: len(x[x>0]))]}).reset_index()

Output:

                 Count Profitable
0  BTC             3          2
1  ETH             3          2
2  LTC             3          2
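
Passing a dict of (name, function) tuples to agg should give a two-level column header, with Trade Results above Count and Profitable. If flat column names are preferred, a minimal sketch is to select the column first and use keyword-style named aggregation. This is an alternative spelling, not the answer's own code, and the name summary is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Ticker": ["BTC", "ETH", "LTC"] * 3,
    "Trade Results": [5, 10, 5, -5, -10, -5, 5, 10, 5],
})

# Select the column first so the result has flat column names
# (Count, Profitable) instead of a two-level header.
summary = (
    df.groupby("Ticker")["Trade Results"]
    .agg(Count="count", Profitable=lambda x: (x > 0).sum())
    .reset_index()
)
print(summary)
```

Here (x > 0).sum() counts the True values in each group, which is the same number that len(x[x>0]) returns.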

You can always pre-filter; however, I like @David M's answer.

df_Grouped['Profitable'] = df[df['Trade Results'] > 0].groupby(['Ticker']).count()
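
One thing to watch with pre-filtering: a ticker with no profitable trades disappears from the filtered frame, so the assignment leaves NaN for it. A hedged sketch of one way to handle that is below. The XRP rows are hypothetical data added only to show the case, and the name profitable is illustrative:

```python
import pandas as pd

# Hypothetical data: XRP has no profitable trades.
df = pd.DataFrame({
    "Ticker": ["BTC", "BTC", "BTC", "XRP", "XRP"],
    "Trade Results": [5, -5, 5, -1, -2],
})

df_Grouped = df.groupby("Ticker").count().rename(columns={"Trade Results": "Count"})

# Count only the rows that pass the filter, as a Series indexed by Ticker.
profitable = df[df["Trade Results"] > 0].groupby("Ticker")["Trade Results"].count()

# Align on the full list of tickers; tickers missing from the
# filtered data get 0 instead of NaN.
df_Grouped["Profitable"] = profitable.reindex(df_Grouped.index, fill_value=0)
print(df_Grouped)
```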

You could create a boolean column marking the rows that are greater than 0 before aggregating in the groupby:

(
    df.assign(gt_0=df["Trade Results"].gt(0))
    .groupby("Ticker")
    .agg(Count=("gt_0", "size"), Profitable=("gt_0", "sum"))
)

        Count  Profitable
Ticker
BTC         3           2
ETH         3           2
LTC         3           2
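
The reason this works is that summing a boolean column counts its True values, while size counts every row. A small sketch using BTC's three trades from the question (the names results and mask are illustrative):

```python
import pandas as pd

results = pd.Series([5, -5, 5])  # BTC's three trades from the question

mask = results.gt(0)    # element-wise "greater than 0": True, False, True
print(mask.size)        # number of rows, whatever their value
print(int(mask.sum()))  # True counts as 1, so this is the number of profitable trades
```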

You can also create a new df with groupby and then merge it.

df_Grouped = df.groupby(['Ticker']).count().reset_index()
df_new = df[df['Trade Results'] > 0].groupby(['Ticker']).count().reset_index()
# how='left' keeps tickers that have no profitable trades; the suffixes label the two count columns
print(pd.merge(df_Grouped, df_new, on='Ticker', how='left', suffixes=('_Count', '_Profitable')))
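
To get the column names from the desired output out of the merge approach, one option is to name each count before merging. This is a sketch under that assumption rather than the answer's exact code, and the names counts, profitable and merged are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Ticker": ["BTC", "ETH", "LTC"] * 3,
    "Trade Results": [5, 10, 5, -5, -10, -5, 5, 10, 5],
})

# Total trades per ticker, as a two-column frame: Ticker, Count.
counts = (
    df.groupby("Ticker")["Trade Results"].count().rename("Count").reset_index()
)

# Profitable trades per ticker, as a two-column frame: Ticker, Profitable.
profitable = (
    df[df["Trade Results"] > 0]
    .groupby("Ticker")["Trade Results"].count()
    .rename("Profitable")
    .reset_index()
)

# A left merge keeps every ticker; tickers with no profitable trades
# would come back as NaN, so fill those with 0.
merged = pd.merge(counts, profitable, on="Ticker", how="left")
merged["Profitable"] = merged["Profitable"].fillna(0).astype(int)
print(merged)
```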
