How do I find the count of a particular column [Model], based on another column [SoldDate] using pandas?

I have a dataframe with 3 columns: SoldDate, Model and TotalSoldCount. How do I create a new column, 'CountSoldbyMonth', that gives the count of each model sold per month? Sample data describing the problem is given below. 'CountSoldbyMonth' should never exceed 'TotalSoldCount'.

I am new to Python.

Date        Model  TotalSoldCount
Jan 19        A          4
Jan 19        A          4
Jan 19        A          4
Jan 19        B          6
Jan 19        C          2
Jan 19        C          2
Feb 19        A          4
Feb 19        B          6
Feb 19        B          6
Feb 19        B          6
Mar 19        B          6
Mar 19        B          6

The new df should look like this:

Date      Model     TotalSoldCount     CountSoldbyMonth
Jan 19     A               4                    3
Jan 19     A               4                    3
Jan 19     A               4                    3
Jan 19     B               6                    1
Jan 19     C               2                    2
Jan 19     C               2                    2
Feb 19     A               4                    1
Feb 19     B               6                    3
Feb 19     B               6                    3
Feb 19     B               6                    3
Mar 19     B               6                    2
Mar 19     B               6                    2

I tried doing

df['CountSoldbyMonth'] = df.groupby(['date','model']).totalsoldcount.transform('sum')

but it generates a different value.

It's easier to help if you give code that lets people experiment. In this case, I'd think taking your dataframe (df) and doing the following should work:

df['CountSoldbyMonth'] = df.groupby(['Date','Model'])['TotalSoldCount'].transform('sum')

Suppose you have this data set:

      date model  totalsoldcount
0   Jan 19     A             110
1   Jan 19     A             110
2   Jan 19     A             110
3   Jan 19     B              50
4   Jan 19     C              70
5   Jan 19     C              70
6   Feb 19     A             110
7   Feb 19     B              50
8   Feb 19     B              50
9   Feb 19     B              50
10  Mar 19     B              50
11  Mar 19     B              50

And you want to define a new column, countsoldbymonth. You can group by the date and model columns, sum totalsoldcount with a transform, and assign the result to the new column:

s['countsoldbymonth'] = s.groupby([
    'date',
    'model'
]).totalsoldcount.transform('sum')

print(s)

      date model  totalsoldcount  countsoldbymonth
0   Jan 19     A             110               330
1   Jan 19     A             110               330
2   Jan 19     A             110               330
3   Jan 19     B              50                50
4   Jan 19     C              70               140
5   Jan 19     C              70               140
6   Feb 19     A             110               110
7   Feb 19     B              50               150
8   Feb 19     B              50               150
9   Feb 19     B              50               150
10  Mar 19     B              50               100
11  Mar 19     B              50               100
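
To reproduce the output above, here is a minimal sketch that builds the sample frame by hand. The variable name s follows the answer; the construction itself is an assumption, since the post does not show how the data was loaded.

```python
import pandas as pd

# Rebuild the sample data shown above. How the original frame was
# loaded is not given, so this construction is an assumption.
s = pd.DataFrame({
    "date": ["Jan 19"] * 6 + ["Feb 19"] * 4 + ["Mar 19"] * 2,
    "model": list("AAABCC") + list("ABBB") + list("BB"),
    "totalsoldcount": [110, 110, 110, 50, 70, 70, 110, 50, 50, 50, 50, 50],
})

# transform('sum') returns one value per original row (aligned to the
# original index), so it can be assigned straight back as a column.
s["countsoldbymonth"] = (
    s.groupby(["date", "model"])["totalsoldcount"].transform("sum")
)

print(s)
```

Because transform keeps the original row count, every row in a (date, model) group carries the same group total.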

Or, if you just want to see the sums without creating a new column, you can use sum instead of transform, like this:

print(s.groupby([
    'date',
    'model'
]).totalsoldcount.sum())

date    model
Feb 19  A        110
        B        150
Jan 19  A        330
        B         50
        C        140
Mar 19  B        100
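
Note that the aggregated result is sorted by the group keys as strings, which is why Feb 19 is listed before Jan 19. If the order of first appearance matters, one option (a sketch, not part of the original answer) is to pass sort=False and flatten the result with reset_index. The column name monthly_total is a hypothetical name chosen for illustration.

```python
import pandas as pd

# Same sample data as above, rebuilt by hand (an assumption about
# how the frame was created).
s = pd.DataFrame({
    "date": ["Jan 19"] * 6 + ["Feb 19"] * 4 + ["Mar 19"] * 2,
    "model": list("AAABCC") + list("ABBB") + list("BB"),
    "totalsoldcount": [110, 110, 110, 50, 70, 70, 110, 50, 50, 50, 50, 50],
})

# sort=False keeps groups in order of first appearance instead of
# sorting the keys; reset_index turns the MultiIndex back into columns.
monthly = (
    s.groupby(["date", "model"], sort=False)["totalsoldcount"]
    .sum()
    .reset_index(name="monthly_total")  # hypothetical column name
)

print(monthly)
```

This gives one row per (date, model) pair, six rows here, rather than one per original row.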

Edit

If you just want to know how many sales were made in each month, you can use the same groupby, but with count instead of sum:

df['CountSoldByMonth'] = df.groupby([
    'Date',
    'Model'
]).TotalSoldCount.transform('count')

print(df)

      Date Model  TotalSoldCount  CountSoldByMonth
0   Jan 19     A               4                 3
1   Jan 19     A               4                 3
2   Jan 19     A               4                 3
3   Jan 19     B               6                 1
4   Jan 19     C               2                 2
5   Jan 19     C               2                 2
6   Feb 19     A               4                 1
7   Feb 19     B               6                 3
8   Feb 19     B               6                 3
9   Feb 19     B               6                 3
10  Mar 19     B               6                 2
11  Mar 19     B               6                 2
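
A self-contained version of the count approach, using the data from the question, might look like the sketch below. The frame is rebuilt by hand, which is an assumption about how the data was created. It also shows why the sum-based attempt gave a different value: summing adds up the repeated TotalSoldCount values instead of counting the rows.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand (an assumption).
df = pd.DataFrame({
    "Date": ["Jan 19"] * 6 + ["Feb 19"] * 4 + ["Mar 19"] * 2,
    "Model": list("AAABCC") + list("ABBB") + list("BB"),
    "TotalSoldCount": [4, 4, 4, 6, 2, 2, 4, 6, 6, 6, 6, 6],
})

grouped = df.groupby(["Date", "Model"])["TotalSoldCount"]

# count tallies the non-missing rows in each (Date, Model) group.
df["CountSoldByMonth"] = grouped.transform("count")

# sum adds the repeated totals instead, e.g. 4 + 4 + 4 for Jan 19 / A.
sums = grouped.transform("sum")

print(df)
print(sums.tolist())
```

For the Jan 19 / A rows, count gives 3 while sum gives 12, which matches the mismatch described in the question.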
