
How to create a counter column that counts (inner) groups within an (outer) group, resetting after each (outer) group, in Python

I'm working with sales data in which each row represents a product that has been sold. Each order can consist of either a single product (one row) or multiple products (multiple rows). Each customer may have placed multiple orders throughout the dataset. I'm trying to implement a counter column where each new order adds 1 to the counter, and each product within that order gets the same counter value. The counter should start over with each customer.

Here is an HTML snippet of what the outcome should be, because I'm not allowed to post a screenshot:

 <style type="text/css"> table.tableizer-table { font-size: 12px; border: 1px solid #CCC; font-family: Arial, Helvetica, sans-serif; }.tableizer-table td { padding: 4px; margin: 3px; border: 1px solid #CCC; }.tableizer-table th { background-color: #104E8B; color: #FFF; font-weight: bold; } </style> <table class="tableizer-table"> <thead><tr class="tableizer-firstrow"><th>Customer_ID</th><th>Order_ID</th><th>Date</th><th>Product_ID</th><th>Counter</th></tr></thead><tbody> <tr><td>56HS3F</td><td>3456HJ</td><td>16-04-2019</td><td>Product A</td><td>1</td></tr> <tr><td>56HS3F</td><td>3456HJ</td><td>16-04-2019</td><td>Product C</td><td>1</td></tr> <tr><td>56HS3F</td><td>1234QQ</td><td>25-05-2019</td><td>Product A</td><td>2</td></tr> <tr><td>56HS3F</td><td>3333HI</td><td>26-05-2019</td><td>Product B</td><td>3</td></tr> <tr><td>32AS88</td><td>1111SZ</td><td>20-12-2018</td><td>Product B</td><td>1</td></tr> <tr><td>32AS88</td><td>1111SZ</td><td>20-12-2018</td><td>Product A</td><td>1</td></tr> <tr><td>32AS88</td><td>2234KL</td><td>20-12-2018</td><td>Product C</td><td>2</td></tr> <tr><td>678HJI</td><td>6786ER</td><td>21-09-2019</td><td>Product C</td><td>1</td></tr> </tbody></table>

I have formed groups based on two categories: Customer_ID and Order_ID.

I've tried working with .ngroup(), but this seems to ignore the outer group 'Customer_ID' and counts over the whole data frame, looking only for similar 'Order_ID's. I've also tried .cumcount(); this does respect my grouping, but it iterates within the nested 'Order_ID' group, whereas I want it to count across the Order_IDs of a customer, not the rows within each one.

data['Counter'] = data.groupby(['Customer_ID', 'Order_ID']).ngroup()

data['Counter'] = data.groupby(['Customer_ID', 'Order_ID']).cumcount()

Especially with .ngroup(), I expected it to respect my group-within-group structure, but it seems to disregard my 'Customer_ID' grouping.
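To make the behaviour of the two attempts concrete, here is a minimal sketch that rebuilds the Customer_ID and Order_ID columns from the table above (the Date and Product_ID columns are left out for brevity, which is my simplification). As far as I can tell, ngroup() does not so much ignore Customer_ID as number every unique (Customer_ID, Order_ID) pair once across the whole frame, in sorted key order, so it never restarts per customer; cumcount() numbers the rows inside each pair.

```python
import pandas as pd

# Sample rows copied from the expected-outcome table in the question.
data = pd.DataFrame({
    "Customer_ID": ["56HS3F", "56HS3F", "56HS3F", "56HS3F",
                    "32AS88", "32AS88", "32AS88", "678HJI"],
    "Order_ID":    ["3456HJ", "3456HJ", "1234QQ", "3333HI",
                    "1111SZ", "1111SZ", "2234KL", "6786ER"],
})

groups = data.groupby(["Customer_ID", "Order_ID"])

# ngroup(): one global number per (Customer_ID, Order_ID) pair,
# assigned in sorted key order; it does not reset per customer.
ngroup_result = groups.ngroup().tolist()
print(ngroup_result)    # [4, 4, 2, 3, 0, 0, 1, 5]

# cumcount(): position of each row inside its own (customer, order) group.
cumcount_result = groups.cumcount().tolist()
print(cumcount_result)  # [0, 1, 0, 0, 0, 1, 0, 0]
```

Neither result matches the desired Counter column (1, 1, 2, 3, 1, 1, 2, 1), which is why a different approach is needed.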

Update: Found the Answer
I found my answer. I created a tracker to see whether the Order_ID changed within each Customer_ID. Then I could use cumsum() on 'Order_Change', grouped by Customer_ID, to count the True values.

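# True on the first row of every order: either the order or the customer differs from the previous row
data['Order_Change'] = (data.Order_ID != data.Order_ID.shift()) | (data.Customer_ID != data.Customer_ID.shift())

# Running total of order starts, restarted for each customer
data['Counter'] = data.groupby('Customer_ID')['Order_Change'].cumsum()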
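For anyone who wants to run this end to end, here is a self-contained sketch of the same idea on the sample rows from the question. It assumes the rows of each order are contiguous and each customer's rows sit together, as in the table; the explicit cast to int before cumsum() and the factorize-based variant at the end are my additions, not part of the original answer. The variant numbers each customer's orders by first appearance, so it should not depend on the rows of an order being adjacent.

```python
import pandas as pd

# Sample rows copied from the expected-outcome table in the question.
data = pd.DataFrame({
    "Customer_ID": ["56HS3F", "56HS3F", "56HS3F", "56HS3F",
                    "32AS88", "32AS88", "32AS88", "678HJI"],
    "Order_ID":    ["3456HJ", "3456HJ", "1234QQ", "3333HI",
                    "1111SZ", "1111SZ", "2234KL", "6786ER"],
})

# Flag the first row of every order: the order or the customer
# differs from the previous row (the very first row is always True).
data["Order_Change"] = (
    (data["Order_ID"] != data["Order_ID"].shift())
    | (data["Customer_ID"] != data["Customer_ID"].shift())
)

# Running count of order starts, restarted for each customer.
# The cast to int keeps the result an integer column.
data["Counter"] = (
    data["Order_Change"].astype(int).groupby(data["Customer_ID"]).cumsum()
)
print(data["Counter"].tolist())  # [1, 1, 2, 3, 1, 1, 2, 1]

# Alternative sketch (an assumption on my part, not the original answer):
# number each customer's orders by order of first appearance.
data["Counter_alt"] = (
    data.groupby("Customer_ID")["Order_ID"]
        .transform(lambda s: pd.factorize(s)[0] + 1)
)
print(data["Counter_alt"].tolist())  # [1, 1, 2, 3, 1, 1, 2, 1]
```

If the real data is not already ordered by customer and order, sorting it first (for example by Customer_ID and Date) would be needed for the shift-based version to give sensible results.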

