GroupBy with a custom lambda function (using strings) in pandas

Question

I have the following DataFrame:

  Sku  Availability
0   1  out of stock
1   1      in stock
2   1      in stock
3   2  out of stock

How can I use a custom aggregate function to create the following DataFrame?

  Sku  Availability
0   1      in stock
2   2  out of stock

(Basically, if a SKU is in stock anywhere, its out-of-stock rows should be dropped. The same SKU appears several times because each row refers to a different store.)

MVCE:

import pandas as pd

d = {'Sku': ['1', '1', '1', '2'], 'Availability': ['out of stock', 'in stock', 'in stock', 'out of stock']}
df = pd.DataFrame(data=d)
# df = df.groupby('Sku').apply(lambda x: ...)

Answer 1

You can use sort_values to sort your data lexicographically by Availability, then drop_duplicates to keep the first row for each Sku:

out = df.sort_values(['Sku', 'Availability']) \
        .drop_duplicates('Sku', ignore_index=True)
print(out)

# Output:
  Sku  Availability
0   1      in stock
1   2  out of stock
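The sort works here only because 'in stock' happens to come before 'out of stock' alphabetically. A minimal sketch of that dependency, where 'backorder' is a hypothetical extra label that does not appear in the question:

```python
# Plain string comparison: 'i' sorts before 'o', so 'in stock' comes first.
print('in stock' < 'out of stock')  # True

# 'backorder' is a hypothetical label, used only to show the fragility:
# it would sort ahead of 'in stock' and be the row that gets kept.
labels = ['out of stock', 'in stock', 'backorder']
ordered = sorted(labels)
print(ordered)  # ['backorder', 'in stock', 'out of stock']
```

If the labels can change, an explicit ordering is safer than relying on the alphabet.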

A more consistent way is to use CategoricalDtype, which makes the intended order explicit instead of relying on alphabetical order:

# Explicit is better than implicit
cat = pd.CategoricalDtype(['in stock', 'out of stock'], ordered=True)
out = df.astype({'Availability': cat}).sort_values(['Sku', 'Availability']) \
        .drop_duplicates('Sku', ignore_index=True)
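Since the question asks specifically for a custom aggregate function, here is one way it might look with groupby. This is a sketch rather than part of the answer above: best_availability is a hypothetical helper name, and it assumes the only two labels are 'in stock' and 'out of stock'.

```python
import pandas as pd

df = pd.DataFrame({
    'Sku': ['1', '1', '1', '2'],
    'Availability': ['out of stock', 'in stock', 'in stock', 'out of stock'],
})

def best_availability(values):
    # Hypothetical helper: a SKU counts as in stock if any store has it.
    # Assumes the column holds only 'in stock' or 'out of stock'.
    return 'in stock' if (values == 'in stock').any() else 'out of stock'

# Aggregate each SKU's Availability values down to a single label.
out = df.groupby('Sku')['Availability'].agg(best_availability).reset_index()
print(out)
```

Unlike the sort-based approach, this states the rule directly, at the cost of calling a Python function once per group.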

Answer 1: score 2, accepted, 2022-03-26 22:58:18
