Python: Pivot table with groupby counting of category
Suppose I have a file that looks like this:
+---------+---------+-------+
| Product | Quality | Origin|
+---------+---------+-------+
| Apple | Good | |
+---------+---------+-------+
| Apple | Bad | |
+---------+---------+-------+
| Apple | Bad | |
+---------+---------+-------+
| Orange | Good | |
+---------+---------+-------+
| . | | |
+---------+---------+-------+
| . | | |
+---------+---------+-------+
| Grape | Good | |
+---------+---------+-------+
I want to build a pivot result with counts:
+---------+---------------+------+-----+
| Product | Total Number | Good | Bad |
+---------+---------------+------+-----+
| Apple | 5 | 3 | 2 |
+---------+---------------+------+-----+
| Orange | 8 | 5 | 3 |
+---------+---------------+------+-----+
| Grape | 3 | 1 | 2 |
+---------+---------------+------+-----+
| Total | 16 | 9 | 7 |
+---------+---------------+------+-----+
I am using groupby and count to get the totals:
Total_Product = ProdcutFile.groupby('Product').count()
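But how can I make the result table include the counts of Good and Bad?
Here is one way, using assign and a pivot table. The assign statement creates a column of ones, and summing that column gives the counts in the final table.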
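from io import StringIO
import pandas as pd

# Small sample in the same shape as the question's file
data = '''Product Quality
Apple Good
Apple Bad
Apple Bad
Orange Good
Orange Bad
Grape Good
'''

# Raw string avoids an invalid-escape warning for the regex separator
df = (pd.read_csv(StringIO(data), sep=r'\s+', engine='python')
        .assign(counter=1)              # one per row, so summing counts rows
        .pivot_table(index='Product',
                     columns='Quality',
                     values='counter',
                     aggfunc='sum',     # string name; the builtin sum warns in recent pandas
                     fill_value=0,      # products with no rows for a quality get 0
                     margins=True,      # add row and column totals
                     margins_name='Totals')
     )
print(df)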
Quality  Bad  Good  Totals
Product
Apple      2     1       3
Grape      0     1       1
Orange     1     1       2
Totals     3     3       6
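Since the question starts from groupby, the same counts can probably also be reached without the helper column. The following is only a sketch on the same sample data: it groups on both columns, counts rows with size, and spreads Quality into columns with unstack. The names counts, 'Total Number' and 'Total' are chosen here to mirror the question and are not part of the original answer.

```python
from io import StringIO
import pandas as pd

data = '''Product Quality
Apple Good
Apple Bad
Apple Bad
Orange Good
Orange Bad
Grape Good
'''
df = pd.read_csv(StringIO(data), sep=r'\s+')

# Count rows per (Product, Quality) pair, then turn Quality values into columns.
# fill_value=0 covers products that have no rows for one of the qualities.
counts = (df.groupby(['Product', 'Quality'])
            .size()
            .unstack(fill_value=0))

# Per-product total across the quality columns (column name is an assumption).
counts['Total Number'] = counts.sum(axis=1)

# Grand-total row, added last so it is not included in its own sum.
counts.loc['Total'] = counts.sum()

print(counts)
```

Compared with pivot_table and margins=True, this spells out the totals by hand, which may be easier to adapt if only one of the two totals is wanted.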
(Renaming and reordering the columns is straightforward and not shown.)
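For readers who want that last step, here is one possible way to rename and reorder, offered as a sketch and not as the answerer's own code. It assumes the target layout from the question (Total Number, Good, Bad, with a final Total row); the variable name table and the row order passed to reindex are illustrative choices.

```python
from io import StringIO
import pandas as pd

data = '''Product Quality
Apple Good
Apple Bad
Apple Bad
Orange Good
Orange Bad
Grape Good
'''

# Same pivot as in the answer, with the margin labelled 'Total'.
table = (pd.read_csv(StringIO(data), sep=r'\s+')
           .assign(counter=1)
           .pivot_table(index='Product', columns='Quality', values='counter',
                        aggfunc='sum', fill_value=0,
                        margins=True, margins_name='Total'))

# Rename only the margin column, then select columns in the wanted order.
table = table.rename(columns={'Total': 'Total Number'})
table = table[['Total Number', 'Good', 'Bad']]

# Put the rows in the order used in the question (assumed order).
table = table.reindex(['Apple', 'Orange', 'Grape', 'Total'])

# Drop the 'Quality' label from the column axis for a flatter header.
table.columns.name = None

print(table)
```

Selecting with a list of column names both reorders and validates: a misspelled name raises a KeyError instead of silently producing an empty column.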