Python: Pivot table with groupby counting of category
Let's say I have a file that looks like:
+---------+---------+-------+
| Product | Quality | Origin|
+---------+---------+-------+
| Apple | Good | |
+---------+---------+-------+
| Apple | Bad | |
+---------+---------+-------+
| Apple | Bad | |
+---------+---------+-------+
| Orange | Good | |
+---------+---------+-------+
| . | | |
+---------+---------+-------+
| . | | |
+---------+---------+-------+
| Grape | Good | |
+---------+---------+-------+
I want to make a pivot result with counts:
+---------+---------------+------+-----+
| Product | Total Number | Good | Bad |
+---------+---------------+------+-----+
| Apple | 5 | 3 | 2 |
+---------+---------------+------+-----+
| Orange | 8 | 5 | 3 |
+---------+---------------+------+-----+
| Grape | 3 | 1 | 2 |
+---------+---------------+------+-----+
| Total | 16 | 9 | 7 |
+---------+---------------+------+-----+
I am using groupby and count to get the total number:
Total_Product = ProdcutFile.groupby('Product').count()
But how can I make the result table contain the Good and Bad counts as well?
Here is one way, using assign and pivot_table. The assign statement makes a column of ones, and summing that column within each Product/Quality group provides the counts in the final table.
from io import StringIO
import pandas as pd
data = '''Product Quality
Apple Good
Apple Bad
Apple Bad
Orange Good
Orange Bad
Grape Good
'''
df = (pd.read_csv(StringIO(data), sep=r'\s+', engine='python')
      .assign(counter=1)  # a column of ones to sum per group
      .pivot_table(index='Product',
                   columns='Quality',
                   values='counter',
                   aggfunc='sum',
                   fill_value=0,
                   margins=True,
                   margins_name='Totals')
      )
print(df)
Quality  Bad  Good  Totals
Product
Apple      2     1       3
Grape      0     1       1
Orange     1     1       2
Totals     3     3       6
(Providing the column names and ordering is straightforward and not shown.)
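For completeness, the renaming and reordering step might look like the sketch below. It rebuilds the same pivot on the sample data from the answer, then arranges the columns as Product, Total Number, Good, Bad to match the layout asked for in the question. The variable names counts and result and the margin label 'Total' are illustrative choices, not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the answer above.
data = """Product Quality
Apple Good
Apple Bad
Apple Bad
Orange Good
Orange Bad
Grape Good
"""

# Build the pivot of counts, with a 'Total' margin row and column.
counts = (
    pd.read_csv(StringIO(data), sep=r"\s+")
    .assign(counter=1)
    .pivot_table(index="Product", columns="Quality", values="counter",
                 aggfunc="sum", fill_value=0,
                 margins=True, margins_name="Total")
)

# Rename the margin column, reorder the columns as in the question,
# and turn Product back into an ordinary column.
result = (
    counts.rename(columns={"Total": "Total Number"})
    .loc[:, ["Total Number", "Good", "Bad"]]
    .rename_axis(columns=None)  # drop the 'Quality' label above the columns
    .reset_index()
)
print(result)
```

Selecting the columns by an explicit list keeps the order stable even if pivot_table sorts the Quality values alphabetically, as it does here with Bad before Good.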