Pandas Unique Values as Columns with Counts
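I'm working with a pandas DataFrame and trying to reshape it into a grouped output in which each unique value becomes a column and the corresponding counts become the values of the new DataFrame.
Here is the starting DataFrame: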
import pandas as pd

df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea'))
df.head()
   Canada   China  South Korea
0    gold  bronze       silver
1  silver    gold       bronze
2    gold  silver       bronze
3  bronze  silver         gold
The desired output looks like this:
        nation  gold  silver  bronze
0       Canada     2       1       1
1        China     1       2       1
2  South Korea     1       1       2
You can use df.apply with pd.value_counts*:
df.apply(pd.value_counts).T
             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
* I couldn't find documentation for pd.value_counts, so I linked to the function's source on GitHub instead.
Edit: after reading the source, pd.Series.value_counts simply calls pd.value_counts.
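To get from the transposed counts to the exact layout asked for in the question (a `nation` column and the medals ordered gold, silver, bronze), something like the following sketch should work. It calls the `Series.value_counts` method through a lambda rather than the top-level `pd.value_counts`, on the assumption that you are on a recent pandas release where the top-level function is deprecated. The names `counts` and `result` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [("gold", "bronze", "silver"),
     ("silver", "gold", "bronze"),
     ("gold", "silver", "bronze"),
     ("bronze", "silver", "gold")],
    columns=("Canada", "China", "South Korea"),
)

# Count the medals in each column, then transpose so nations become rows.
counts = df.apply(lambda col: col.value_counts()).T

# A medal that a nation never won would show up as NaN; this data has none,
# but filling and casting keeps the counts as integers in the general case.
counts = counts.fillna(0).astype(int)

# Put the medal columns in the requested order and turn the nation index
# into an ordinary column called "nation".
result = (
    counts[["gold", "silver", "bronze"]]
    .rename_axis("nation")
    .reset_index()
)
print(result)
```

The `fillna(0)` step is not needed for this particular data; it is there only so the sketch does not silently produce floats and NaN on other inputs.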
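Using pd.get_dummies and sum:
# Note: the level argument of sum was removed in pandas 2.0, so this line needs an older pandas.
pd.get_dummies(df.T, prefix='', prefix_sep='').sum(level=0, axis=1)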
Out[995]:
             bronze  gold  silver
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
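The `level` argument of `sum` was removed in pandas 2.0, so on newer versions the same idea can be sketched by grouping the duplicated dummy column labels explicitly. This is a variation on the answer above, not the original author's code: it transposes, groups the repeated medal labels on the index, sums, and transposes back.

```python
import pandas as pd

df = pd.DataFrame(
    [("gold", "bronze", "silver"),
     ("silver", "gold", "bronze"),
     ("gold", "silver", "bronze"),
     ("bronze", "silver", "gold")],
    columns=("Canada", "China", "South Korea"),
)

# One indicator column per (original row, medal); the empty prefix means the
# medal names repeat as column labels, once for each original row.
dummies = pd.get_dummies(df.T, prefix="", prefix_sep="")

# Collapse the repeated labels: after transposing, the medal names sit on the
# index, so groupby(level=0) adds up all rows that share a medal name.
counts = dummies.T.groupby(level=0).sum().T.astype(int)
print(counts)
```

The final `astype(int)` is there because newer pandas versions return boolean dummy columns by default, and an explicit cast makes the result dtype predictable.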
w = df.melt()
       variable   value
0        Canada    gold
1        Canada  silver
2        Canada    gold
3        Canada  bronze
4         China  bronze
5         China    gold
6         China  silver
7         China  silver
8   South Korea  silver
9   South Korea  bronze
10  South Korea  bronze
11  South Korea    gold
Then:
pd.crosstab(w['variable'],w['value'])
The desired result:
value        bronze  gold  silver
variable
Canada            1     2       1
China             1     1       2
South Korea       2     1       1
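Put together as one runnable script, the melt plus crosstab route might look like the sketch below. The `var_name`/`value_name` labels ("nation", "medal") and the final tidy-up into the question's layout are my own additions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("gold", "bronze", "silver"),
     ("silver", "gold", "bronze"),
     ("gold", "silver", "bronze"),
     ("bronze", "silver", "gold")],
    columns=("Canada", "China", "South Korea"),
)

# Long format: one row per (nation, medal) observation.
w = df.melt(var_name="nation", value_name="medal")

# Cross-tabulate: rows are nations, columns are medals, cells are counts.
counts = pd.crosstab(w["nation"], w["medal"])

# Reorder the medal columns and move "nation" from the index to a column.
result = counts[["gold", "silver", "bronze"]].reset_index()
result.columns.name = None  # drop the leftover "medal" label on the columns
print(result)
```

One difference from the apply-based approach: `pd.crosstab` fills combinations that never occur with 0 rather than NaN, so no extra `fillna` step should be needed.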
# Transpose first so that each nation becomes a row
df = pd.DataFrame([('gold', 'bronze', 'silver'),
                   ('silver', 'gold', 'bronze'),
                   ('gold', 'silver', 'bronze'),
                   ('bronze', 'silver', 'gold')],
                  columns=('Canada', 'China', 'South Korea')).transpose()
# Count the values along each row (axis=1)
df.apply(pd.value_counts, axis=1)