Heatmap to visualize percentage of values

I would like to use a heatmap to visualize the results below, which I obtained by grouping my data by columns.

Data

    Classroom   Subject             Student
0   A           Mathematics         A.B.
1   B           Computer Science    G.M.
2   A           Computer Science    J.K.
3   B           Literature          S.R.
4   B           Computer Science    A.M.
5   A           Literature          S.R.
6   B           Mathematics         S.E.
7   C           Literature          S.T.
8   C           Mathematics         R.B.
9   A           Mathematics         B.K.

After grouping with df.groupby(["Classroom", "Subject"]).size(), I have

Classroom     Subject                    
A             Mathematics                 226
              Literature                  12
              Computer Science            122
B             Mathematics                 1
              Literature                  14
              Computer Science            19
              History                     22
              Geography                   238
C             Mathematics                 5
              Literature                  15
              

Based on what I have found on the web, Seaborn would probably be the nicest solution for creating a heatmap and showing the percentage of the values (.sum() / len(df) * 100, if I am right). The answer to "Python - Get percentage based on column values" is certainly helpful for my question, even though it does not use seaborn for visualization. Doing this

df.groupby(["Classroom", "Subject"]).size()/len(df)*100

I get the percentage of the values. I also need to plot these results using a heatmap. I would appreciate it if you could provide some help with this.

Seaborn's heatmap uses the columns and index of a dataframe. Pandas' pivot() and pivot_table() can create a suitable dataframe:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Random sample data with the same columns as in the question
df = pd.DataFrame(
    {'Classroom': np.random.choice(['A', 'B', 'C'], 1000),
     'Subject': np.random.choice(['Mathematics', 'Literature', 'Computer Science', 'History', 'Geography'], 1000),
     'Student': [''.join(np.random.choice([*'VWXYZ'], 7)) for _ in range(1000)]})
# Count the students in each (Subject, Classroom) cell, then convert the counts to percentages of all rows
pivoted = pd.pivot_table(df, values='Student', index='Subject', columns='Classroom', aggfunc='count') / len(df) * 100

# annot=True writes each percentage into its cell; fmt='.1f' shows one decimal place
ax = sns.heatmap(data=pivoted, annot=True, fmt='.1f')
plt.tight_layout()
plt.show()

[Figure: sns.heatmap from pivot_table]
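As a possible alternative to pivot_table(), the same kind of percentage table can probably be built either from the question's own groupby expression followed by unstack(), or with pd.crosstab() and normalize="all". The sketch below uses the ten sample rows from the question; the variable names pct_unstacked and pct_crosstab are only illustrative.

```python
import pandas as pd

# The ten sample rows shown in the question.
df = pd.DataFrame({
    "Classroom": ["A", "B", "A", "B", "B", "A", "B", "C", "C", "A"],
    "Subject": ["Mathematics", "Computer Science", "Computer Science",
                "Literature", "Computer Science", "Literature",
                "Mathematics", "Literature", "Mathematics", "Mathematics"],
    "Student": ["A.B.", "G.M.", "J.K.", "S.R.", "A.M.",
                "S.R.", "S.E.", "S.T.", "R.B.", "B.K."],
})

# Route 1: the question's expression, reshaped so that Classroom becomes
# the columns. fill_value=0 puts 0 in combinations that never occur.
counts = df.groupby(["Classroom", "Subject"]).size()
pct_unstacked = (counts / len(df) * 100).unstack("Classroom", fill_value=0)

# Route 2: crosstab counts each (Subject, Classroom) pair, and
# normalize="all" divides every cell by the grand total.
pct_crosstab = pd.crosstab(df["Subject"], df["Classroom"], normalize="all") * 100
```

Either table has Subject as the index and Classroom as the columns, so it can be passed to sns.heatmap(data=..., annot=True, fmt='.1f') in the same way as pivoted. One difference to be aware of: combinations that never occur show up as 0.0 here, whereas pivot_table() leaves them as NaN, which the heatmap draws as an empty cell.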
