I would like to use a heatmap to visualize the results below, which I obtained by grouping my data by columns.
Data
Classroom Subject Student
0 A Mathematics A.B.
1 B Computer Science G.M.
2 A Computer Science J.K.
3 B Literature S.R.
4 B Computer Science A.M.
5 A Literature S.R.
6 B Mathematics S.E.
7 C Literature S.T.
8 C Mathematics R.B.
9 A Mathematics B.K.
After grouping with df.groupby(["Classroom", "Subject"]).size(), I have
Classroom Subject
A Mathematics 226
Literature 12
Computer Science 122
B Mathematics 1
Literature 14
Computer Science 19
History 22
Geography 238
C Mathematics 5
Literature 15
Based on what I have found on the Web, Seaborn would probably be the nicest solution for creating a heatmap and showing the percentage of the values ((.sum()/len(df))*100, if I am right). The solution in "Python - Get percentage based on column values" is certainly helpful for my question, even though it does not use seaborn for visualization. Doing this
df.groupby(["Classroom", "Subject"]).size()/len(df)*100
I get the percentage of the values. I also need to plot these results as a heatmap. I would appreciate it if you could provide some help with this.
Seaborn's heatmap uses the columns and index of a dataframe. Pandas' pivot() and pivot_table() can create a suitable dataframe:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Random example data with the same columns as in the question
df = pd.DataFrame(
    {'Classroom': np.random.choice(['A', 'B', 'C'], 1000),
     'Subject': np.random.choice(['Mathematics', 'Literature', 'Computer Science', 'History', 'Geography'], 1000),
     'Student': [''.join(np.random.choice([*'VWXYZ'], 7)) for _ in range(1000)]})
# Count students per (Subject, Classroom) cell, then convert to a percentage of all rows
pivoted = pd.pivot_table(df, values='Student', index='Subject', columns='Classroom', aggfunc='count') / len(df) * 100
# annot=True writes the value into each cell; fmt='.1f' shows one decimal place
ax = sns.heatmap(data=pivoted, annot=True, fmt='.1f')
plt.tight_layout()
plt.show()
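If you would rather start from the groupby result in the question, unstacking one index level should give the same kind of Subject-by-Classroom table. The following is only a sketch: it uses the ten sample rows from the question, the names counts, pct and alt are just illustrative, and fill_value=0 is an assumption about how empty combinations should be shown.

```python
import pandas as pd

# The ten sample rows from the question (Student column omitted, it is not needed for counting)
df = pd.DataFrame({
    "Classroom": list("ABABBABCCA"),
    "Subject": ["Mathematics", "Computer Science", "Computer Science", "Literature",
                "Computer Science", "Literature", "Mathematics", "Literature",
                "Mathematics", "Mathematics"],
})

# Same grouping as in the question: one count per (Classroom, Subject) pair
counts = df.groupby(["Classroom", "Subject"]).size()

# Convert to a percentage of all rows, then move Classroom into the columns.
# Assumption: combinations that never occur are shown as 0 instead of NaN.
pct = (counts / len(df) * 100).unstack("Classroom", fill_value=0)

# pd.crosstab with normalize="all" is another way to build the same table
alt = pd.crosstab(df["Subject"], df["Classroom"], normalize="all") * 100

print(pct)
```

Either pct or alt can then be passed to sns.heatmap(pct, annot=True, fmt='.1f') exactly like pivoted above.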