
Plotting a heatmap with categorical data

Is it possible to draw a heatmap from categorical data like the table below? I want bins on the y axis, year on the x axis, and Firm as the values. If so, how can I do it in Python?

    Firm  year  bins
0     A  1998  binA
1     A  2000  binB
2     A  1999  binA
3     B  1998  binA
4     B  2000  binE
5     B  1999  binA
6     C  1998  binA
7     C  2000  binE
8     C  1999  binA
9     D  1998  binA
10    D  2000  binA
11    D  1999  binB
12    E  1998  binB
13    E  2000  binA
14    E  1999  binB
15    F  1998  binB
16    F  2000  binC
17    F  1999  binH
18    G  1998  binB
19    G  2000  binE
20    G  1999  binF
21    H  1998  binB
22    H  2000  binA
23    H  1999  binF
24    I  1998  binB
25    I  2000  binF
26    I  1999  binF
27    J  1998  binC
28    J  2000  binA
29    J  1999  binF
30    K  1998  binD
31    K  2000  binE
32    K  1999  binA
33    L  1998  binE
34    L  2000  binH
35    L  1999  binC
36    M  1998  binE
37    M  2000  binH
38    M  1999  binH

One solution I tried with seaborn did not work:

import seaborn as sns

df=pd.pivot(df7['Firm'],df7['year'], df7['bins'])

ax = sns.heatmap(df) 

R has the following example: Heatmap of categorical variable counts

Using R and the following code, I have tentatively been able to construct the heatmap for the example above:

library(magrittr)
library(dplyr)
m<- read.csv("~/df55testR.csv",
             stringsAsFactors=FALSE, header=T)  
m<-m%>%select(2:6)

ml <- reshape2::melt(data = m, id.vars="Firm", variable.name = "year", value.name="bin")  
ml

ml$Test_Gr <- apply(ml[,2:3], 1, paste0, collapse="_")   
mw <- reshape2::dcast(ml, Firm ~ bin, fun.aggregate = length)

mwm<-as.matrix(mw[,-1])
mwm

mcm <- t(mwm) %*% mwm

colnames(mcm) <- colnames(mw)[-1]
rownames(mcm) <- colnames(mw)[-1]
gplots::heatmap.2(mcm, trace="none", col = rev(heat.colors(15)))


You can try groupby with nunique:

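# Count the distinct firms in each (year, bins) pair
grouped = df.groupby(['year', 'bins'])['Firm'].nunique().reset_index()
# Reshape so that bins are rows (y axis) and years are columns (x axis)
piv_grouped = grouped.pivot(index='bins', columns='year', values='Firm')
sns.heatmap(piv_grouped, cmap='RdYlGn_r', linewidths=0.5, annot=True)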
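For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same groupby/nunique idea. It uses only a subset of the question's data (firms A to F), and the names `firm_counts` and `plot_counts` are made up for illustration. Filling missing bin/year combinations with 0 instead of leaving them as NaN is an assumption about what the plot should show, not something the question specifies.

```python
import pandas as pd

# Subset of the question's data (firms A to F), in long format.
records = [
    ("A", 1998, "binA"), ("A", 2000, "binB"), ("A", 1999, "binA"),
    ("B", 1998, "binA"), ("B", 2000, "binE"), ("B", 1999, "binA"),
    ("C", 1998, "binA"), ("C", 2000, "binE"), ("C", 1999, "binA"),
    ("D", 1998, "binA"), ("D", 2000, "binA"), ("D", 1999, "binB"),
    ("E", 1998, "binB"), ("E", 2000, "binA"), ("E", 1999, "binB"),
    ("F", 1998, "binB"), ("F", 2000, "binC"), ("F", 1999, "binH"),
]
df = pd.DataFrame(records, columns=["Firm", "year", "bins"])


def firm_counts(frame):
    """Count distinct firms per (bins, year) pair.

    Rows are bins, columns are years. Combinations that never occur
    are filled with 0 (an assumption; drop fill_value to keep NaN).
    """
    counts = frame.groupby(["bins", "year"])["Firm"].nunique()
    return counts.unstack(fill_value=0).astype(int)


def plot_counts(counts):
    """Draw the count table as an annotated heatmap (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(counts, cmap="RdYlGn_r", linewidths=0.5,
                     annot=True, fmt="d")
    ax.set_xlabel("year")
    ax.set_ylabel("bins")
    plt.show()
    return ax


counts = firm_counts(df)
```

Calling `plot_counts(counts)` draws the figure. Note that the cells show how many firms fall into each bin per year, not the firm names themselves, since a heatmap needs numeric values.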

