How to make a 2D Histogram/Heatmap of (string) label data in Python?
I have a large dataset of events for my research industry, organized in a dataframe as follows. Each event has an event type (str), a year (int), a size (int) and a location (str).
An example dataframe is shown below, with event types 'A', 'B', 'C', or 'D' and event locations 'CA', 'TX', 'NY'.
Event Number | Event Type | Year | Size | Location |
---|---|---|---|---|
1 | A | 2014 | 1000 | CA |
2 | B | 2014 | 1000 | TX |
3 | C | 2014 | 456 | CA |
4 | C | 2014 | 675 | NY |
5 | B | 2014 | 567 | TX |
6 | A | 2014 | 765 | CA |
7 | C | 2014 | 1000 | NY |
8 | B | 2014 | 675 | TX |
9 | D | 2015 | 3424 | NY |
10 | A | 2015 | 567 | TX |
11 | A | 2015 | 435 | CA |
12 | C | 2016 | 45 | CA |
Now, I want to plot a heatmap of event type vs. year, i.e., a heatmap with year on the x-axis, event type on the y-axis, and a color representing the count of events of that type that happened in that year. The resulting matrix for the above table would look something like this:
Event Type | 2014 | 2015 | 2016 |
---|---|---|---|
A | 2 | 2 | 0 |
B | 3 | 0 | 0 |
C | 3 | 0 | 1 |
D | 0 | 1 | 0 |
I have looked into using seaborn, but I am not sure how to approach this sort of 2D histogram.
How would I go about it if I also wanted to plot a heatmap of location vs. event type (two string variables)?
Thanks!
seaborn.histplot can produce a bivariate plot and understands categorical variables, so:
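import pandas as pd
import seaborn as sns
df = pd.read_clipboard()  # assumes the table from the question has been copied to the clipboard
ax = sns.histplot(data=df, x="Event Type", y="Location", cbar=True)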
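For the event type vs. year heatmap from the question, one alternative is to count the pairs explicitly and plot the resulting matrix. The following is a minimal sketch, not part of the original answer: it assumes the sample table is rebuilt by hand (the `Size` column is left out because it is not needed for counting), and it uses `pd.crosstab` together with `sns.heatmap` instead of `histplot`. The colormap and colorbar label are arbitrary choices.

```python
import pandas as pd
import seaborn as sns

# Sample data copied from the question's table (Size omitted, it is not used here).
df = pd.DataFrame({
    "Event Type": ["A", "B", "C", "C", "B", "A", "C", "B", "D", "A", "A", "C"],
    "Year": [2014] * 8 + [2015] * 3 + [2016],
    "Location": ["CA", "TX", "CA", "NY", "TX", "CA", "NY", "TX", "NY", "TX", "CA", "CA"],
})

# Count events for each (event type, year) pair; pairs that never occur become 0.
counts = pd.crosstab(df["Event Type"], df["Year"])

# Draw the count matrix as an annotated heatmap: years on x, event types on y.
ax = sns.heatmap(counts, annot=True, fmt="d", cmap="viridis",
                 cbar_kws={"label": "Count"})
ax.set(xlabel="Year", ylabel="Event Type")
```

The same pattern should cover the location vs. event type case by passing `df["Location"]` and `df["Event Type"]` to `pd.crosstab`. In a script, the figure can then be displayed with `matplotlib.pyplot.show()`.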