Seaborn heatmap spacing on x axis
I am currently using seaborn.heatmap() to display binary data that I have organized in a pandas.DataFrame. The index of the DataFrame is discrete and corresponds to different locations, while the columns are continuous and represent time. How can I make the x axis of the heatmap use the correct spacing between the measurement values?
To be more precise, I want the distance between 0 and 1'000 to be 1'000 times larger than the distance between 0 and 1, and 10'000 times larger than the distance between 1 and 1.1. Here is a minimal example of how my data is organised:
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df=pd.DataFrame(np.random.randint(0,2,size=(5, 8)), columns=[1,1.1,2,3,4,1001,1002,1003], index=['A','B','C','D','E'])
sns.heatmap(df,cmap='binary', square=True)
The resulting image looks like this: https://i.stack.imgur.com/uxSrH.png
The data between the measurements (e.g. for measurement value 500, which is not part of the DataFrame) should be 0. I do not mind giving up square=True.
For those of you wondering, the 0/1 values are False/True flags that indicate whether or not I made a measurement at a given location at a given time.
Thank you so much.
You could use plt.pcolor(), which creates an unevenly spaced grid, with the grid lines given by its first and second parameters. As a 5x8 grid of cells needs 6x9 grid lines, both the list of x-values and the list of y-values need to be extended by one.
The example uses 101 instead of 1001, because a factor of 1000 difference would squeeze everything into a thin line, except for the area between 4 and 1001.
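As a quick check of the thin-line remark, the sketch below computes how wide each cell would be with the original columns from the question. The names edges, widths and share are only illustrative, and the right-most edge is extrapolated from the last two column values, which is an assumption about where the final cell should end.

```python
import numpy as np

# columns from the question, including the large jump from 4 to 1001
columns = [1, 1.1, 2, 3, 4, 1001, 1002, 1003]

# 8 cells need 9 grid lines: append one extrapolated right edge
edges = np.array(columns + [2 * columns[-1] - columns[-2]], dtype=float)

widths = np.diff(edges)          # width of each cell on the x axis
share = widths / widths.sum()    # fraction of the axis each cell occupies

print(share.round(4))            # the cell starting at 4 takes almost the whole axis
```

With these values the single cell that starts at 4 occupies more than 99% of the axis, which is why the example switches to 101.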
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# slightly modified example data
df = pd.DataFrame(np.random.randint(0, 2, size=(5, 8)),
                  columns=[1, 1.1, 2, 3, 4, 101, 102, 103],
                  index=['A', 'B', 'C', 'D', 'E'])
# x grid lines: the column values plus one extrapolated right edge
plt.pcolor(list(df.columns) + [2 * df.columns[-1] - df.columns[-2]],
           np.arange(len(df.index) + 1),  # y grid lines: one more than the number of rows
           df.values, cmap='binary')
plt.yticks(np.arange(0.5, len(df.index)), df.index) # labels between the grid lines
plt.show()
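Note that in the plot above each cell stretches until the next measurement, so the value at 4 is drawn all the way to 101. The question asks for unmeasured times to show as 0 instead. One possible way to get that, sketched here under assumptions, is to give every measurement a fixed-width cell and insert explicit zero columns for the spans in between. The helper expand_with_gaps and its cell_width parameter are hypothetical names, not part of pandas or matplotlib, and using the narrowest gap between columns as the default width is an arbitrary choice.

```python
import numpy as np
import pandas as pd


def expand_with_gaps(df, cell_width=None):
    """Return (x_edges, values) with zero columns for unmeasured time spans."""
    times = np.asarray(df.columns, dtype=float)
    if cell_width is None:
        # assumption: the narrowest gap between measurements sets the cell width
        cell_width = np.diff(times).min()
    edges, cols = [], []
    for i, t in enumerate(times):
        edges.append(t)                      # left edge of the measurement cell
        cols.append(df.iloc[:, i].to_numpy())
        end = t + cell_width
        nxt = times[i + 1] if i + 1 < len(times) else end
        if nxt - end > 1e-9:
            # unmeasured span before the next measurement: draw it as zeros
            edges.append(end)
            cols.append(np.zeros(len(df), dtype=int))
    edges.append(times[-1] + cell_width)     # right edge of the last cell
    return np.array(edges), np.column_stack(cols)


if __name__ == "__main__":
    import matplotlib.pyplot as plt

    df = pd.DataFrame(np.random.randint(0, 2, size=(5, 8)),
                      columns=[1, 1.1, 2, 3, 4, 101, 102, 103],
                      index=['A', 'B', 'C', 'D', 'E'])
    x_edges, values = expand_with_gaps(df)
    plt.pcolor(x_edges, np.arange(len(df.index) + 1), values, cmap='binary')
    plt.yticks(np.arange(0.5, len(df.index)), df.index)
    plt.show()
```

With the default width the measurement cells are very thin compared with the full time range, so they may be hard to see; passing a larger cell_width trades accuracy of the spacing for visibility.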