Seaborn heatmap spacing on x axis

I am currently using seaborn.heatmap() to display binary data that I have organized in a pandas.DataFrame. The index of the DataFrame is discrete and corresponds to different locations, while the columns are continuous and represent time. How can I give the x axis of the heatmap the correct spacing between the measurement values?

To be more precise, I want the distance between 0 and 1'000 to be 1'000 times the distance between 0 and 1, and 10'000 times the distance between 1 and 1.1. Here is a minimal example of how my data is organised:

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df=pd.DataFrame(np.random.randint(0,2,size=(5, 8)), columns=[1,1.1,2,3,4,1001,1002,1003], index=['A','B','C','D','E'])
sns.heatmap(df,cmap='binary', square=True)

The resulting image looks like this: https://i.stack.imgur.com/uxSrH.png

The data between the measurements (e.g. for measurement value 500, which is not part of the DataFrame) should be 0. I do not mind giving up square=True.

For those of you wondering, the 0/1 values are False/True flags that indicate whether or not I made a measurement at a given sampling site (location) at a given time.

Thank you so much

You could use plt.pcolor(), which creates an unevenly spaced grid, with the grid lines given by its first and second parameters. As a 5x8 grid of cells needs 6x9 grid lines, both the list of x-values and the list of y-values need to be extended by one.
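The "one more grid line than cells" rule can be checked in isolation. The following is only a small sketch in plain Python; `cell_edges` is a hypothetical helper name, and repeating the last spacing for the extra right-hand edge is just one reasonable choice.

```python
def cell_edges(values):
    """Return len(values) + 1 grid lines: the given values are used as the
    left edges of the cells, and one extra right edge is appended that
    repeats the spacing between the last two values."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two values to infer the last spacing")
    return values + [2 * values[-1] - values[-2]]


# 8 columns -> 9 vertical grid lines, 5 rows -> 6 horizontal grid lines
x_edges = cell_edges([1, 1.1, 2, 3, 4, 101, 102, 103])
y_edges = list(range(5 + 1))
print(len(x_edges), len(y_edges))
```

These two lists have the shapes that plt.pcolor() expects as its first and second arguments for a 5x8 array of cell values.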

The example below uses 101 instead of 1001, because a factor of 1000 would squeeze everything into a thin line, except the area between 4 and 1001.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# slightly modified example data
df = pd.DataFrame(np.random.randint(0, 2, size=(5, 8)), columns=[1, 1.1, 2, 3, 4, 101, 102, 103],
                  index=['A', 'B', 'C', 'D', 'E'])
plt.pcolor(list(df.columns) + [2 * df.columns[-1] - df.columns[-2]],
           np.arange(len(df.index)+1),
           df.values, cmap='binary')
plt.yticks(np.arange(0.5, len(df.index)), df.index) # labels between the grid lines
plt.show()

Figure: plt.pcolor creating the grid
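Note that with the plt.pcolor() call above, each cell stretches from one column value to the next, so the stretch between 4 and 101 takes the colour of column 4. The question asks for unmeasured positions (such as 500) to show as 0. One possible way to get that, as a rough sketch and not part of the original answer, is to give every measurement a fixed width and insert explicit 0-valued gap cells. `expand_with_gaps` and its `width` parameter are hypothetical names introduced here for illustration.

```python
def expand_with_gaps(columns, rows, width):
    """Return (edges, expanded_rows).

    Each measurement at x is drawn from x to x + width (or up to the next
    measurement, if that comes first). Any remaining space before the next
    measurement becomes its own cell with value 0.
    """
    columns = list(columns)
    edges = []
    source = []  # per expanded cell: index of the source column, None for a gap
    for i, x in enumerate(columns):
        edges.append(x)
        source.append(i)
        is_last = i == len(columns) - 1
        if is_last:
            edges.append(x + width)      # closing edge of the final cell
        elif x + width < columns[i + 1]:
            edges.append(x + width)      # end of the measurement cell
            source.append(None)          # gap cell up to the next measurement
    expanded = [[row[k] if k is not None else 0 for k in source] for row in rows]
    return edges, expanded


# Small integer example: measurements at 1, 2, 3 and 10, each 1 unit wide
edges, expanded = expand_with_gaps([1, 2, 3, 10], [[1, 1, 0, 1]], width=1)
print(edges)     # [1, 2, 3, 4, 10, 11]
print(expanded)  # [[1, 1, 0, 0, 1]]
```

With the DataFrame from the example, the inputs would be list(df.columns) and df.values.tolist(), and the returned edges and rows would be passed to plt.pcolor() in place of the extended column list and df.values. The width is an assumption that has to be chosen by hand; it only matters where the spacing between neighbouring columns is larger than it.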
