Create matrix for heatmap from 2 data frame columns

I have a data frame that consists of 2 columns: measure 1 and measure 2. I have provided an example below. I would like to create a heatmap from the data. To do this effectively, I need to bin the values in each column. For measure 1, I want a bin size of 0.1, and for measure 2, a bin size of 0.2. I am able to assign the bins using the code below.

From this, I think the next logical step is to create a count matrix based on the bin assignments for measure 1 and measure 2, and then plot the heatmap.

I have 2 questions:

1) How can I change the names of my bin assignments? Currently they start at 1. I would like to name the bins so that each bin name reflects the maximum value in that bin, not just 1, 2, 3, etc.

2) How can I create a count matrix from the bin assignments?

I look forward to any suggestions. Thanks.

    #test dataframe
    hsim = matrix(rnorm(100 * 2, 1, 0.25), nrow=100, ncol=2, byrow=FALSE)
    colnames(hsim) = c("measure1", "measure2")
    hsim = as.data.frame(hsim)

    #bin measure 1 by bin size of 0.1
    FindBin.m1 = function(data){
      bin = seq(from=0.52, to=1.6, by=.1) #Specify the bin boundaries
      findInterval(data$measure1, bin) #Return the index of the bin each value falls in
      }

    hsim$m1bin = FindBin.m1(hsim)

    #bin measure 2 by bin size of 0.2
    FindBin.m2 = function(data){
      bin = seq(from=0.4, to=1.6, by=.2) #Specify the bin boundaries
      findInterval(data$measure2, bin) #Return the index of the bin each value falls in
      }

    hsim$m2bin = FindBin.m2(hsim)

    #how would I rename the bin indices in the functions so that they reflect the max number in the bin?
    #for example, in FindBin.m1, bin index 1 represents 0.52 to 0.62. I want to name the bin 0.62 not 1

    #create a count matrix from the m1 and m2 bin assignments that can be used to plot a heatmap

    #plot heatmap (matrix.to.plot is a placeholder for the count matrix I still need to create)
    heatmap(matrix.to.plot)

I figured out how to make the count matrix, and I also tried plotting it from a data frame with ggplot. Here is the code that I ended up adding to the above.

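    library(ggplot2) #needed for the ggplot() call below

    hsim2 = hsim[,3:4] #keep only the bin columns (m1bin, m2bin)
    hsim2.t = table(hsim2) #cross-tabulate the bins into a count matrix
    #basic heatmap using the count matrix
    heatmap(hsim2.t)

    hsim2.t2 = as.data.frame(hsim2.t) #long format: m1bin, m2bin, Freq
    #make a nicer looking heatmap
    ggplot(hsim2.t2, aes(m1bin, m2bin)) + geom_tile(aes(fill = Freq)) + scale_fill_gradient(low = "white",high = "steelblue")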
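
One thing to watch for with `table()` on plain integer bin indices: a bin that received no observations does not appear at all, so the heatmap axes can have gaps. The following is a minimal sketch, not part of the original post, that assumes the same breakpoints as the question and converts the indices to factors with explicit levels so that every bin shows up in the count matrix. The fixed seed and the names `breaks1`, `breaks2`, and `counts` are assumptions made for illustration.

```r
set.seed(42)  # assumption: fixed seed, only to make the sketch reproducible
hsim <- data.frame(measure1 = rnorm(100, 1, 0.25),
                   measure2 = rnorm(100, 1, 0.25))

breaks1 <- seq(from = 0.52, to = 1.6, by = 0.1)
breaks2 <- seq(from = 0.4, to = 1.6, by = 0.2)

# findInterval() returns 0 for values below the first break and
# length(breaks) for values at or above the last break, so the
# possible indices are 0, 1, ..., length(breaks).
m1bin <- factor(findInterval(hsim$measure1, breaks1), levels = 0:length(breaks1))
m2bin <- factor(findInterval(hsim$measure2, breaks2), levels = 0:length(breaks2))

# table() keeps every factor level, including bins with a count of zero
counts <- table(m1bin = m1bin, m2bin = m2bin)
dim(counts)
```

The resulting `counts` can be passed to `heatmap()` or `as.data.frame()` in the same way as `hsim2.t` above.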

That is good enough for me. I will figure out the renaming of the bins. I hope this helps anyone else trying to do the same thing.
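
For the renaming, one possible approach is base R's `cut()`, which bins a numeric vector and accepts a `labels` argument. The sketch below is an illustration under stated assumptions, not the poster's solution: `bin_by_upper_edge` is a hypothetical helper, the measure 1 breaks are extended by one step to 1.62 so that the last bin has an upper edge, and `right = FALSE` is used so that the intervals are closed on the left, matching `findInterval()`. Values outside the breaks become `NA` rather than being assigned to an end bin.

```r
set.seed(42)  # assumption: fixed seed, only to make the sketch reproducible
hsim <- data.frame(measure1 = rnorm(100, 1, 0.25),
                   measure2 = rnorm(100, 1, 0.25))

# Hypothetical helper: bin x on the given breaks and label each bin
# with its upper edge, formatted to a fixed number of decimals.
bin_by_upper_edge <- function(x, breaks, digits = 2) {
  labels <- formatC(breaks[-1], format = "f", digits = digits)
  cut(x, breaks = breaks, labels = labels, right = FALSE)
}

breaks1 <- seq(from = 0.52, to = 1.62, by = 0.1)  # bin size 0.1
breaks2 <- seq(from = 0.4, to = 1.6, by = 0.2)    # bin size 0.2

hsim$m1bin <- bin_by_upper_edge(hsim$measure1, breaks1)
hsim$m2bin <- bin_by_upper_edge(hsim$measure2, breaks2, digits = 1)

# The count matrix now carries the upper edges as row and column names
counts <- table(m1bin = hsim$m1bin, m2bin = hsim$m2bin)
levels(hsim$m1bin)
```

Because `cut()` returns a factor, empty bins are kept in `counts`, and the labels carry through to the axes of `heatmap()` or `ggplot()`.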
