
Create matrix for heatmap from 2 data frame columns

I have a data frame that consists of 2 columns: measure 1 and measure 2. I have provided an example below. I would like to create a heatmap from the data. To effectively do this, I need to bin the values in each column. For measure 1, I want bin sizes of 0.1 and for measure 2 I want bin sizes of 0.2. I am able to assign the bins using the code below.

From this I think the next logical step is to create a count matrix based on the bin assignments for measure 1 and measure 2 and then plot the heatmap.

I have 2 questions:

1) How can I change the names of my bin assignments? Currently they start at 1. I would like to name the bins so the bin name reflects the maximum value in that bin, not just 1,2,3, etc.

2) How can I create a count matrix from the bin assignments?

I look forward to any suggestions. Thanks.

    #test dataframe
    hsim = matrix(rnorm(100 * 2, 1, 0.25), nrow=100, ncol=2, byrow=FALSE)
    colnames(hsim) = c("measure1", "measure2")
    hsim = as.data.frame(hsim)

    #bin measure 1 by bin size of 0.1
    FindBin.m1 = function(data){
      bin = seq(from=0.52, to=1.6, by=.1) #bin breakpoints: 0.52, 0.62, ..., 1.52
      findInterval(data$measure1, bin) #return the index of the bin each value falls in
    }

    hsim$m1bin = FindBin.m1(hsim)

    #bin measure 2 by bin size of 0.2
    FindBin.m2 = function(data){
      bin = seq(from=0.4, to=1.6, by=.2) #bin breakpoints: 0.4, 0.6, ..., 1.6
      findInterval(data$measure2, bin) #return the index of the bin each value falls in
    }

    hsim$m2bin = FindBin.m2(hsim)

    #how would I rename the bin indices in the functions so that they reflect the max number in the bin?
    #for example, in FindBin.m1, bin index 1 represents 0.52 to 0.62. I want to name the bin 0.62 not 1

    #create a count matrix from the m1 and m2 bin assignments that can be used to plot a heatmap

    #plot heatmap
    heatmap(matrix.to.plot)
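
For the first question, one possible approach is to turn each `findInterval` index back into the upper edge of its bin. The sketch below assumes equally spaced breaks; `label_upper_edge` and its arguments are hypothetical names, not part of the code above, and the rounding to 2 decimals is an assumption that suits the bin widths used here.

```r
# Hypothetical helper: label each value by the upper edge of its bin.
# Assumes equally spaced breaks starting at `from` with step `width`.
label_upper_edge <- function(x, from, to, width) {
  breaks <- seq(from = from, to = to, by = width)
  idx <- findInterval(x, breaks)  # 0 means "below the first break"
  # Bin i spans [from + (i - 1) * width, from + i * width),
  # so its upper edge is from + i * width.
  round(from + idx * width, 2)    # rounding hides floating-point noise
}

# Illustrative values, not taken from the simulated data
x <- c(0.55, 0.95, 1.30)
labels1 <- label_upper_edge(x, from = 0.52, to = 1.6, width = 0.1)
print(labels1)  # 0.62 1.02 1.32
```

Values below the first break get the first break itself as their label, and the open-ended last bin gets a nominal upper edge one width past the last break, so both tails may need separate handling.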

I figured out how to make the count matrix and also tried it as a data frame using ggplot. Here is the code that I ended up adding to the above.

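    hsim2 = hsim[, c("m1bin", "m2bin")] #keep only the two bin-index columns
    hsim2.t = table(hsim2) #count matrix: rows are m1 bins, columns are m2 bins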
    #basic heatmap using the count matrix
    heatmap(hsim2.t)

    library(ggplot2)
    hsim2.t2 = as.data.frame(hsim2.t) #long format with columns m1bin, m2bin, Freq
    #make a nicer looking heatmap
    ggplot(hsim2.t2, aes(m1bin, m2bin)) + geom_tile(aes(fill = Freq)) + scale_fill_gradient(low = "white", high = "steelblue")

That makes it good enough for me. I will figure out the renaming of bins. Hope this helps anyone else trying to do the same thing.
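
On the renaming, a possible route is `cut()`, which bins and labels in one step. This is only a sketch under some assumptions: `bin_by_upper_edge` is a hypothetical helper, the measure 1 breaks are extended to 1.62 so the last bin is closed, and values outside the breaks become `NA` and are dropped by `table()`. Note that `cut()` uses right-closed intervals by default, whereas `findInterval()` uses left-closed ones, so values exactly on a break can land in a different bin.

```r
# Hypothetical helper: bin x on the given breaks and label every bin
# with its upper edge, formatted to a fixed number of decimals.
bin_by_upper_edge <- function(x, breaks, digits = 2) {
  cut(x, breaks = breaks,
      labels = formatC(breaks[-1], format = "f", digits = digits),
      include.lowest = TRUE)
}

# Same kind of test data as in the question
set.seed(42)
hsim <- data.frame(measure1 = rnorm(100, 1, 0.25),
                   measure2 = rnorm(100, 1, 0.25))

breaks1 <- seq(from = 0.52, to = 1.62, by = 0.1)  # assumption: extended to 1.62
breaks2 <- seq(from = 0.4, to = 1.6, by = 0.2)

hsim$m1bin <- bin_by_upper_edge(hsim$measure1, breaks1)
hsim$m2bin <- bin_by_upper_edge(hsim$measure2, breaks2)

# Because the bins are factors, empty bins still appear as zero counts.
counts <- table(m1bin = hsim$m1bin, m2bin = hsim$m2bin)
```

`as.data.frame(counts)` should then give the same `m1bin`, `m2bin`, `Freq` columns as before, with the axis labels showing the upper bin edges instead of 1, 2, 3.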
