I want to make a heatmap in R, but I could not convert my data frame into an appropriate matrix form. The data frame has three columns (protein, value and treatment), and there is some repetition in the protein and treatment columns. Could someone help me build a suitable matrix from this data, and tell me which package is best for making the heatmap? I have 1000 proteins (some of them are repeated across treatments) and 4 treatment groups. I am a beginner in R and would really appreciate your help. Thank you in advance.
Example:
protein value treatment
EPN1 0.986 treat1
LAMB1 0.881 treat2
PKP4 0.827 treat2
PKP2 0.739 treat3
BAIAP2 0.519 treat2
UTRN 0.502 treat4
REPS2 0.481 treat2
PKP4 0.365 treat1
LAMC1 -0.529 treat2
PPIB 2.86 treat4
You can do either of the following two things.
df <- read.table(header = TRUE, text = "protein value treatment
EPN1 0.986 treat1
LAMB1 0.881 treat2
PKP4 0.827 treat2
PKP2 0.739 treat3
BAIAP2 0.519 treat2
UTRN 0.502 treat4
REPS2 0.481 treat2
PKP4 0.365 treat1
LAMC1 -0.529 treat2
PPIB 2.86 treat4 ")
library(tidyverse)
df %>% ggplot(aes(x= treatment, y = protein, fill = value)) +
geom_tile()
OR
library(echarts4r)
df |>
e_charts(protein) |>
e_heatmap(treatment, value) |>
e_visual_map(value)
Here is some example data.
dat <- data.frame(
protein=replicate(100, paste(sample(LETTERS, 4), collapse="")),
value=rnorm(100),
treatment=paste0("treat", sample(1:4, 100, replace=TRUE)),
stringsAsFactors=FALSE
)
Using ggplot2, you could do:
library(ggplot2)
plt <- ggplot(dat, aes(treatment, protein, fill=value)) + geom_tile()
You can find more options here: https://www.r-graph-gallery.com/79-levelplot-with-ggplot2.html
However, I don't know how to deal with a lot of proteins to plot (as you mentioned). Do you need to see the names of the proteins?
EDIT: one possibility for 1000 proteins would be to make the chart really long, like so:
ggsave(
"long.pdf", plot=plt, device="pdf",
width=21, height=150, units="cm", limitsize=FALSE
)
This creates a PDF in the current folder. Using the zoom function of your PDF-Viewer, you can then navigate to the rows of interest.
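EDIT 2: For more complex charts I (still) rely on base R, but maybe there are some ggplot-style packages I am not aware of. A base R solution requires converting the data into a matrix first. One approach is to use a sparse matrix, as below. Note that with sparseMatrix, protein/treatment combinations that do not occur in the data end up as 0, and values for repeated protein/treatment pairs are added together.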
dim_x <- unique(dat$protein)
dim_y <- unique(dat$treatment)
map_x <- setNames(seq_along(dim_x), dim_x)
map_y <- setNames(seq_along(dim_y), dim_y)
library(Matrix)
mat <- sparseMatrix(
i=map_x[dat$protein], j=map_y[dat$treatment], x=dat$value,
dims=c(length(dim_x), length(dim_y)), dimnames=list(dim_x, dim_y)
)
Then you can use the base R heatmap function,
heatmap(as.matrix(mat))
or a more customizable function like
library(pheatmap)
pheatmap(as.matrix(mat))
both of which show dendrograms.
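If you would rather avoid the sparse matrix, a dense matrix can also be built with base R alone. The following is only a minimal sketch under a few assumptions of mine: repeated protein/treatment pairs are averaged (the question does not say how duplicates should be combined), missing combinations are kept as NA rather than 0, and the last row of the toy data is an invented duplicate added purely to show the averaging.

```r
# Toy data in the same long format as the question.
# The last row is a made-up duplicate of PKP4/treat1, for illustration only.
df <- data.frame(
  protein   = c("EPN1", "LAMB1", "PKP4", "PKP2", "PKP4", "PKP4"),
  value     = c(0.986, 0.881, 0.827, 0.739, 0.365, 0.635),
  treatment = c("treat1", "treat2", "treat2", "treat3", "treat1", "treat1"),
  stringsAsFactors = FALSE
)

# Rows = proteins, columns = treatments.
# Duplicated pairs are averaged (an assumption); pairs that never occur become NA.
mat <- tapply(df$value, list(df$protein, df$treatment), mean)

# Hierarchical clustering cannot handle NA distances, so the dendrograms
# and the row scaling are switched off here; NA cells are simply left blank.
heatmap(mat, Rowv = NA, Colv = NA, scale = "none")
```

If you do want the dendrograms, you would first have to decide how to fill the NA cells (for example with 0), which is a judgment call that depends on what a missing measurement means in your experiment.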