
ggplot heatmap gridline formatting geom_tile and geom_rect

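I have been working on a heatmap for several days, but I cannot get the final gridline formatting to work. Please see the code and the attached images below. What I am trying to do is align the gridlines with the edges of the heatmap tiles when using geom_tile(), so that each tile fills the inside of a grid cell like a box. I was able to align the gridlines using geom_raster(), but then the y-axis labels tick at the top or bottom of a tile, and I need them to tick at the center (see the red highlight). I also could not get geom_raster() to draw a white border around the tiles, so the color blocks look a bit disorganized in my original data set. Any help with the formatting code would be greatly appreciated. Thanks very much!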

# The data set in long format
library(dplyr)    # count(), arrange(), %>%
library(ggplot2)
library(forcats)  # fct_reorder()

y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01","2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06","2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v <- data.frame(y, x)

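# Approach 1: geom_tile(), but the gridlines do not align with the borders of the tiles
v %>%
  # count()'s argument is .drop (leading dot); a bare drop = FALSE is treated as an extra grouping column
  count(y, x, .drop = FALSE) %>%
  arrange(n) %>%
  ggplot(aes(x = x, y = fct_reorder(y, n, sum))) +
  geom_tile(aes(fill = n), color = "white", size = 0.25)  # ggplot2 >= 3.4.0 prefers linewidth = 0.25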

Need the tile borders to align with the gridlines

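I tried running similar code from another post, but I could not get it to work. I think that, because my x variable is a count variable of the y variable, it cannot be formatted as a factor variable to specify xmin and xmax in geom_rect().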
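For reference, here is a minimal sketch of how geom_rect() could be fed from discrete values: convert each variable to a factor, take its integer codes, and place the rectangle edges half a unit either side. The column names xi and yi and the object names rects and p are made up for this illustration and do not come from the post mentioned above.

```r
library(ggplot2)
library(dplyr)
library(forcats)

# Same toy data as in the question
y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01",
       "2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06",
       "2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v <- data.frame(y, x)

rects <- v %>%
  count(y, x) %>%
  mutate(
    y  = fct_reorder(y, n, sum),  # order rows by total count
    x  = factor(x),
    xi = as.integer(x),           # integer position of each date level (illustrative name)
    yi = as.integer(y)            # integer position of each y level (illustrative name)
  )

# Each rectangle spans half a unit either side of its integer position
p <- ggplot(rects) +
  geom_rect(aes(xmin = xi - 0.5, xmax = xi + 0.5,
                ymin = yi - 0.5, ymax = yi + 0.5, fill = n),
            color = "white") +
  scale_x_continuous(breaks = seq_along(levels(rects$x)), labels = levels(rects$x)) +
  scale_y_continuous(breaks = seq_along(levels(rects$y)), labels = levels(rects$y))
# print(p) draws the plot
```

Because the axes are continuous here, the date and name labels have to be supplied through breaks and labels by hand.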

# Approach 2: geom_raster(), but the y-axis labels cannot tick at the center of the tiles
# and there is no border around the tiles to tell them apart

v %>%
  # count()'s argument is .drop (leading dot); a bare drop = FALSE is treated as an extra grouping column
  count(y, x, .drop = FALSE) %>%
  arrange(n) %>%
  ggplot() +
  geom_raster(aes(x = x, y = fct_reorder(y, n, sum), fill = n), hjust = 0, vjust = 0)

Need the y-axis labels to tick at the center of the tiles, and need a border around the tiles

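I think it makes sense to leave the ticks and gridlines where they are. To still achieve what you are looking for, I would suggest expanding your data to include all possible combinations and setting na.value to a neutral fill color: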

library(tidyr)  # expand()

# all possible combinations of y and x
all <- v %>% expand(y, x)

# count, then join with all; n will be NA for obs. in all that are not present in v
v <- v %>%
  count(y, x) %>%
  right_join(all, by = c("y", "x"))

ggplot(data = v,
       aes(x = x, y = fct_reorder(y, n, function(x) sum(x, na.rm = TRUE)))) +  # note that you must account for the NA values now
  geom_tile(aes(fill = n), color = "white", size = 0.25) +
  scale_fill_continuous(na.value = "grey90") +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))
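As a possible alternative to the expand-and-join steps, tidyr::complete() can fill in the missing combinations in a single call. This is only a sketch under the assumption that the data looks like the toy data from the question; the object name counts is made up for this illustration.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question
y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01",
       "2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06",
       "2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")

counts <- data.frame(y, x) %>%
  count(y, x) %>%   # one row per observed y/x pair
  complete(y, x)    # add the unobserved pairs, with n = NA

# 4 y values times 7 dates gives one row per cell of the heatmap
nrow(counts)
```

Passing fill = list(n = 0) to complete() would store zeros instead of NA, in which case na.value is no longer needed, but empty cells then get the color for zero rather than a neutral grey.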

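This is a bit hacky. My approach converts the categorical variables to numerics, which adds minor gridlines to the plot that are aligned with the tiles. To get rid of the major gridlines I simply use theme(). Drawback: breaks and labels have to be set manually.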

library(ggplot2)
library(dplyr)
library(forcats)

v1 <- v %>%
  count(y, x, .drop = FALSE) %>%  # count()'s argument is .drop; a bare drop = FALSE adds a grouping column
  arrange(n) %>%
  mutate(y = fct_reorder(y, n, sum),
         y1 = as.integer(y),
         x = factor(x),
         x1 = as.integer(x))

labels_y <- levels(v1$y)
breaks_y <- seq_along(labels_y)

labels_x <- levels(v1$x)
breaks_x <- seq_along(labels_x)

ggplot(v1, aes(x=x1, y=y1))+
  geom_tile(aes(fill=n), color="white", size=0.25) + 
  scale_y_continuous(breaks = breaks_y, labels = labels_y) +
  scale_x_continuous(breaks = breaks_x, labels = labels_x) +
  theme(panel.grid.major = element_blank())

Created on 2020-05-23 by the reprex package (v0.3.0)

EDIT: Checked with long variable names

y<- c("John Doe","John Doe","John Doe","John Doe","Mary Jane","Mary Jane","Mary Jane","Mary Jane","Mary Jane","C","C","C","D","D","D")
x<- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01","2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06","2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v<-data.frame(y,x)

Created on 2020-05-23 by the reprex package (v0.3.0)
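The numeric-axis approach relies on ggplot2 placing minor gridlines halfway between the integer breaks. A hedged variation is to state the tile edges explicitly through minor_breaks, so the alignment does not depend on that default. The helper tile_edges() below is hypothetical (it is not part of ggplot2), and the rest is a sketch on the question's toy data.

```r
library(ggplot2)
library(dplyr)
library(forcats)

# Hypothetical helper: edge positions for tiles centered at 1, 2, ..., n_levels
tile_edges <- function(n_levels) seq(0.5, n_levels + 0.5, by = 1)

# Same toy data as in the question
y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01",
       "2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06",
       "2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")

v1 <- data.frame(y, x) %>%
  count(y, x) %>%
  mutate(y  = fct_reorder(y, n, sum),
         x  = factor(x),
         y1 = as.integer(y),   # integer codes used as plot positions
         x1 = as.integer(x))

labels_y <- levels(v1$y)
labels_x <- levels(v1$x)

p <- ggplot(v1, aes(x = x1, y = y1)) +
  geom_tile(aes(fill = n), color = "white") +
  scale_x_continuous(breaks = seq_along(labels_x), labels = labels_x,
                     minor_breaks = tile_edges(length(labels_x)),
                     expand = c(0, 0)) +
  scale_y_continuous(breaks = seq_along(labels_y), labels = labels_y,
                     minor_breaks = tile_edges(length(labels_y)),
                     expand = c(0, 0)) +
  theme(panel.grid.major = element_blank())  # keep only the edge-aligned minor gridlines
# print(p) draws the plot
```

Setting expand = c(0, 0) removes the padding around the outermost tiles, so the first and last gridlines coincide with the panel border.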

