R stacked bar charts including “other” (using ggplot2)

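I want to make a stacked bar chart showing the number of taxa at two locations across three different seasons. I am using ggplot2. I can produce the plot, but I have 48 taxa, so the bars end up with far too many different colours. Only eight taxa occur frequently and in large numbers, so I would like to group all the remaining taxa as "Other".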

My data look like this:

SampleID     TransectID     SampleYear     Season     Location    Taxa1     Taxa2     Taxa3 .... Taxa48
BW15001              1            2015     fall        SiteA         25         0         0           0
BW15001              2            2015     fall        SiteA         32         0         0           2
BW15001              2            2015     fall        SiteA          6         0        45           0
BW15001              3            2015     fall        SiteA         78         1         2           0   

This is what I have tried (modified from here):

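# Convert counts to per-sample proportions, then sort the taxa columns by total abundance
y <- rowSums(invert[6:54])
x <- invert[6:54]/y
x <- x[,order(-colSums(x))]  # reorder x itself; indexing invert here would pick the wrong columns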

#Extract list of top N Taxa
N<-8
taxa_list<-colnames(x)[1:N]

#remove "__Unknown__" and add it to others
taxa_list<-taxa_list[!grepl("Unknown",taxa_list)]
N<-length(taxa_list)

#Generate a new table with everything added to Others
new_x<-data.frame(x[,colnames(x) %in% taxa_list],
              Others=rowSums(x[,!colnames(x) %in% taxa_list]))
df<-NULL
for (i in 1:dim(new_x)[2]){
  # NOTE: grouping_info comes from the script this was adapted from; it is not defined in this post
  tmp<-data.frame(row.names=NULL,Sample=rownames(new_x),
                  Taxa=rep(colnames(new_x)[i],dim(new_x)[1]),
                  Value=new_x[,i],Type=grouping_info[,1])
  if(i==1){df<-tmp} else {df<-rbind(df,tmp)}
}

Plotting the graph:

colours <- c("#F0A3FF", "#0075DC", "#993F00","#4C005C","#2BCE48","#FFCC99","#808080","#94FFB5","#8F7C00","#9DCC00","#C20088","#003380","#FFA405","#FFA8BB","#426600","#FF0010","#5EF1F2","#00998F","#740AFF","#990000","#FFFF00");

library(ggplot2)
p<-ggplot(df,aes(Sample,Value,fill=Taxa))+
   geom_bar(stat="identity")+
   facet_grid(. ~ Type, drop=TRUE,scale="free",space="free_x")
p<-p+scale_fill_manual(values=colours[1:(N+1)])
p<-p+theme_bw()+ylab("Proportions")
p<-p+ scale_y_continuous(expand = c(0,0))+
  theme(strip.background = element_rect(fill="gray85"))+
  theme(panel.spacing = unit(0.3, "lines"))
p<-p+theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5))
p

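The main thing I need help with today is pulling out the dominant taxa and lumping the rest together as "Other". I think I can then use facet_grid() later to group the plot by season and location...

Thanks!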

One way to do it:

library(plyr)      # ddply()
library(reshape2)  # melt()
library(ggplot2)
d=data.frame(SampleID=rep('BW15001',4),
             TransectID=c(1,2,2,3),
             SampleYear=rep(2015,4),
             Taxa1=c(25,32,6,78),
             Taxa2=c(0,0,0,1),
             Taxa3=c(0,0,45,3))
#Reshape the df so that all taxa columns are melted into two
d=melt(d,id=colnames(d[,1:3]))
d$variable=as.character(d$variable)

# rename all uninteresting taxa as 'other'
`%ni%` <- Negate(`%in%`) # Here I decided to select the ones to keep, but the other way around is fine as well of course
d[d$variable %ni% c('Taxa1','Taxa2'),'variable']='Other' #here you could add a function to automatically determine which taxa you want to keep, as you already did

# aggregate all data for 'other'
d=ddply(d,colnames(d[,1:4]),summarise,value=sum(value)) 

#make your plot, this one is just a bad example
# (faceting by TransectID here, because this toy data frame has no Type column)
ggplot(d,aes(SampleID,value,fill=variable))+
  geom_bar(stat="identity")+
  facet_grid(. ~ TransectID, drop=TRUE,scales="free",space="free_x")
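The hard-coded keep list c('Taxa1','Taxa2') above could be replaced by an automatic top-N selection. What follows is only a minimal sketch in base R: the toy data frame, its values, and the choice N = 2 are made up for illustration, and the column names variable/value simply mirror what melt() produces.

```r
# Toy long-format data (made-up counts), shaped like the melted `d` above
d <- data.frame(
  SampleID = rep(c("S1", "S2"), each = 4),
  variable = rep(c("Taxa1", "Taxa2", "Taxa3", "Taxa4"), times = 2),
  value    = c(25, 0, 3, 1, 32, 10, 0, 2),
  stringsAsFactors = FALSE
)

N <- 2  # number of taxa to keep (hypothetical choice; the question would use 8)

# Total count per taxon, sorted from most to least abundant
totals <- sort(tapply(d$value, d$variable, sum), decreasing = TRUE)
keep   <- names(totals)[seq_len(min(N, length(totals)))]

# Relabel everything outside the top N as "Other"
d$variable[!d$variable %in% keep] <- "Other"

# Collapse the relabelled rows so each sample has a single "Other" row
d_lumped <- aggregate(value ~ SampleID + variable, data = d, FUN = sum)
```

d_lumped can then be passed to ggplot() in place of d. Ranking by total count is one possible criterion; ranking by the number of samples a taxon occurs in would be an equally reasonable alternative.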

Expanding on my comment: take a look at the forcats package. It is hard to say without a complete example, but the following should work:

library(tidyverse)
library(forcats)

temp <- df %>%
  gather(taxa, amount, -c(1:5))

# Reshape the data so that there is one row per counted individual
tidy_df <- temp[rep(rownames(temp), times = temp$amount), ]

tidy_df %>%
  select(-amount) %>%
  mutate(taxa = fct_lump(taxa, n = 2)) %>%       # Check out this line
  ggplot(., aes(x = SampleID, fill = taxa)) +
    geom_bar()

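You can change fct_lump(taxa, n = 2) to fct_lump(taxa, n = 8) to keep the top 8 categories and lump the rest. Alternatively, you can use fct_lump(taxa, prop = 0.9) to lump by proportion. Note that prop is the minimum share of rows a level needs in order to be kept, so a threshold that high lumps almost every level; a small value is usually what you want.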
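To make the difference between n and prop concrete, here is a small sketch on made-up labels. It also shows the w argument of fct_lump(), which weights each row by a count and so might avoid replicating rows first; the vector and data frame below are invented for illustration only.

```r
library(forcats)

# Toy vector of taxa labels, one element per counted individual (made-up data)
taxa <- c(rep("Taxa1", 6), rep("Taxa2", 3), "Taxa3", "Taxa4")

# Keep the 2 most frequent levels; the rest become "Other"
by_n <- fct_lump(taxa, n = 2)

# Keep levels that make up more than 20% of the rows; the rest become "Other"
by_prop <- fct_lump(taxa, prop = 0.2)

# Same idea on one-row-per-taxon data, using the counts as weights
counts <- data.frame(
  taxa   = c("Taxa1", "Taxa2", "Taxa3", "Taxa4"),
  amount = c(6, 3, 1, 1),
  stringsAsFactors = FALSE
)
by_w <- fct_lump(counts$taxa, n = 2, w = counts$amount)

table(by_n)
table(by_prop)
```

With the weighted form, the plot would need geom_col() (or geom_bar(stat = "identity")) with amount mapped to y, since the rows are no longer one per individual.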

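If you are only after the "presence" of taxa in the samples (rather than the value or amount), things are a bit simpler and can be handled in a single pipe: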

df %>%
  gather(taxa, amount, -c(1:5)) %>%
  mutate(amount = na_if(amount, 0)) %>%
  na.omit() %>%
  mutate(taxa = fct_lump(taxa, n = 2)) %>%
  ggplot(., aes(x = SampleID, fill = taxa)) +
   geom_bar()
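For the grouping by season and location mentioned in the question, one option is to put Season on the x axis and facet by Location. This is a rough sketch only: the data frame below is generated placeholder data that is assumed to be already lumped, and the column names Season, Location, taxa and amount are assumptions based on the question's table.

```r
library(ggplot2)

# Made-up long-format data, already lumped into two kept taxa plus "Other"
plot_df <- expand.grid(
  Season   = c("spring", "summer", "fall"),
  Location = c("SiteA", "SiteB"),
  taxa     = c("Taxa1", "Taxa2", "Other"),
  stringsAsFactors = FALSE
)
plot_df$amount <- seq_len(nrow(plot_df))  # placeholder values, not real counts

# Fix the level order so "Other" comes last in the legend and stack
plot_df$taxa <- factor(plot_df$taxa, levels = c("Taxa1", "Taxa2", "Other"))

p <- ggplot(plot_df, aes(x = Season, y = amount, fill = taxa)) +
  geom_col() +                 # equivalent to geom_bar(stat = "identity")
  facet_grid(. ~ Location) +   # one panel per location
  ylab("Count")
```

Printing p draws the chart. Using facet_grid(Location ~ Season) with SampleID on the x axis would be another reasonable layout if individual samples should stay visible.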
