Grouped bar chart with 4 variables in R

Question

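I'm a beginner in R and I've been struggling to work out how to plot this graph.

I have 4 variables (% gravel, % sand and % silt at five places). I'm trying to plot the percentages of these 3 sediment types (y) at each station (x). So there should be 5 groups on the x-axis, with 3 bars per group.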


   Station   % gravel    % sand      % silt
1   PRA1    28.430000   70.06000    1.507000
2   PRA3    19.515000   78.07667    2.406000
3   PRA4    19.771000   78.63333    1.598333
4   PRB1    7.010667    91.38333    1.607333
5   PRB2    18.613333   79.62000    1.762000

I tried to draw a grouped bar chart:

grao <- read_excel("~/Desktop/Masters/Data/grao.xlsx")
colors <- c('#999999','#E69F00','#56B4E9','#94A813','#718200')
barplot(table(grao$Station, grao$`% gravel`, grao$`% sand`, grao$`% silt`), beside = TRUE, col = colors)

But this error message keeps coming up:

'height' must be a vector or a matrix

I've also tried:

ggplot(grao, aes(Station, color=as.factor(`% gravel`), shape=as.factor(`% sand`))) + 
geom_bar() + scale_color_manual(values=c('#999999','#E69F00','#56B4E9','#94A813','#718200')+ theme(legend.position="top")

But it produces a crazy-looking graph.

Could someone help me, please? I've been stuck on this for weeks.

Cheers

Answer 1

I think this may be what you're looking for:

#install.packages("tidyverse")
library(tidyverse)
df <-  data.frame(
  station = c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"),
  gravel = c(28.4, 19.5, 19.7, 7.01, 18.6),
  sand = c(70.06, 78.07, 78.63, 91, 79),
  silt = c(1.5, 2.4, 1.6, 1.7, 1.66)
)

df2 <- df %>% 
  pivot_longer(cols = c("gravel", "sand", "silt"), names_to = "Sediment_Type", values_to = "Percentage")

ggplot(df2) +
  geom_bar(aes(x = station, y = Percentage, fill = Sediment_Type ), stat = "identity", position = "dodge") +
  theme_minimal()  # theme_minimal() ships with ggplot2 (loaded by tidyverse)

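This gives a dodged bar chart: five station groups on the x-axis, each with three bars (gravel, sand, silt).

You need to "pivot" your dataset "longer". Part of the tidy approach is making sure every column represents exactly one variable. In your initial data frame, the names of the three sediment columns are really values of a single variable ("Sediment_Type"), and the cells underneath them are values of another ("Percentage"). The function pivot_longer() takes the dataset, gathers those columns, and turns them into just two: one holding the identity (the old column name) and one holding the value.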
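To make the reshaping concrete, here is a minimal sketch of what pivot_longer() does to a cut-down version of the table. The two-station `wide` data frame and its rounded values are only for illustration; they are not the full dataset.

```r
library(tidyr)

# Wide layout: one column per sediment type (two stations, rounded values)
wide <- data.frame(
  station = c("PRA1", "PRA3"),
  gravel  = c(28.4, 19.5),
  sand    = c(70.1, 78.1),
  silt    = c(1.5, 2.4)
)

# Long layout: one row per station/sediment combination
long <- pivot_longer(
  wide,
  cols      = c(gravel, sand, silt),
  names_to  = "Sediment_Type",   # old column names end up here
  values_to = "Percentage"       # old cell values end up here
)

dim(long)  # 6 rows (2 stations x 3 sediment types) and 3 columns
long
```

Each station now appears three times, once per sediment type, which is the shape ggplot needs in order to map sediment type to an aesthetic.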

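Once you've done this, ggplot lets you specify the x-axis variable and then a grouping variable via "fill". You can swap the two around. If you end up with lots of data and grouping variables, faceting is also an option worth looking into!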
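As a rough sketch of those two variations, the code below swaps the x and fill roles and then draws a faceted version. The `df_long` data frame is a hand-built stand-in with the same shape as df2 (values rounded), and geom_col() is used as the shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Long-format stand-in with the same columns as df2 (rounded values)
df_long <- data.frame(
  station       = rep(c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"), each = 3),
  Sediment_Type = rep(c("gravel", "sand", "silt"), times = 5),
  Percentage    = c(28.4, 70.1, 1.5,
                    19.5, 78.1, 2.4,
                    19.8, 78.6, 1.6,
                     7.0, 91.4, 1.6,
                    18.6, 79.6, 1.8)
)

# Swapped roles: sediment type on the x-axis, one bar colour per station
p_swapped <- ggplot(df_long, aes(x = Sediment_Type, y = Percentage, fill = station)) +
  geom_col(position = "dodge")

# Faceted: one panel per station, so no dodging is needed
p_facet <- ggplot(df_long, aes(x = Sediment_Type, y = Percentage)) +
  geom_col() +
  facet_wrap(~ station)

print(p_swapped)
print(p_facet)
```

Which layout reads best depends on the comparison you care about: grouping by station compares sediment types within a site, while grouping by sediment type compares sites.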

Hope this helps,

Brennan

Answer 2

barplot wants a "matrix", ideally with both dimension names set. You can convert your data like this (dropping the first column while using it for the row names):

dat <- `rownames<-`(as.matrix(grao[, -1]), grao[[1]])  # [[1]] returns a plain vector, also for a tibble

You'll see that barplot already does the tabulation for you. However, you could also use xtabs (table is probably not the right function for your approach).

# dat <- xtabs(cbind(X..gravel, X..sand, X..silt) ~ Station, grao)  ## alternatively
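To see why the table() call in the question cannot work, here is a small sketch on a two-station stand-in (`grao_demo`, with simplified column names and rounded values, is made up for illustration). table() counts combinations of values rather than keeping the percentages, and with four arguments it returns a four-dimensional array.

```r
# Two-station stand-in for the question's data (rounded values)
grao_demo <- data.frame(
  Station = c("PRA1", "PRA3"),
  gravel  = c(28.4, 19.5),
  sand    = c(70.1, 78.1),
  silt    = c(1.5, 2.4)
)

# table() cross-tabulates the four columns and counts occurrences
tab <- table(grao_demo$Station, grao_demo$gravel, grao_demo$sand, grao_demo$silt)
length(dim(tab))  # 4 dimensions, so neither a vector nor a matrix
sum(tab)          # 2: one count per data row; the percentages are not the cell values

# A plain numeric matrix with row names is what barplot() can draw
m <- as.matrix(grao_demo[, -1])
rownames(m) <- grao_demo$Station
is.matrix(m)
```

barplot() accepts only a vector or a two-dimensional matrix as its height argument, which is why the four-dimensional table triggers the error from the question.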

I'd advise you to use proper variable names, since special characters in names are not a good idea.

colnames(dat) <- c("gravel", "sand", "silt")
dat
#         gravel     sand     silt
# PRA1 28.430000 70.06000 1.507000
# PRA3 19.515000 78.07667 2.406000
# PRA4 19.771000 78.63333 1.598333
# PRB1  7.010667 91.38333 1.607333
# PRB2 18.613333 79.62000 1.762000

Then barplot knows what's going on.

.col <- c('#E69F00','#56B4E9','#94A813')  ## pre-define colors
barplot(t(dat), beside=TRUE, col=.col, ylim=c(0, 100),  ## barplot
        main="Here could be your title", xlab="sample", ylab="perc.")
legend("topleft", colnames(dat), pch=15, col=.col, cex=.9, horiz=TRUE, bty="n")  ## legend
box()  ## put it in a box

Data:

grao <- read.table(text="   Station   '% gravel'    '% sand'      '% silt'
1   PRA1    28.430000   70.06000    1.507000
2   PRA3    19.515000   78.07667    2.406000
3   PRA4    19.771000   78.63333    1.598333
4   PRB1    7.010667    91.38333    1.607333
5   PRB2    18.613333   79.62000    1.762000 ", header=TRUE)
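A side note on the column names: with the default check.names = TRUE, read.table() rewrites headers such as '% gravel' into syntactically valid names, which is where the X..gravel names in the xtabs line come from. The sketch below illustrates this on a shortened, two-row copy of the data; the `txt` and `raw` objects are illustrative only.

```r
txt <- "Station '% gravel' '% sand' '% silt'
PRA1 28.43 70.06 1.507
PRA3 19.515 78.07667 2.406"

# Default check.names = TRUE: headers are passed through make.names()
names(read.table(text = txt, header = TRUE))

# check.names = FALSE keeps the original headers, which then need backticks
raw <- read.table(text = txt, header = TRUE, check.names = FALSE)
names(raw)

# Renaming once up front avoids both the X.. prefixes and the backticks
names(raw) <- c("station", "gravel", "sand", "silt")
raw$gravel
```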


Answer 1 (score 1), 2020-04-29 16:19:52

Answer 2 (score 0, accepted), 2020-04-29 16:36:21
