Grouped barchart in R with 4 variables

I'm a beginner in R and I've been trying to work out how to plot this graphic.

I have 4 variables: the station (five places) and the % of gravel, % of sand and % of silt at each one. I'm trying to plot the percentages of these 3 types of sediment (y) for each station (x). So that's five groups on the x axis and 3 bars per group.


   Station   % gravel    % sand      % silt
1   PRA1    28.430000   70.06000    1.507000
2   PRA3    19.515000   78.07667    2.406000
3   PRA4    19.771000   78.63333    1.598333
4   PRB1    7.010667    91.38333    1.607333
5   PRB2    18.613333   79.62000    1.762000 

I tried plotting a grouped barchart with

grao <- read_excel("~/Desktop/Masters/Data/grao.xlsx")
colors <- c('#999999','#E69F00','#56B4E9','#94A813','#718200')
barplot(table(grao$Station, grao$`% gravel`, grao$`% sand`, grao$`% silt`), beside = TRUE, col = colors)

But I keep getting this error message:

'height' must be a vector or matrix

I also tried

ggplot(grao, aes(Station, color=as.factor(`% gravel`), shape=as.factor(`% sand`))) + 
geom_bar() + scale_color_manual(values=c('#999999','#E69F00','#56B4E9','#94A813','#718200')+ theme(legend.position="top")

But it's creating a crazy graphic.

Could someone help me, please? I've been stuck on this one for weeks now.

Cheers

I think this may be what you are looking for:

#install.packages("tidyverse")
library(tidyverse)
df <-  data.frame(
  station = c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"),
  gravel = c(28.4, 19.5, 19.7, 7.01, 18.6),
  sand = c(70.06, 78.07, 78.63, 91, 79),
  silt = c(1.5, 2.4, 1.6, 1.7, 1.66)
)

df2 <- df %>% 
  pivot_longer(cols = c("gravel", "sand", "silt"), names_to = "Sediment_Type", values_to = "Percentage")

ggplot(df2) +
  geom_bar(aes(x = station, y = Percentage, fill = Sediment_Type), stat = "identity", position = "dodge") +
  theme_minimal() # theme_minimal() ships with ggplot2, so no extra package is needed

This gives:

[Figure: ggplot of the data]

You need to "pivot" your data set "longer". Part of the tidy approach is ensuring that each column represents a single variable. In your initial data frame, the column names gravel, sand and silt are really values of one variable ("Sediment_Type"), and the cells under them are values of another ("Percentage"). The function pivot_longer() takes the columns you name and gathers them into just two: one holding the old column names (the identity) and one holding the values.
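To make the reshaping concrete, here is a minimal sketch on a two-station subset of the data (values rounded from the question's table). The object names wide and long are only illustrative; the call itself is the same pivot_longer() used above.

```r
library(tidyr)

# Wide layout: one row per station, one column per sediment type
wide <- data.frame(
  station = c("PRA1", "PRA3"),
  gravel  = c(28.43, 19.52),
  sand    = c(70.06, 78.08),
  silt    = c(1.51, 2.41)
)

# Long layout: the three sediment columns collapse into a name/value pair
long <- pivot_longer(
  wide,
  cols = c("gravel", "sand", "silt"),
  names_to = "Sediment_Type",
  values_to = "Percentage"
)

print(long)
# Each station now spans three rows, one per sediment type:
#   PRA1 gravel 28.43
#   PRA1 sand   70.06
#   PRA1 silt    1.51
#   PRA3 gravel 19.52
#   PRA3 sand   78.08
#   PRA3 silt    2.41
```

With 5 stations and 3 sediment types, the full data set ends up with 15 rows, which is exactly one row per bar in the plot.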

Once you've done this, ggplot lets you map one variable to the x axis and another to "fill" as the grouping variable. You can switch these two around. If you end up with lots of data and grouping variables, faceting is also an option worth looking into!
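As a rough sketch of those two variations, the code below swaps the x and fill mappings and then builds a faceted version. The long-format data is typed in directly (rounded from the question's table) so the example runs on its own, and the names p_swapped and p_facet are just placeholders. It uses geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Long-format data, entered by hand so this sketch is self-contained
df2 <- data.frame(
  station = rep(c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"), each = 3),
  Sediment_Type = rep(c("gravel", "sand", "silt"), times = 5),
  Percentage = c(28.43, 70.06, 1.51,
                 19.52, 78.08, 2.41,
                 19.77, 78.63, 1.60,
                  7.01, 91.38, 1.61,
                 18.61, 79.62, 1.76)
)

# Swapped roles: sediment type on the x axis, one dodged bar per station
p_swapped <- ggplot(df2, aes(x = Sediment_Type, y = Percentage, fill = station)) +
  geom_col(position = "dodge")

# Faceted: one panel per station, sediment type on the x axis
p_facet <- ggplot(df2, aes(x = Sediment_Type, y = Percentage, fill = Sediment_Type)) +
  geom_col() +
  facet_wrap(~ station)
```

Printing either object, for example print(p_facet), draws the plot. Which layout reads best depends on whether you want to compare sediment types within a station or stations within a sediment type.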

Hope this helps,

Brennan

barplot wants a "matrix", ideally with both dimension names. You could transform your data like this (drop the first column and use it for the row names):

dat <- `rownames<-`(as.matrix(grao[,-1]), grao[[1]])  ## [[1]] returns a plain vector, also when grao is a tibble from read_excel

You will see that barplot already does the tabulation for you. However, you could also use xtabs (table might not be the right function for your approach).

# dat <- xtabs(cbind(X..gravel, X..sand, X..silt) ~ Station, grao)  ## alternatively
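To see why the original table() call leads to the 'height' error, it may help to inspect what it returns. The sketch below rebuilds the question's data by hand, keeping the original column names via check.names = FALSE, and looks at the result: table() counts combinations of values rather than keeping the percentages, and with four arguments it returns a four-dimensional array, which barplot does not accept.

```r
# The question's data, typed in so the sketch runs without the Excel file
grao <- data.frame(
  Station    = c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"),
  `% gravel` = c(28.430000, 19.515000, 19.771000, 7.010667, 18.613333),
  `% sand`   = c(70.06000, 78.07667, 78.63333, 91.38333, 79.62000),
  `% silt`   = c(1.507000, 2.406000, 1.598333, 1.607333, 1.762000),
  check.names = FALSE
)

# The call from the question: a 4-way contingency table
tab <- table(grao$Station, grao$`% gravel`, grao$`% sand`, grao$`% silt`)

dim(tab)        # four dimensions, one per argument
is.matrix(tab)  # FALSE, so barplot() rejects it as 'height'
sum(tab)        # 5: the table holds row counts, not the percentages
```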

I would advise you to use proper variable names, since special characters in names are not the best idea.

colnames(dat) <- c("gravel", "sand", "silt")
dat
#         gravel     sand     silt
# PRA1 28.430000 70.06000 1.507000
# PRA3 19.515000 78.07667 2.406000
# PRA4 19.771000 78.63333 1.598333
# PRB1  7.010667 91.38333 1.607333
# PRB2 18.613333 79.62000 1.762000

Then barplot knows what's going on.

.col <- c('#E69F00','#56B4E9','#94A813')  ## pre-define colors
barplot(t(dat), beside=TRUE, col=.col, ylim=c(0, 100),  ## barplot
        main="Here could be your title", xlab="sample", ylab="perc.")
legend("topleft", colnames(dat), pch=15, col=.col, cex=.9, horiz=TRUE, bty="n")  ## legend
box()  ## put it in a box
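The t() in the barplot call matters: with beside=TRUE, barplot draws one group of bars per column of the matrix and one bar per row within each group. A small sketch, with dat rebuilt by hand from the values printed above, shows how transposing turns the stations into the groups:

```r
# dat as printed above: stations in rows, sediment types in columns
dat <- matrix(
  c(28.430000, 70.06000, 1.507000,
    19.515000, 78.07667, 2.406000,
    19.771000, 78.63333, 1.598333,
     7.010667, 91.38333, 1.607333,
    18.613333, 79.62000, 1.762000),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("PRA1", "PRA3", "PRA4", "PRB1", "PRB2"),
                  c("gravel", "sand", "silt"))
)

dim(dat)          # 5 3: untransposed, this would give 3 groups of 5 bars
dim(t(dat))       # 3 5: transposed, this gives 5 groups of 3 bars
colnames(t(dat))  # the stations, which become the group labels on the x axis
```

Passing dat without t() is also valid, and would group the bars by sediment type instead of by station.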

Data:

grao <- read.table(text="   Station   '% gravel'    '% sand'      '% silt'
1   PRA1    28.430000   70.06000    1.507000
2   PRA3    19.515000   78.07667    2.406000
3   PRA4    19.771000   78.63333    1.598333
4   PRB1    7.010667    91.38333    1.607333
5   PRB2    18.613333   79.62000    1.762000 ", header=TRUE)
