
Manipulating ggplot2 for categorical/continuous variables in R

I am trying to plot categorical and continuous variables together in R. The following code works without "var_4", but I can't get it to work with all of the variables.

Could anyone suggest how to fix this? Also, is it possible to modify the aes() call so that the bars in each graph are colored differently according to their category?


library(ggplot2)
library(gridExtra) 
library(tidyr) 

# Generate data
var_1 <- rnorm(100, 1, 4)
var_2 <- sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.3, 0.7))
var_3 <- sample(LETTERS[1:5], 100, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.1)) 

cluster <- sample(LETTERS[1:4], 100, replace = TRUE,prob = c(2.5, 2.5, 2.5, 2.5)) 

var_4 <- rnorm(100, 1, 10)

f <- data.frame(var_1, var_2, var_3, var_4, cluster)

f$var_2 = as.factor(f$var_2) 
f$var_3 = as.factor(f$var_3) 
f$cluster = as.factor(f$cluster)

levs <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 10 
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 10

breaks <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))

labs <- c(pretty(range(f$var_1)), levs)

f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3", "var_4")) 

ggplot(f, aes(x = value)) + geom_density(data = subset(f, name == "var_1")) + 
  geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) + 
  facet_wrap(cluster~name, ncol = 3, scales = "free") + 
  scale_x_continuous(breaks = breaks, labels = labs) + 
  scale_fill_manual(values = c("deepskyblue4", "gold"), guide = guide_none())

I think the problem here is that you have taken my answer to your previous question and tried to adapt it without really understanding what the various parts do.

As I explained previously, facets shouldn't be used as a way of stitching unrelated plots together. It is possible, but it is hacky and limits extensibility. Adding another variable and custom fill scales for coloring the bars is just about possible, but it means further tweaks and compromises. It will be very hard to apply this method to your real data unless you know what all the pieces do. I have added some comments for clarity:

# Generate data
var_1 <- rnorm(100, 1, 4)
var_2 <- sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.3, 0.7))
var_3 <- sample(LETTERS[1:5], 100, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.1)) 
cluster <- sample(LETTERS[1:4], 100, replace = TRUE,prob = c(2.5, 2.5, 2.5, 2.5)) 
var_4 <- rnorm(100, 1, 10)

f <- data.frame(var_1, var_2, var_3, var_4, cluster)

f$var_2 <- as.factor(f$var_2)
f$var_3 <- as.factor(f$var_3)
f$cluster <- as.factor(f$cluster)

# Reorganise factor data into numeric values, grabbing levels as labels first
levs <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 1000
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 1000

# Calculate the breaks and labels for the x axis
breaks <- c(pretty(range(c(f$var_1, f$var_4)), 8), sort(unique(c(f$var_2, f$var_3))))
labs <- c(pretty(range(c(f$var_1, f$var_4)), 8), levs)

# Pivot data
f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3", "var_4")) 
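
To see what the re-coding step is doing, here is a minimal base-R sketch on toy vectors. The names `cont`, `cat_a`, `cat_b` and `offset` are made up for illustration and are not part of the code above. The categorical levels are turned into integer codes and shifted past the range of the continuous variable, so both kinds of values can share one continuous x axis without overlapping:

```r
# Toy data (hypothetical names, for illustration only)
cont  <- c(-3.2, 0.5, 4.7)   # a continuous variable
cat_a <- c("A", "B", "B")    # categorical, two levels
cat_b <- c("C", "A", "E")    # categorical, different levels

# One shared, sorted set of labels across both categorical variables
levs <- sort(unique(c(cat_a, cat_b)))

# Shift the integer codes well past the continuous range
offset <- ceiling(max(cont)) + 1000
num_a  <- as.numeric(factor(cat_a, levs)) + offset
num_b  <- as.numeric(factor(cat_b, levs)) + offset

# Axis breaks: pretty ticks for the continuous part, then one break per level
breaks <- c(pretty(range(cont)), sort(unique(c(num_a, num_b))))
labs   <- c(pretty(range(cont)), levs)
```

Since the facets use scales = "free", each panel should only span its own data, so the large gap between the continuous values and the shifted codes is not expected to show up inside any single panel.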

Now we can plot:

ggplot(f, aes(x = value)) + 
  geom_density(data = subset(f, name == "var_1")) + 
  geom_density(data = subset(f, name == "var_4")) +
  geom_bar(data = subset(f, name != "var_1" & name != "var_4"), 
           aes(fill = factor(value))) + 
  facet_wrap(cluster~name, ncol = 4, scales = "free") + 
  scale_x_continuous(breaks = breaks, labels = labs) + 
  scale_fill_manual(values = c("red", "orange", "gold", "forestgreen", "deepskyblue4"), 
                    guide = guide_none())

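
For comparison, one common alternative to stitching everything into facets is to build one plot per variable and arrange them with gridExtra::grid.arrange() (gridExtra is already loaded in the question). The following is only a sketch on freshly simulated data: the helper name `plot_var` is hypothetical, and splitting by cluster is left out to keep it short.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)  # reproducible toy data
d <- data.frame(
  var_1 = rnorm(100, 1, 4),
  var_2 = factor(sample(LETTERS[1:2], 100, replace = TRUE)),
  var_3 = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
  var_4 = rnorm(100, 1, 10)
)

# Hypothetical helper: density for numeric columns, coloured bars for factors
plot_var <- function(data, var) {
  p <- ggplot(data, aes(x = .data[[var]])) + ggtitle(var)
  if (is.numeric(data[[var]])) {
    p + geom_density()
  } else {
    # Mapping fill to the variable itself colours each category differently
    p + geom_bar(aes(fill = .data[[var]]), show.legend = FALSE)
  }
}

# One plot per column, arranged in a 2-column grid
plots <- lapply(names(d), plot_var, data = d)
grid.arrange(grobs = plots, ncol = 2)
```

Because each plot is a separate object, each one can carry its own fill scale and axis settings, which avoids the shared-axis tricks needed in the faceted version.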

When I ran this, the error thrown was "Error: Insufficient values in manual scale. 3 needed but only 2 provided."

In your last line you listed only two fill colors. I added "red" and it produced a graph:

ggplot(f, aes(x = value)) + 
    geom_density(data = subset(f, name == "var_1")) + 
    geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) + 
    facet_wrap(cluster~name, ncol = 3, scales = "free") + 
    scale_x_continuous(breaks = breaks, labels = labs) + 
    scale_fill_manual(values = c("deepskyblue4", "gold", "red"), 
     guide = guide_none())
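
The count in the error message can be checked directly: with fill = name, the manual scale needs one color per distinct value of name among the rows passed to geom_bar(). A minimal sketch, imitating the `name` column of the pivoted data with a plain character vector (the vector and the names `bar_levels` and `fill_values` are assumptions for illustration):

```r
# Variable names as they appear after pivot_longer() in the question
name <- rep(c("var_1", "var_2", "var_3", "var_4"), times = 100)

# geom_bar() only receives the rows where name != "var_1"
bar_levels <- unique(name[name != "var_1"])
length(bar_levels)  # number of colours scale_fill_manual() must supply

# Naming the colours makes the mapping explicit (the colour choice is arbitrary)
fill_values <- c(var_2 = "deepskyblue4", var_3 = "gold", var_4 = "red")
stopifnot(all(bar_levels %in% names(fill_values)))
```

The named vector can then be passed as scale_fill_manual(values = fill_values), so a missing color shows up as a missing name instead of a count mismatch.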

If you get an error, it is helpful to post it with your question.

Resulting graph (is this what you wanted to get?): [image]

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original source.

 