
Manipulating ggplot2 for categorical/continuous variables in R

I am trying to graph categorical and continuous variables together in R. The following code works without "var_4" but I can't seem to get it to work with all of the variables.

Could anyone suggest how to fix this? Also, is it possible to modify the aes() function so that the bars in each graph are colored differently based on different categories?


library(ggplot2)
library(gridExtra) 
library(tidyr) 

# Generate data
var_1 <- rnorm(100, 1, 4)
var_2 <- sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.3, 0.7))
var_3 <- sample(LETTERS[1:5], 100, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.1)) 

cluster <- sample(LETTERS[1:4], 100, replace = TRUE,prob = c(2.5, 2.5, 2.5, 2.5)) 

var_4 <- rnorm(100, 1, 10)

f <- data.frame(var_1, var_2, var_3, var_4, cluster)

f$var_2 = as.factor(f$var_2) 
f$var_3 = as.factor(f$var_3) 
f$cluster = as.factor(f$cluster)

levs <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 10 
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 10

breaks <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))

labs <- c(pretty(range(f$var_1)), levs)

f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3", "var_4")) 

ggplot(f, aes(x = value)) + geom_density(data = subset(f, name == "var_1")) + 
  geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) + 
  facet_wrap(cluster~name, ncol = 3, scales = "free") + 
  scale_x_continuous(breaks = breaks, labels = labs) + 
  scale_fill_manual(values = c("deepskyblue4", "gold"), guide = guide_none())

I think the problem here is that you have taken my answer to your previous question and tried to adapt it without really understanding what the various parts did.

As I explained previously, facets shouldn't be used as a way of stitching unrelated plots together. It is possible, but it is hacky and limits extensibility. Trying to add another variable and custom fill scales for the coloring of the bars is just about possible, but means further tweaks and compromises. It will be very hard to apply this method to your real data unless you know what all the pieces do. I have added some comments for clarity:

library(ggplot2)
library(tidyr)

# Generate data
var_1 <- rnorm(100, 1, 4)
var_2 <- sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.3, 0.7))
var_3 <- sample(LETTERS[1:5], 100, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.1)) 
cluster <- sample(LETTERS[1:4], 100, replace = TRUE,prob = c(2.5, 2.5, 2.5, 2.5)) 
var_4 <- rnorm(100, 1, 10)

f <- data.frame(var_1, var_2, var_3, var_4, cluster)

f$var_2 = as.factor(f$var_2) 
f$var_3 = as.factor(f$var_3) 
f$cluster = as.factor(f$cluster)
# Reorganise factor data into numeric values, grabbing levels as labels first
levs <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 1000
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 1000

# Calculate the breaks and labels for the x axis
breaks <- c(pretty(range(c(f$var_1, f$var_4)), 8), sort(unique(c(f$var_2, f$var_3))))
labs <- c(pretty(range(c(f$var_1, f$var_4)), 8), levs)

# Pivot data
f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3", "var_4")) 

Now we can plot:

ggplot(f, aes(x = value)) + 
  geom_density(data = subset(f, name == "var_1")) + 
  geom_density(data = subset(f, name == "var_4")) +
  geom_bar(data = subset(f, name != "var_1" & name != "var_4"), 
           aes(fill = factor(value))) + 
  facet_wrap(cluster~name, ncol = 4, scales = "free") + 
  scale_x_continuous(breaks = breaks, labels = labs) + 
  scale_fill_manual(values = c("red", "orange", "gold", "forestgreen", "deepskyblue4"), 
                    guide = guide_none())
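Since facets are not really meant for stitching unrelated plots together, one alternative is to build one plot per variable and arrange them with gridExtra, which the question already loads. The following is only a minimal sketch under some assumptions: `plot_var` is a hypothetical helper (not part of any package), the seed is added purely for reproducibility, and the layout (one row per variable, one panel per cluster) is just one possible arrangement.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)  # assumption: seed added only to make the sketch reproducible

# Same kind of data as above, but factors are kept as factors (no numeric recoding)
f <- data.frame(
  var_1   = rnorm(100, 1, 4),
  var_2   = factor(sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.3, 0.7))),
  var_3   = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
  var_4   = rnorm(100, 1, 10),
  cluster = factor(sample(LETTERS[1:4], 100, replace = TRUE))
)

# Hypothetical helper: density for numeric columns, coloured bars for factors
plot_var <- function(data, var) {
  p <- ggplot(data, aes(x = .data[[var]])) +
    facet_wrap(~cluster, nrow = 1) +
    ggtitle(var)
  if (is.numeric(data[[var]])) {
    p + geom_density()
  } else {
    p + geom_bar(aes(fill = .data[[var]]), show.legend = FALSE)
  }
}

# One plot per variable, stacked vertically
plots <- lapply(c("var_1", "var_2", "var_3", "var_4"), plot_var, data = f)
grid.arrange(grobs = plots, ncol = 1)
```

Because each variable gets its own plot, each keeps its own natural x axis and fill scale, so no offsets or hand-built breaks and labels are needed.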


When I ran this, the error thrown was "Error: Insufficient values in manual scale. 3 needed but only 2 provided."

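In your last line you listed only two fill colors, but with var_4 included the fill aesthetic (fill = name) now covers three variables: var_2, var_3 and var_4. I added "red" and it produced a graph: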

ggplot(f, aes(x = value)) + 
    geom_density(data = subset(f, name == "var_1")) + 
    geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) + 
    facet_wrap(cluster~name, ncol = 3, scales = "free") + 
    scale_x_continuous(breaks = breaks, labels = labs) + 
    scale_fill_manual(values = c("deepskyblue4", "gold", "red"), 
     guide = guide_none())
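A quick way to avoid this kind of error is to count the distinct values mapped to fill before writing the manual scale. This is a small sketch rather than part of the original code: `f_long` is an illustrative stand-in for the pivoted data, and naming the colour vector is an optional extra.

```r
library(ggplot2)

# Illustrative stand-in for the pivoted data frame (values are made up)
f_long <- data.frame(
  name  = rep(c("var_1", "var_2", "var_3", "var_4"), each = 5),
  value = c(rnorm(5), rep(1:2, length.out = 5), 1:5, rnorm(5))
)

# The bars use every row except var_1, and fill is mapped to `name`
bar_data    <- subset(f_long, name != "var_1")
fill_levels <- unique(bar_data$name)
length(fill_levels)  # 3: var_2, var_3, var_4

# Supply at least that many colours; naming them ties each colour to a level
fills <- setNames(c("deepskyblue4", "gold", "red"), fill_levels)
fill_scale <- scale_fill_manual(values = fills, guide = "none")
```

The resulting `fill_scale` can be added to the plot in place of the hand-written scale_fill_manual() call.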

If you get an error, it is helpful to post it with your question.

Resulting graph (is this what you wanted to get?): [image]
