
Plot of two variables in R

I have a simple, perhaps trivial question, but I'm new to R.

I have a data set X, with 3000 observations and 2 variables:

Age (with a range of 2-98)

Generic_Dummy_Variable (a factor with 2 levels, "yes" and "no")

What is the best way to plot these two variables, perhaps using ggplot2?

I tried the following, but I don't like the result very much; the plot is too cluttered.

plot(X$Age, col = X$Generic_Dummy_Variable)

Is there a better way to do that? (What I want to see is how the "yes" and "no" levels are distributed along the range of age.)

Just a starting point...

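library(magrittr)  # provides the %>% pipe

# Simulated stand-in for the asker's data
data <- dplyr::tibble(AGE = sample(2:98, size = 3000, replace = TRUE),
                      DUMMY = sample(c("yes", "no"), size = 3000, replace = TRUE))

# One boxplot of AGE per level of DUMMY
data %>%
    ggplot2::ggplot(ggplot2::aes(x = DUMMY, y = AGE)) +
    ggplot2::geom_boxplot()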

Try ggplot2::facet_wrap if you want to retain the detail that a boxplot summarises away.
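A boxplot draws each group's minimum, quartiles and maximum (with outliers marked separately), so it can help to look at those numbers directly. The following is only a sketch in base R: the data are simulated, the `set.seed` value is arbitrary, and the name `age_summary` is made up for illustration.

```r
# Simulated stand-in data, mirroring the example above (not the asker's real data)
set.seed(1)
data <- data.frame(AGE = sample(2:98, size = 3000, replace = TRUE),
                   DUMMY = sample(c("yes", "no"), size = 3000, replace = TRUE))

# Tukey five-number summary (min, lower hinge, median, upper hinge, max)
# of AGE within each level of DUMMY
age_summary <- tapply(data$AGE, data$DUMMY, fivenum)
print(age_summary)
```

If the two summaries are nearly identical, the boxplots will look nearly identical too, and a histogram or faceted plot is probably the more informative view.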


library(ggplot2)


df <- data.frame(age = sample(2:98, 3000, replace = TRUE),
                 var = sample(c("yes", "no"), 3000, replace = TRUE))

ggplot(df, aes(age, fill = var))+
  geom_bar(stat = "count", position = "dodge")+
  facet_wrap(~var)

Created on 2020-06-21 by the reprex package (v0.3.0)

You can facet by the dummy variable like so:

library(tidyverse)

X <- bind_cols(Age = sample(2:98, size = 300, replace = TRUE), 
               Generic_Dummy_Variable = sample(c("yes", "no"), size = 300, replace = TRUE))

X %>% ggplot(aes(Age)) +
  geom_histogram() +
  facet_wrap(vars(Generic_Dummy_Variable))
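If the goal is to see how the share of "yes" versus "no" changes with age, rather than the raw counts, one possible variation is to rescale each age bin so that it sums to 1. This is a different technique from the faceted histogram above, and only a sketch: the data are simulated, the 10-year bin width is an arbitrary choice, and `Age_bin`, `share` and `p` are names introduced here for illustration.

```r
library(ggplot2)

# Simulated stand-in data (not the asker's real data)
set.seed(42)
X <- data.frame(
  Age = sample(2:98, size = 3000, replace = TRUE),
  Generic_Dummy_Variable = factor(sample(c("yes", "no"), size = 3000, replace = TRUE))
)

# Share of each level within 10-year age bins: (0,10], (10,20], ..., (90,100]
X$Age_bin <- cut(X$Age, breaks = seq(0, 100, by = 10))
share <- prop.table(table(X$Age_bin, X$Generic_Dummy_Variable), margin = 1)
print(round(share, 2))

# The same idea as a plot: stacked bars rescaled so that each bin sums to 1
p <- ggplot(X, aes(Age, fill = Generic_Dummy_Variable)) +
  geom_histogram(binwidth = 10, boundary = 0, position = "fill")
print(p)
```

With purely random data, as here, every bin should hover around a 50/50 split; on real data, bins where one colour dominates show the ages at which that level is more common.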

