R code of scatter plot for three variables

Hi, I am trying to write code for a scatter plot of three variables in R:

Race= [0,1]
YOI= [90,92,94]
ASB_mean = [1.56, 1.59, 1.74]

Antisocial <- read.csv(file = 'Antisocial.csv')
Table_1 <- ddply(Antisocial, "YOI", summarise, ASB_mean = mean(ASB))
Table_1
Race <- unique(Antisocial$Race)
Race
ggplot(data = Table_1, aes(x = YOI, y = ASB_mean, group_by(Race))) + 
geom_point(colour = "Black", size = 2) + geom_line(data = Table_1, aes(YOI, 
ASB_mean), colour = "orange", size = 1) 

Image of plot: https://drive.google.com/file/d/1E-ePt9DZJaEr49m8fguHVS0thlVIodu9/view?usp=sharing

Data file: https://drive.google.com/file/d/1UeVTJ1M_eKQDNtvyUHRB77VDpSF1ASli/view?usp=sharing

Can someone help me understand where I am making a mistake? I want to plot mean ASB vs YOI grouped by Race. Thanks.

I am not sure what your desired output is. If I understood your question correctly, I think you want something like this.

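library(dplyr)    # %>%, group_by(), summarise()
library(ggplot2)
library(forcats)  # as_factor()

# per-Race means of ASB and YOI, drawn below as the large points
g_Antisocial <- Antisocial %>% 
        group_by(Race) %>% 
        summarise(ASB = mean(ASB),
                  YOI = mean(YOI))
# raw observations (semi-transparent) plus the group means on top
Antisocial %>% 
        ggplot(aes(x = YOI, y = ASB, color = as_factor(Race), shape = as_factor(Race))) +
        geom_point(alpha = .4) +
        geom_point(data = g_Antisocial, size = 4) +
        theme_bw() +
        guides(color = guide_legend("Race"),  shape = guide_legend("Race")) 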

And this is the output: [image]

@Maninder: there are a few things you need to look at.

First of all: The grammar of graphics of ggplot() works with layers. You can add layers with different data (frames) for the different geoms you want to plot.

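Your code does not work because it mixes up the layers and never really specifies which data belong to the scatter plot and which to the line. Note also that Table_1 is summarised by YOI only, so it has no Race column, and that group_by() is a dplyr data-manipulation verb, not an aesthetic, so it has no place inside aes().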
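To make the point about the summary table concrete, here is a minimal sketch. It assumes a made-up toy data frame (toy) in place of Antisocial.csv, and it uses base R aggregate() instead of ddply(), so it is an illustration rather than the original code. A summary computed per YOI alone carries no Race information that a plot could group by.

```r
# Toy stand-in for Antisocial.csv; the values are invented for illustration
set.seed(1)
toy <- data.frame(
  YOI  = rep(c(90, 92, 94), each = 20),
  Race = rep(c(0, 1), times = 30),
  ASB  = sample(0:6, 60, replace = TRUE)
)

# Same idea as Table_1: mean ASB per YOI (base R instead of plyr::ddply)
table_1 <- aggregate(ASB ~ YOI, data = toy, FUN = mean)

names(table_1)            # "YOI" "ASB"
"Race" %in% names(table_1)  # FALSE: nothing left to group or colour by
```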

(I) Use ggplot() + geom_point() for a scatter plot
The very first layer is ggplot(). Think of it as your drawing canvas. You say you want to add a scatter plot layer, but you never actually do so.

For example:

# plotting the Antisocial data set
ggplot() + 
  geom_point(data = Antisocial, aes(x = YOI, y = ASB, colour = as.factor(Race)))

will plot your Antisocial data set using the scatter layer, i.e. geom_point(). Note that I converted Race to a factor to get a categorical colour scheme; otherwise you might end up with a continuous palette.

[image]

(II) line plot
By analogy with the above, the line plot looks like this:

# plotting Table_1
ggplot() +
  geom_line(data = Table_1, aes(x = YOI, y = ASB_mean))

I will skip showing the line plot.

(III) combining different layers

# putting both together
ggplot() +
  geom_point(data = Antisocial, aes(x = YOI, y = ASB, colour = as.factor(Race))) +
  geom_line(data = Table_1, aes(x = YOI, y = ASB_mean)) +
  # set the legend title to get a nice(r) name in your colour legend
  labs(colour = "Race")

This yields:

[image]

That should explain how ggplot layering works. Keep an eye on the datasets and geoms that you want to use. Until you are comfortable with how aes() mappings are inherited, I recommend keeping the data = and aes() calls inside each geom_xxxx(). This avoids confusion.

You may want to try geom_jitter() instead of geom_point() to get a better presentation of your dataset. Only a "few" points appear in the plot because many data points share the same position and are drawn on top of each other (overplotting).
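As a rough sketch of the jitter idea, assuming a made-up toy data frame (toy) in place of Antisocial.csv; the width, height and alpha values are arbitrary choices, not recommendations from the original answer:

```r
library(ggplot2)

# Toy stand-in for Antisocial.csv; the values are invented for illustration
set.seed(1)
toy <- data.frame(
  YOI  = rep(c(90, 92, 94), each = 20),
  Race = rep(c(0, 1), times = 30),
  ASB  = sample(0:6, 60, replace = TRUE)
)

# geom_jitter() adds a little random noise to each point so that
# observations sharing the same (YOI, ASB) position become visible
p_jitter <- ggplot() +
  geom_jitter(data = toy,
              aes(x = YOI, y = ASB, colour = as.factor(Race)),
              width = 0.3, height = 0.2, alpha = 0.6) +
  labs(colour = "Race")

p_jitter
```

Keep the jitter width well below the spacing between YOI values (2 here), so the points still read as belonging to their year.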

Moving away from plotting to your question, "I want to plot mean ASB vs YOI grouped by Race": I know too little about your research to fully understand what you mean by that. I take it that the mean ASB you calculated over the whole population (i.e. your Table_1) is your reference, and that you would like to see how the Race groups compare with this population mean.

One option is to group your race data points and show them as boxplots for each YOI. This might be what you want. The boxplot gives you the median and quartiles, and you can compare this per group against the calculated ASB mean.

For presentation purposes, I highlighted the line by increasing its size and changing its linetype. You can play around with the colours, etc. to get the look you are aiming for. Please note that for the grouped boxplot you also have to treat your integer variable YOI as categorical, so I coerced it into a factor. Boxplots use fill for the body (colour sets only the outline). In this setup, you also need to supply a group value to geom_line() (I just assigned it to 1, but that is arbitrary - in other contexts you can assign another variable here).

ggplot() +
  geom_boxplot(data = Antisocial, aes(x = as.factor(YOI), y = ASB, fill = as.factor(Race)))  +
  geom_line(data = Table_1, aes(x = as.factor(YOI), y = ASB_mean, group = 1)
            , size = 2, linetype = "dashed") +
  labs(x = "YOI", fill = "Race") 

[image]
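If "mean ASB vs YOI grouped by Race" instead means one mean line per Race group, a minimal sketch could look like the following. It assumes a made-up toy data frame (toy) in place of Antisocial.csv, and it uses base R aggregate() rather than ddply(); the names means_by_race and p_means are only illustrative.

```r
library(ggplot2)

# Toy stand-in for Antisocial.csv; the values are invented for illustration
set.seed(42)
toy <- data.frame(
  YOI  = rep(c(90, 92, 94), each = 20),
  Race = rep(c(0, 1), times = 30),
  ASB  = sample(0:6, 60, replace = TRUE)
)

# Mean ASB for every YOI x Race combination, so Race survives the summary
means_by_race <- aggregate(ASB ~ YOI + Race, data = toy, FUN = mean)
names(means_by_race)[names(means_by_race) == "ASB"] <- "ASB_mean"

# One point and one line per Race; group tells geom_line() which
# points to connect, colour distinguishes the groups in the legend
p_means <- ggplot(means_by_race,
                  aes(x = YOI, y = ASB_mean,
                      colour = as.factor(Race),
                      group  = as.factor(Race))) +
  geom_point(size = 2) +
  geom_line() +
  labs(colour = "Race")

p_means
```

The key difference from Table_1 is that the summary is computed per YOI and Race together, so the plot has a Race column to map to colour and group.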

Hope this gets you going!

