
Creating plots in R with 3 variables

I have been following the analysis steps in the Little Book of R. It is a great tutorial, but like many other tutorials it only works if the data is organized a certain way.

My data is structured like this (a very tiny sample):

Phylum Confidence Time Seq_ID Environment Dataset
Acidobacteria 0.801 5 >3134898 Marine 4440037.3
Bacteroidetes 0.812 6 >3066473 Marine 4440037.3
Acidobacteria 0.828 5 >3085551 Gut 4440038.3
Firmicutes    0.830 4 >3087676 Coral 4440036.3

I want a good way to:

a) Plot the Time by bacterial phylum for each environment. I realize that this means I will need to create a plot for each phylum. (see plots)

b) Plot the Time by environment for two different phyla, which I will then color code by environment. (see plots)

I know I can create a new data frame based on an environment and a phylum, but I have not been able to incorporate it correctly into a plot that uses a third variable (Time).

new_df = myDF[(myDF$Environment=='Marine') & (myDF$Phylum=='Acidobacteria'),]

I have tried several things...

p <- ggplot(myDF, aes(Environment, Time))
p + geom_boxplot(aes(fill = Environment))

It creates a plot, but it does not take the phylum into consideration (e.g. I would like a separate plot for each phylum).

Or this...

 for (i in environment) #this is a list I created
 {
     for (j in phyla) #this is a list I created
     {
        #stats_df = myDF[(myDF$Environment==i) & (myDF$Phylum==j),]
        plot(myDF[[j]], myDF[[i]], xlab=NULL, ylab='Time')
      }
 }

This one gets errors:

Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Error in plot.window(...) : need finite 'xlim' values
Calls: plot -> plot.default -> localWindow -> plot.window
Execution halted

shell returned 1

But even if it did plot, it still would not take the Time variable into consideration. What I am really trying to figure out is how to use three variables in a plot.

Assuming Phylum is a factor variable, you can facet on it so that each phylum gets its own panel:

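library(ggplot2)

# Environment on x, Time on y, one column of panels per phylum.
# The column is spelled `Phylum` (capital P) in the question's data.
g <- ggplot(myDF, aes(Environment, Time))
g + geom_point() + facet_grid(. ~ Phylum)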
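
For a version that runs on its own, here is a minimal sketch that rebuilds the four sample rows from the question and draws one panel per phylum. The data frame name `myDF` and the column names come from the question. The use of `geom_boxplot()` with `facet_wrap()` is just one reasonable choice (it mirrors the boxplot attempt in the question), not the only way to do this.

```r
library(ggplot2)

# Sample rows copied from the question; the real data set is much larger
myDF <- data.frame(
  Phylum      = c("Acidobacteria", "Bacteroidetes", "Acidobacteria", "Firmicutes"),
  Confidence  = c(0.801, 0.812, 0.828, 0.830),
  Time        = c(5, 6, 5, 4),
  Seq_ID      = c(">3134898", ">3066473", ">3085551", ">3087676"),
  Environment = c("Marine", "Marine", "Gut", "Coral"),
  Dataset     = c("4440037.3", "4440037.3", "4440038.3", "4440036.3")
)

# Three variables in one figure:
# Environment on x, Time on y, Phylum as the facetting variable
p <- ggplot(myDF, aes(x = Environment, y = Time, fill = Environment)) +
  geom_boxplot() +
  facet_wrap(~ Phylum)
print(p)
```

With only one observation per group, each box collapses to a single line. On the full data set the boxes should show the spread of Time per environment within each phylum.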


library(ggplot2)

# Same plot, but with one row of panels per phylum instead of one column
g <- ggplot(myDF, aes(Environment, Time))
g + geom_point() + facet_grid(Phylum ~ .)
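
For part (b) of the question, one possible approach is to keep only the two phyla of interest and map Environment to colour. This is a sketch under assumptions: the vector `two_phyla` and the name `sub_df` are hypothetical, and the two phyla picked here are arbitrary examples from the sample data.

```r
library(ggplot2)

# Sample rows from the question (only the columns used in this plot)
myDF <- data.frame(
  Phylum      = c("Acidobacteria", "Bacteroidetes", "Acidobacteria", "Firmicutes"),
  Time        = c(5, 6, 5, 4),
  Environment = c("Marine", "Marine", "Gut", "Coral")
)

# Hypothetical choice of the two phyla to compare
two_phyla <- c("Acidobacteria", "Firmicutes")

# %in% keeps every row whose Phylum matches either name
sub_df <- myDF[myDF$Phylum %in% two_phyla, ]

# Phylum on x, Time on y, Environment as the colour;
# dodging keeps points from different environments from overlapping
p <- ggplot(sub_df, aes(x = Phylum, y = Time, colour = Environment)) +
  geom_point(position = position_dodge(width = 0.5), size = 3)
print(p)
```

If separate panels are preferred over a shared axis, adding `facet_wrap(~ Phylum)` and putting Environment on the x axis should give the layout from part (a) restricted to the two phyla.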


Please see here for the details.
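
As for the error from the nested loop in the question: `myDF[[j]]` and `myDF[[i]]` look up columns named after a phylum or an environment, and no such columns exist, so both are most likely `NULL`, which would explain the "need finite 'xlim' values" message. A base R sketch that subsets the rows instead is shown below. The names `environments`, `phyla`, and `plotted` are assumptions made for illustration; the question builds its own lists.

```r
# Sample rows from the question (only the columns used here)
myDF <- data.frame(
  Phylum      = c("Acidobacteria", "Bacteroidetes", "Acidobacteria", "Firmicutes"),
  Time        = c(5, 6, 5, 4),
  Environment = c("Marine", "Marine", "Gut", "Coral")
)

# Assumed stand-ins for the lists created in the question
environments <- unique(myDF$Environment)
phyla        <- unique(myDF$Phylum)

plotted <- character(0)  # hypothetical helper: records which plots were drawn

for (i in environments) {
  for (j in phyla) {
    # Select rows by value rather than indexing columns by name
    stats_df <- myDF[myDF$Environment == i & myDF$Phylum == j, ]

    # Skip combinations with no data, which would otherwise fail to plot
    if (nrow(stats_df) == 0) next

    plot(stats_df$Time, xlab = "Observation", ylab = "Time",
         main = paste(j, "in", i))
    plotted <- c(plotted, paste(j, i))
  }
}
```

This produces one base graphics plot per non-empty environment and phylum pair. The faceted ggplot2 version above is usually the shorter route to the same result.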
