Multiple plot function R

I am new to R and have been trying to find a solution to this for the past week via Google and forums. My problem: I have a data set which I need to plot against age. There are over 1000 variables, each with a measurement in 40 different conditions. It looks like this:

Age   Variable1   Variable2 (....) Variable1000
 |        |
 |        |
 v        v

What I need to do is plot the condition (age) against each of the variable columns and output the results as separate plots (all of these are just scatterplots). What is more, I want the output to be limited to only those variables that have a positive trend line coefficient.

So currently I have this very ugly code that is essentially a rough draft of what I really need.

plotest <- function(lung){
  # need to add the condition of abline function coefficient > 0 before plotting    
  plot(lung$Age, lung$hsa.let.7a.1, xlab = "Age", ylab = "miRNA")
  abline(lm(lung$hsa.let.7a.1 ~ lung$Age), col= "red")
  return(plot)
}
par(mfrow=c(2,2))
for (i in lung{plotest(i)})

I know this is mostly wrong. Sorry for how horrendous all of it is.

Could anyone direct me to any sources I might have overlooked on how to specify ranges in such large datasets? And on function syntax? I have done some Python but found R to be much more confusing in this regard...

Thanks all, Paul

This should come pretty close to what you're asking for, although what you're going to do with 1000 graphs is beyond me.

# make up some data
x <- seq(1,10,len=100)
set.seed(1)    # for reproducible example
df <- data.frame(x,y1=1+2*x+rnorm(100), 
                   y2=3-4*x+rnorm(100),
                   y3=2+0.001*x+rnorm(100))

# this does the work...
lapply(colnames(df)[-1],function(col){
  form <- formula(paste(col,"x",sep="~"))
  fit  <- lm(form,df)
  if (coef(fit)[2] >0) {
    plot(form,df)
    abline(fit)
  }
})

Your code was not that far off. This example takes all the column names except the first one ( colnames(df)[-1] ) and passes them one at a time to the function. The function builds a formula from the column name and the name of the first column, calls lm(...) , checks that the coefficient of x is > 0, and if so plots the data and the best-fit line.
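To tie this back to the layout in the question (an Age column followed by measurement columns), here is a minimal sketch of the same idea that writes each qualifying scatterplot to its own page of a PDF. The data frame lung below is made up for illustration, and the column names var1 and var2 and the output file name are assumptions, not anything from the original data.

```r
# Hypothetical stand-in for the asker's data: Age plus two measurement columns
set.seed(1)
lung <- data.frame(Age = seq(20, 80, length.out = 40))
lung$var1 <-  0.5 * lung$Age + rnorm(40)   # increasing with Age
lung$var2 <- -0.3 * lung$Age + rnorm(40)   # decreasing with Age

# Fit column ~ Age for every column except the first and keep the slope
slopes <- sapply(colnames(lung)[-1], function(col) {
  fit <- lm(reformulate("Age", response = col), data = lung)
  coef(fit)[["Age"]]
})
positive <- names(slopes)[slopes > 0]

# One scatterplot with its fitted line per page, positive slopes only
out_file <- file.path(tempdir(), "positive_trends.pdf")  # assumed file name
pdf(out_file)
for (col in positive) {
  form <- reformulate("Age", response = col)
  plot(form, data = lung, xlab = "Age", ylab = col)
  abline(lm(form, data = lung), col = "red")
}
dev.off()
```

Computing the slopes first and plotting afterwards keeps the filter separate from the drawing code, which makes it easier to change the selection rule later without touching the plots.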

Look up the documentation on formula(...) , lm(...) , and coef(...) . Note that this example has a variable, y3 , with a slope that is positive but not significantly different from 0. You should think about how you want to deal with that situation.
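One possible way to handle that case, sketched here as an assumption rather than a recommendation, is to require that the slope be positive and that its p-value from summary(lm) fall below a chosen threshold. The helper slope_info and the cutoff alpha = 0.05 are hypothetical choices for illustration; with 1000 variables you may also want to think about multiple-testing correction.

```r
# Same made-up data as in the answer above
x <- seq(1, 10, len = 100)
set.seed(1)
df <- data.frame(x, y1 = 1 + 2 * x + rnorm(100),
                    y2 = 3 - 4 * x + rnorm(100),
                    y3 = 2 + 0.001 * x + rnorm(100))

# Hypothetical helper: slope estimate and its p-value for one column
slope_info <- function(col, data) {
  fit <- lm(reformulate("x", response = col), data = data)
  tab <- coef(summary(fit))            # coefficient table of the fit
  c(slope = tab["x", "Estimate"], p_value = tab["x", "Pr(>|t|)"])
}

info  <- t(sapply(colnames(df)[-1], slope_info, data = df))
alpha <- 0.05                          # assumed significance threshold
keep  <- rownames(info)[info[, "slope"] > 0 & info[, "p_value"] < alpha]

# Plot only the columns that pass both checks
for (col in keep) {
  form <- reformulate("x", response = col)
  plot(form, df)
  abline(lm(form, df))
}
```

Printing info shows the slope and p-value for every column, so you can see how close each variable is to the cutoff before deciding on a threshold.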
