
R: Error in linear regression model

I have two different data frames on which I would like to perform linear regression.

I have written the following code for it:

mydir<- "/media/dev/Daten/Task1/subject1/t1"
#multiple subject paths should be given here
# read full paths
myfiles<- list.files(mydir,pattern = "regional_vol*",full.names=T)
# initialise the dataframe from first file 

df<- read.table( myfiles[1], header = F,row.names = NULL, skip = 3, nrows = 1,sep = "\t") 
# [-c(1:3),]
df
#read all the other files and update dataframe
#we read 4 lines to read the header correctly, then remove 3 
ans<- lapply(myfiles[-1], function(x){  read.table( x, header = F, skip = 3, nrows = 1,sep = "\t")       })
ans
#update dataframe
#[-c(1:3),]
lapply(ans, function(x){df<<-rbind(df,x)}  )
#this should be the required dataframe

uncorrect<- array(df)

# Linear regression of ICV extracted from global size FSL 
# Location where your icv is located
ICVdir <- "/media/dev/Daten/Task1/T1_Images"
#loading csv file from ICV
mycsv  <- list.files(ICVdir,pattern = "*.csv",full.names = T )
af<- read.csv(file = mycsv,header = TRUE)
ICV<- as.data.frame(af[,2],drop=FALSE)
#af[1,]
#we take into consideration second column  of csv
#finalcsv <-lapply(mycsv[-1],fudnction(x){read.csv(file="global_size_FSL")})
subj1<- as.data.frame(rep(0.824,each=304))

plot(df ~ subj1, data = df,
       xlab = "ICV value of each subject",
       ylab = "Original uncorrected volume",
       main="intercept calculation"
       )

fit <- lm(subj1 ~ df )

The data frame df has 304 values in the following format:

6433 6433     
1430 1430     
1941 1941     
3059 3059     
3932 3932     
6851 6851

The other data frame, subj1, has 304 values in the following format:

0.824     
0.824     
0.824      
0.824     
0.824

When I run my code, I get the following error:

Error in model.frame.default(formula = subj1 ~ df, drop.unused.levels = TRUE) : 
  invalid type (list) for variable 'subj1'

Any suggestions as to why the data.frame values from variable subj1 are invalid?

As mentioned, you are trying to give a data.frame as an independent variable. Try:

 fit <- lm(subj1 ~ ., data=df )

This will use all variables in the data frame as predictors, as long as subj1 is the name of the dependent variable (a column in df) and not a data frame by itself.
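To see why the original call fails, it can help to check what lm() actually receives. The sketch below uses small made-up stand-ins for subj1 and df (the values and the column name V1 are assumptions, not the asker's data): a data frame is a list of columns, so it is rejected as a model variable, whereas a column extracted with [[ ]] is a plain numeric vector.

```r
# Hypothetical stand-ins for the asker's objects (values are made up).
subj1 <- as.data.frame(rep(0.824, each = 5))
df <- data.frame(V1 = c(6433, 1430, 1941, 3059, 3932))

# A data frame is a list of columns, which model.frame() does not accept
# as a single variable.
is_list <- is.list(subj1)

# Reproduce the failure and capture its message instead of stopping.
msg <- tryCatch(
  {
    lm(subj1 ~ df)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Extracting the column yields a plain numeric vector that lm() can use.
y <- subj1[[1]]
is_vec <- is.numeric(y) && !is.list(y)
```

Printing msg should show the same kind of "invalid type (list)" complaint as in the question.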

If df has two columns which are the predictors, and subj1 is the predicted (dependent) variable, combine the two, give them proper column names, and create the model in the format above.

Something like:

data <- cbind(df, subj1)
names(data) <- c("var1", "var2", "subj1")
# fit on the combined data frame, which contains subj1, not on df
fit <- lm(subj1 ~ var1 + var2, data = data)
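A self-contained version of the same idea, as a rough sketch only: the data here are simulated, the column names V1 and V2 and the name dat are assumptions for illustration, and the response is deliberately made to vary so that there is something to fit.

```r
set.seed(1)  # reproducible made-up data
n <- 304

# Hypothetical predictors standing in for the two columns of df.
df <- data.frame(V1 = runif(n, 1000, 7000),
                 V2 = runif(n, 1000, 7000))

# Hypothetical response that actually varies with V1.
subj1 <- 0.8 + 1e-5 * df$V1 + rnorm(n, sd = 0.01)

# One data frame holding predictors and response, with equal row counts.
dat <- cbind(df, subj1)
names(dat) <- c("var1", "var2", "subj1")

fit <- lm(subj1 ~ var1 + var2, data = dat)
coef_names <- names(coef(fit))
```

Calling summary(fit) afterwards gives the usual coefficient table for the intercept, var1 and var2.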

Edit: some pointers:

  1. Make sure you use a single data frame that holds all of your independent variables and your dependent variable.
  2. The number of rows should be equal.
  3. If an independent variable is a constant, it does not vary across observations, so it cannot explain any variation in the dependent variable and has no meaning as a predictor. If the dependent variable is a constant, there is no point in regressing - we can predict the value with 100% accuracy.
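The third pointer can be checked directly. In this small sketch (the numbers are taken from the sample values in the question; the names x and k are made up), a constant predictor is collinear with the intercept, so lm() cannot estimate its coefficient, and a constant response produces a slope that is zero up to rounding.

```r
# x varies; k is constant, like the 0.824 column in the question.
x <- c(6433, 1430, 1941, 3059, 3932, 6851)
k <- rep(0.824, length(x))

# Constant predictor: k is collinear with the intercept, so its
# coefficient cannot be estimated and is reported as NA.
fit_const_pred <- lm(x ~ k)
slope_k <- unname(coef(fit_const_pred)["k"])

# Constant response: the fitted line is flat, so the slope on x is
# zero up to floating-point rounding.
fit_const_resp <- lm(k ~ x)
slope_x <- unname(coef(fit_const_resp)["x"])
```

In the asker's setup subj1 is 0.824 in every row, so whichever side of the formula it is placed on, the regression carries no information.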
