Loop an ordinal regression analysis over all participants and trials and save the output in R

Could you please help me with a loop? I am relatively new to R. A short version of the data looks like this:

sNumber  blockNo running TrialNo    wordTar   wordTar1   Freq Len code code2
1        1       1       5           spouse    violent   5011   6    1     2
1        1       1       5          violent     spouse  17873   7    2     1
1        1       1       5           spouse    aviator   5011   6    1     1
1        1       1       5          aviator       wife    515   7    1     1
1        1       1       5             wife    aviator  87205   4    1     1
1        1       1       5          aviator     spouse    515   7    1     1
1        1       1       9        stability    usually  12642   9    1     3
1        1       1       9          usually   requires  60074   7    3     4
1        1       1       9         requires     client  25949   8    4     1
1        1       1       9           client   requires  16964   6    1     4
2        2       1       5            grimy      cloth    757   5    2     1
2        2       1       5            cloth       eats   8693   5    1     4
2        2       1       5             eats    whitens   3494   4    4     4
2        2       1       5          whitens      woman     18   7    4     1
2        2       1       5            woman    penguin 162541   5    1     1
2        2       1       9              pie   customer   8909   3    1     1
2        2       1       9         customer  sometimes  13399   8    1     3
2        2       1       9        sometimes reimburses  96341   9    3     4
2        2       1       9       reimburses  sometimes     65  10    4     3
2        2       1       9        sometimes   gangster  96341   9    3     1

I have code for an ordinal regression analysis of one trial from one participant (eye-tracking data, eyeData) that looks like this:

#------------set the path and import the library-----------------
setwd("/AscTask-3/Data")
library(ordinal)

#-------------read the data----------------
read.delim(file.choose(), header=TRUE) -> eyeData

#-------------extract 1 trial from one participant---------------
ss <- subset(eyeData, sNumber == 6 & runningTrialNo == 21)

#-------------delete duplicates = refixations-----------------
ss.s <- ss[!duplicated(ss$wordTar), ] 

#-------------change the raw frequencies to log freq--------------
ss.s$lFreq <- log(ss.s$Freq)

#-------------add a new column with sequential numbers as a factor ------------------
ss.s$rankF <- as.factor(seq(nrow(ss.s))) 

#------------ fit an ordered regression model (cumulative link model; link='probit' gives ordered probit, not logit)----------
m <- clm(rankF~lFreq*Len, data=ss.s, link='probit')
summary(m)

#---------------get confidence intervals (CI)------------------
(ci <- confint(m)) 

#----------odds ratios (OR); exp(coef) is an odds ratio only with link='logit'--------------
exp(coef(m))

The eyeData file is a very large data set with 91832 observations of 11 variables. In total there are 41 participants with 78 trials each. In my code I extract the data for one trial from one participant and run the analysis on it. However, it takes a long time to run the analysis manually for all trials of all participants. Could you please help me create a loop that goes through all 78 trials of all 41 participants and saves the statistical output (I want to save summary(m), ci, and coef(m)) in one file?

Thank you in advance!

You could generate a unique identifier for every trial of every participant. Then you could loop over all unique values of this identifier and subset the data accordingly. Inside the loop you run the regression and save the output as an R object:

eyeData$uniqueIdent <- paste(eyeData$sNumber, eyeData$runningTrialNo, sep = "-")
uniqueID <- unique(eyeData$uniqueIdent)
for (un in uniqueID) {
   ss <- eyeData[eyeData$uniqueIdent == un, ]
   ss <- ss[!duplicated(ss$wordTar), ] #maybe do this outside the loop
   ss$lFreq <- log(ss$Freq)  #you could do this outside the loop too
   #create DV
   ss$rankF <- as.factor(seq(nrow(ss)))
   m <- clm(rankF~lFreq*Len, data=ss, link='probit')
   seeSumm <- summary(m)
   ci <- confint(m) 
   oddsR <- exp(coef(m))
   save(seeSumm, ci, oddsR, file = paste("toSave_", un, ".Rdata", sep = ""))
   # add -un- to the output file name to be able to identify where it came from
}
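
To read one of the saved files back in later, load() can be used. This is a minimal sketch that assumes the file naming from the loop above; the identifier "6-21", the temporary directory and the stand-in objects are made up for illustration and are not real results:

```r
# Hypothetical identifier: participant 6, trial 21
un <- "6-21"
out_file <- file.path(tempdir(), paste("toSave_", un, ".Rdata", sep = ""))

# Stand-in objects; in the loop these come from confint(m) and exp(coef(m))
ci    <- matrix(c(-1, 1), nrow = 1,
                dimnames = list("lFreq", c("2.5 %", "97.5 %")))
oddsR <- c(lFreq = 1.2)
save(ci, oddsR, file = out_file)

# load() restores the objects under their original names. Loading into a
# separate environment avoids overwriting objects in the workspace.
res <- new.env()
load(out_file, envir = res)
ls(res)
res$oddsR
```

Because load() restores objects under the names they were saved with, loading several of these files into the same environment would overwrite earlier results, which is why a fresh environment per file is used here.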

A variation is to collect the output of every iteration in a single list: create an empty list ("gatherRes") before the loop and, after running the estimation and post-estimation commands in each iteration, store that iteration's results as one element of the list:

gatherRes <- setNames(vector(mode = "list", length = length(uniqueID)), uniqueID)  ##before the loop
gatherRes[[un]] <- list(seeSumm, ci, oddsR)  ##last line inside the loop
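
Since the question asks for a single output file, one option is to write the whole list to disk once the loop has finished. The following is only a sketch: the identifiers, the per-trial "results" and the file name allResults.rds are placeholders, and saveRDS()/readRDS() are used here as one possible way to store a single R object.

```r
# Made-up identifiers standing in for the real participant-trial IDs
uniqueID <- c("1-5", "1-9", "2-5")

# Pre-allocate a named list so that gatherRes[[un]] fills an existing slot
gatherRes <- setNames(vector(mode = "list", length = length(uniqueID)), uniqueID)

for (un in uniqueID) {
  # Placeholder for list(seeSumm, ci, oddsR) from the real loop
  gatherRes[[un]] <- list(id = un, n_chars = nchar(un))
}

# Write all results to one file and read them back
out_file <- file.path(tempdir(), "allResults.rds")
saveRDS(gatherRes, out_file)
restored <- readRDS(out_file)
names(restored)
```

Unlike load(), readRDS() returns the object, so it can be assigned to any name.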

If you're concerned with speed, you could consider writing a function that does all this and use lapply (or mclapply).
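
A rough sketch of the function-plus-lapply idea, using base R only. The toy data frame and the function name prep_trial are made up for illustration; the function covers just the data preparation steps, and the model fit (clm, summary, confint) would go inside it in the same way.

```r
# Toy stand-in for eyeData (values are illustrative, not real data)
toy <- data.frame(
  sNumber        = c(1, 1, 1, 1, 2, 2, 2, 2),
  runningTrialNo = c(5, 5, 5, 5, 9, 9, 9, 9),
  wordTar        = c("spouse", "violent", "spouse", "aviator",
                     "pie", "customer", "sometimes", "customer"),
  Freq           = c(5011, 17873, 5011, 515, 8909, 13399, 96341, 13399),
  Len            = c(6, 7, 6, 7, 3, 8, 9, 8)
)

# Per-trial preparation, the same steps as in the loop
prep_trial <- function(ss) {
  ss <- ss[!duplicated(ss$wordTar), ]        # drop refixations
  ss$lFreq <- log(ss$Freq)                   # log frequency
  ss$rankF <- as.factor(seq_len(nrow(ss)))   # fixation order as a factor
  ss
}

# split() gives one data frame per participant/trial combination;
# drop = TRUE removes combinations that do not occur in the data
trials <- split(toy, list(toy$sNumber, toy$runningTrialNo), drop = TRUE)

# lapply() applies the function to each trial and returns a named list
prepared <- lapply(trials, prep_trial)
names(prepared)
```

On systems that support forking, parallel::mclapply() takes the same first two arguments, so it can be swapped in for lapply() here.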

Here is a solution using the plyr package (it is more compact than a for loop and may also be faster).

Since you don't provide a reproducible example, I'll use the iris data as an example.

First make a function to calculate your statistics of interest and return them as a list. For example:

# Function to return summary, confidence intervals and coefficients from lm
lm_stats = function(x){
  m = lm(Sepal.Width ~ Sepal.Length, data = x)

  return(list(summary = summary(m), confint = confint(m), coef = coef(m)))
}

Then use the dlply function with your variables of interest as the grouping variables:

data(iris)
library(plyr) #if not installed do install.packages("plyr")

#Using "Species" as grouping variable
results = dlply(iris, c("Species"), lm_stats)

This will return a list of lists containing the output of summary, confint and coef for each species.
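
To illustrate how such a list of lists can be used, here is a small sketch. So that it runs without plyr, the list is built with base R's split() and lapply() as a stand-in for dlply; for a single grouping variable the element names (the species) should be the same.

```r
# Same statistics function as above
lm_stats <- function(x) {
  m <- lm(Sepal.Width ~ Sepal.Length, data = x)
  list(summary = summary(m), confint = confint(m), coef = coef(m))
}

# Base-R stand-in for dlply(iris, c("Species"), lm_stats)
results <- lapply(split(iris, iris$Species), lm_stats)

# Results for one group are accessed by name
results$setosa$coef
results$setosa$confint

# Collect the coefficients of all groups in one matrix (one row per species)
coef_table <- t(sapply(results, function(r) r$coef))
coef_table
```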

For your specific case, the function could look like (not tested):

ordFit_stats = function(x){

  #Remove duplicates
  x = x[!duplicated(x$wordTar), ]

  # Make log frequencies
  x$lFreq <- log(x$Freq)

  # Make ranks
  x$rankF <- as.factor(seq(nrow(x)))

  # Fit model
  m <- clm(rankF~lFreq*Len, data=x, link='probit')

  # Return list of statistics
  return(list(summary = summary(m), confint = confint(m), coef = coef(m)))
}

And then:

results = dlply(eyeData, c("sNumber", "TrialNo"), ordFit_stats)
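
If a human-readable text file is wanted rather than an R object, the printed output of the whole list can be written to one file. This is a sketch under assumptions: the list is again built from iris with base R as a stand-in for the dlply result, and the file name results.txt is arbitrary.

```r
# Stand-in for the list returned by dlply
results <- lapply(split(iris, iris$Species), function(x) {
  m <- lm(Sepal.Width ~ Sepal.Length, data = x)
  list(summary = summary(m), confint = confint(m), coef = coef(m))
})

# capture.output() redirects what print() would show on the console to a file
out_file <- file.path(tempdir(), "results.txt")
capture.output(print(results), file = out_file)

# Inspect the first lines of the file
head(readLines(out_file))
```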
