How to combine data from multiple .txt files into one dataframe (transposing required) in R

I'm struggling with the import and preparation of multiple data files. Each participant in my study has multiple data files in .txt format. I would like to create one master data file that contains the data from all participants.

I figured out how to do this with a subset of my data using the code below:

setwd("...")

data1 = read.delim("Data-2021-5-11-16-17-49.txt", header = F, sep=";")
data2 = read.delim("Data-2021-5-11-16-18-6.txt", header = F, sep=";")
data3 = read.delim("Data-2021-5-11-16-18-19.txt", header = F, sep=";")

data1 = as.data.frame(t(data1))
colnames(data1) = c("finalBarAngle", "finalBarAngleRelativeToCamera", "HeadTilt", "HeadTranslation", "BarNDisks", "BarImageRatio", "DiskBarRatio", "BarDepth", "BarColor", "BarTopColor", "BarAngle", "BarRight", "BarLeft", "BarReady", "BarStepAngle", "KeyRepeatedDelay", "BarStepAngle", "ImageFile", "ImageSize", "ImageSizeDepthRatio", "ImageAngle", "ImageSpeed", "ChromaKey", "BackgroundColor")
data1 = data1[-1,]

data2 = as.data.frame(t(data2))
colnames(data2) = colnames(data1)
data2 = data2[-1,]

data3 = as.data.frame(t(data3))
colnames(data3) = colnames(data1)
data3 = data3[-1,]

data = rbind(data1,data2,data3)

This works for me and generates the master data file that I want. However, since each participant has 80 of these data files, following this approach by hand would be very cumbersome. I'm therefore looking for a piece of code that does this automatically.

I tried using list.files and lapply, but I don't get the result I want. The problem is that the content of the list elements becomes a factor, and that I would first need to transpose the data frames within the list:

mypath = "..."
txt_files_ls = list.files(path=mypath, pattern="*.txt") 
txt_files_df <- lapply(txt_files_ls, function(x) {read.table(file = x, header = F, sep =";")})
transposeList <- lapply(txt_files_df,function(x){t(x)[-1,]})
combined_df <- do.call("rbind", lapply(transposeList, as.data.frame)) 
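A minimal sketch of one likely reason this attempt misbehaves, using a small made-up data frame in place of a real input file (the names `x`, `m`, `dropped`, and `kept` are illustrative only). Indexing the transposed matrix with `[-1, ]` leaves a single row, which R simplifies to a plain vector, so `as.data.frame` then yields one long column instead of one wide row. On R versions before 4.0.0 that column would also default to a factor.

```r
# Stand-in for one imported file: names in V1, values in V2 (made-up values)
x <- data.frame(V1 = c("finalBarAngle", "HeadTilt", "BarDepth"),
                V2 = c("10", "True", "2"),
                stringsAsFactors = FALSE)

# t() on a data frame returns a character matrix with 2 rows and 3 columns
m <- t(x)

# Dropping row 1 leaves a single row, which is simplified to a vector
dropped <- m[-1, ]

# drop = FALSE keeps the 1 x 3 matrix shape
kept <- m[-1, , drop = FALSE]

dim(as.data.frame(dropped))  # 3 rows, 1 column
dim(as.data.frame(kept))     # 1 row, 3 columns
```

Under these assumptions, adding `drop = FALSE` to the subsetting step is enough to keep each file as a single row before the `rbind`.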

Example data from one of the .txt files to reproduce the problem (all of the input files follow this structure):

finalBarAngle;10
finalBarAngleRelativeToCamera;15
HeadTilt;True
HeadTranslation;True
BarImageRatio;8
DiskBarRatio;0.8
DiskBarRatio;0.25
BarDepth;2
BarColor;RGBA(0.000, 1.000, 0.000, 1.000)
BarTopColor;RGBA(1.000, 0.000, 1.000, 1.000)
BarAngle;23
BarRight;Keypad6
BarLeft;Keypad0
BarReady;Keypad5
BarStepAngle;0.5
BarStepAngle;750
BarStepAngle;0.1
KeyRepeatDelay;Frame.png
BarRepeatedStep;500
ImageFile;0.6666667
ImageDepth;-28
ImageSizeDepthRatio;0
ImageAngle;RGBA(0.000, 0.000, 0.000, 1.000)
ImageSpeed;RGBA(0.000, 0.000, 0.000, 1.000)

Note that the variable names in my original files are not correct: some names are duplicated, so some other names are missing.

Maybe it would help to import the data with stringsAsFactors = FALSE and to transpose the data frames on the fly, within the first lapply call, instead of running a second loop just to transpose the data?

Something like:

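mypath = "..."
# full.names = TRUE returns paths that can be read from any working directory
txt_files_ls = list.files(path = mypath, pattern = "\\.txt$", full.names = TRUE)
txt_files_df <- lapply(txt_files_ls, function(x) {
     output <- read.table(file = x, header = FALSE, sep = ";", stringsAsFactors = FALSE)
     # drop the row of variable names; drop = FALSE keeps the values as a single row
     as.data.frame(t(output)[-1, , drop = FALSE], stringsAsFactors = FALSE)})

# bind the list returned by lapply (output only exists inside the function)
combined_df <- do.call(rbind, txt_files_df)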
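The idea above can be sketched end to end as follows. This is only an illustration under stated assumptions: `read_one` is a hypothetical helper, the two temporary files and their values are made up to mimic the `name;value` layout from the question, and the column names are assigned by hand because the names inside the files are said to be unreliable.

```r
# Hypothetical helper: read one "name;value" file and return its values
# as a one-row data frame (the names stored in the file are ignored)
read_one <- function(path) {
  raw <- read.table(path, header = FALSE, sep = ";",
                    stringsAsFactors = FALSE, colClasses = "character")
  as.data.frame(t(raw[[2]]), stringsAsFactors = FALSE)
}

# Two made-up input files in a temporary directory, for demonstration only
demo_dir <- tempfile("demo_txt_")
dir.create(demo_dir)
writeLines(c("finalBarAngle;10", "HeadTilt;True",
             "BarColor;RGBA(0.000, 1.000, 0.000, 1.000)"),
           file.path(demo_dir, "Data-a.txt"))
writeLines(c("finalBarAngle;12", "HeadTilt;False",
             "BarColor;RGBA(1.000, 0.000, 1.000, 1.000)"),
           file.path(demo_dir, "Data-b.txt"))

# Collect full paths so the files can be read from any working directory
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# One row per file, stacked into a single data frame
combined <- do.call(rbind, lapply(files, read_one))

# Assign the intended column names manually
colnames(combined) <- c("finalBarAngle", "HeadTilt", "BarColor")

# Optional: record which file each row came from
combined$file <- basename(files)
```

All columns come back as character here. If numeric columns are needed, converting them afterwards (for example with `as.numeric` on the relevant columns) is one option.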
