How to create a data.frame from RData lists with different numbers of rows

I have a file in RData format: https://stepik.org/media/attachments/course/724/all_data.Rdata

The file contains 7 lists with patient ids and temperatures. I need to make one data.frame from these lists and then remove all rows with NA.

    id     temp   i.temp i.temp.1 i.temp.2 i.temp.3 i.temp.4 i.temp.5
 1:  1 36.70378 36.73161 36.22944 36.05907 35.66014 37.32798 35.88121
 2:  2 36.43545 35.96814 36.86782 37.20890 36.45172 36.82727 36.83450
 3:  3 36.87599 36.38842 36.70508 37.44710 36.73362 37.09359 35.92993
 4:  4 36.17120 35.95853 36.33405 36.45134 37.17186 36.87482 35.45489
 5:  5 37.20341 37.04881 36.53252 36.22922 36.78106 36.89219 37.13207
 6:  6 36.12201 36.53433 37.29784 35.96451 36.70838 36.58684 36.60122
 7:  7 36.92314 36.16220 36.48154 37.05324 36.57829 36.24955 37.23835
 8:  8 35.71390 37.26879 37.01673 36.65364 36.89143 36.46331 37.15398
 9:  9 36.63558 37.03452 36.40129 37.53705 36.03568 36.78083 36.71873
10: 10 36.77329 36.07161 36.42992 36.20715 36.78880 36.79875 36.15004
11: 11 36.66199 36.74958 36.28661 36.72539 36.17700 37.47495 35.60980
12: 12       NA 36.97689 36.00473 36.64292 35.96789 36.73904 36.93957
13: 13       NA       NA       NA       NA       NA 36.63760 36.83916
14: 14 37.40307 35.89668 36.30619 36.64382 37.21882 35.87420 35.45550
15: 15       NA       NA       NA 37.03758 36.72512 36.45281 37.54388
16: 16       NA 36.44912 36.57126 36.20703 36.83076 36.48287 35.99391
17: 17       NA       NA       NA 36.39900 36.54043 36.75989 36.47079
18: 18 36.51696 37.09903 37.31166 36.51000 36.42414 36.87976 36.45736
19: 19 37.05117 37.42526 36.15820 36.11824 37.07024 36.60699 36.80168
20: 20       NA       NA       NA       NA       NA       NA 36.74118

I wrote:

# load() cannot open a URL given as a plain string; wrap it in url()
load(url("https://stepik.org/media/attachments/course/724/all_data.Rdata"))
library(data.table) 
day1<-as.data.table(all_data[1])
day2<-as.data.table(all_data[2])
day3<-as.data.table(all_data[3])
day4<-as.data.table(all_data[4])
day5<-as.data.table(all_data[5])
day6<-as.data.table(all_data[6])
day7<-as.data.table(all_data[7])
setkey(day1, id)
setkey(day2, id)
setkey(day3, id)
setkey(day4, id)
setkey(day5, id)
setkey(day6, id)
setkey(day7, id)
all_day<-day1[day2,][day3, ][day4,][day5,][day6,][day7,]
all_day<-na.omit(all_day)

But it takes too long. How can I make it faster?

Here is a data.table solution:

library(data.table)
# set names for all_data
names(all_data) <- paste0("day", seq_along(all_data))
# bind the list elements into one long data.table; idcol records the source day
DT <- rbindlist(all_data, use.names = TRUE, fill = TRUE, idcol = "day")
# cast to wide: one row per id, one column per day
ans <- dcast(DT, id ~ day, value.var = "temp")
# only keep complete rows and present output (using [] at the end)
ans[complete.cases(ans), ][]


#     id     day1     day2     day3     day4     day5     day6     day7
#  1:  1 36.70378 36.73161 36.22944 36.05907 35.66014 37.32798 35.88121
#  2:  2 36.43545 35.96814 36.86782 37.20890 36.45172 36.82727 36.83450
#  3:  3 36.87599 36.38842 36.70508 37.44710 36.73362 37.09359 35.92993
#  4:  4 36.17120 35.95853 36.33405 36.45134 37.17186 36.87482 35.45489
#  5:  5 37.20341 37.04881 36.53252 36.22922 36.78106 36.89219 37.13207
#  6:  6 36.12201 36.53433 37.29784 35.96451 36.70838 36.58684 36.60122
#  7:  7 36.92314 36.16220 36.48154 37.05324 36.57829 36.24955 37.23835
#  8:  8 35.71390 37.26879 37.01673 36.65364 36.89143 36.46331 37.15398
#  9:  9 36.63558 37.03452 36.40129 37.53705 36.03568 36.78083 36.71873
# 10: 10 36.77329 36.07161 36.42992 36.20715 36.78880 36.79875 36.15004
# 11: 11 36.66199 36.74958 36.28661 36.72539 36.17700 37.47495 35.60980
# 12: 14 37.40307 35.89668 36.30619 36.64382 37.21882 35.87420 35.45550
# 13: 18 36.51696 37.09903 37.31166 36.51000 36.42414 36.87976 36.45736
# 14: 19 37.05117 37.42526 36.15820 36.11824 37.07024 36.60699 36.80168
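
To try the same approach without downloading the file, here is a minimal sketch on made-up data. It assumes that all_data is a list of data.frames, each with an id and a temp column. The list toy_data, its values, and the name complete_rows are hypothetical stand-ins, not the real Stepik data.

```r
library(data.table)

# Hypothetical stand-in for all_data: three daily data.frames,
# each covering a different subset of patient ids.
toy_data <- list(
  data.frame(id = 1:4,            temp = c(36.5, 36.8, 37.1, 36.2)),
  data.frame(id = 1:3,            temp = c(36.6, 36.9, 37.0)),
  data.frame(id = c(1L, 3L, 4L),  temp = c(36.4, 36.7, 36.3))
)
names(toy_data) <- paste0("day", seq_along(toy_data))

# Stack all days into one long table; the "day" column holds the list names.
long <- rbindlist(toy_data, use.names = TRUE, idcol = "day")

# Reshape to wide: one row per id, one column per day.
# An id that is absent on some day gets NA in that day's column.
wide <- dcast(long, id ~ day, value.var = "temp")

# Keep only the patients that were measured on every day.
complete_rows <- wide[complete.cases(wide)]
print(complete_rows)
```

In this toy input, id 2 is missing on day 3 and id 4 is missing on day 2, so only ids 1 and 3 remain. Stacking and reshaping once avoids the chain of keyed joins, which may be why it runs faster on the real data, although that has not been measured here.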
