R: applying if-else to a list

Question

I am new to R, so some of my terminology may not be fully correct. I have a set of files that I read into a list (only the first 3 lines of each are shown here):

myfiles<-lapply(list.files(".",pattern="tab",full.names=T),read.table,skip="#")
myfiles
[[1]]
       V1  V2         V3
1   10001  33 -0.0499469
2   30001  65  0.0991478
3   50001  54  0.1564400

[[2]]
       V1  V2        V3
1   10001  62 0.0855260
2   30001  74 0.1536640
3   50001  71 0.1020960

[[3]]
       V1  V2          V3
1   10001  49 -0.04661360
2   30001  65  0.16961500
3   50001  61  0.07089600

I want to apply an ifelse condition to substitute values in a column and then get back a list with the same structure. However, when I do this:

myfiles<-lapply(myfiles,function(x) ifelse(x$V2>50, x$V3, NA))
myfiles
[[1]]
 [1]         NA  0.0991478  0.1564400

[[2]]
 [1] 0.0855260 0.1536640 0.1020960

[[3]]
 [1]          NA  0.16961500  0.07089600

This does compute the values I want, but it returns only the column the function was applied to. I want the same list as before, with all 3 columns and the substitutions in place.

I guess there should be an easy way to do this with some variant of "apply", but I was not able to find it.

Thanks

Answer 1

Perhaps this helps:

 lapply(myfiles,within, V3 <- ifelse(V2 >50, V3, NA))


 #[[1]]
 #    V1 V2        V3
 #1 10001 33        NA
 #2 30001 65 0.0991478
 #3 50001 54 0.1564400

 #[[2]]
 #    V1 V2       V3
 #1 10001 62 0.085526
 #2 30001 74 0.153664
 #3 50001 71 0.102096

#[[3]]
#     V1 V2       V3
#1 10001 49       NA
#2 30001 65 0.169615
#3 50001 61 0.070896
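
For readers who find the `within` call opaque, the same step can be written as an explicit function that modifies each data frame and returns it whole. This is a minimal sketch on made-up in-memory data; the `demo` list is hypothetical and only stands in for `myfiles`:

```r
# Hypothetical stand-in for `myfiles`: a list of two small data frames
demo <- list(
  data.frame(V1 = c(10001, 30001, 50001), V2 = c(33, 65, 54),
             V3 = c(-0.0499469, 0.0991478, 0.1564400)),
  data.frame(V1 = c(10001, 30001, 50001), V2 = c(62, 74, 71),
             V3 = c(0.0855260, 0.1536640, 0.1020960))
)

# Assign the ifelse() result back into the column, then return the
# whole data frame so that lapply() keeps all three columns
result <- lapply(demo, function(x) {
  x$V3 <- ifelse(x$V2 > 50, x$V3, NA)
  x
})
```

The attempt in the question returned bare vectors because the value of its anonymous function was the `ifelse()` result itself, not the modified data frame.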

Update

Another option is to read the files with fread from data.table, which is fast:

library(data.table)
files <- list.files(pattern='tab')
lapply(files, function(x) fread(x)[V2<=50,V3:=NA] )
#[[1]]
#     V1 V2        V3
#1: 10001 33        NA
#2: 30001 65 0.0991478
#3: 50001 54 0.1564400

#[[2]]
#     V1 V2       V3
#1: 10001 62 0.085526
#2: 30001 74 0.153664
#3: 50001 71 0.102096

#[[3]]
#     V1 V2       V3
#1: 10001 49       NA
#2: 30001 65 0.169615
#3: 50001 61 0.070896

Or, as @Richie Cotton mentioned, you could bind the datasets together using rbindlist and then do the operation in one step.

 library(tools)
 dt1 <- rbindlist(lapply(files, function(x) 
      fread(x)[,id:= basename(file_path_sans_ext(x))] ))[V2<=50, V3:=NA]

 dt1
 #     V1 V2        V3   id
 #1: 10001 33        NA tab1
 #2: 30001 65 0.0991478 tab1
 #3: 50001 54 0.1564400 tab1
 #4: 10001 62 0.0855260 tab2
 #5: 30001 74 0.1536640 tab2
 #6: 50001 71 0.1020960 tab2
 #7: 10001 49        NA tab3
 #8: 30001 65 0.1696150 tab3
 #9: 50001 61 0.0708960 tab3
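
If the tables are already in memory as a named list, `rbindlist` can add the identifier itself through its `idcol` argument, and `split` can restore the list afterwards. The following is only a sketch on hypothetical data: the `demo` list and the `id` column name are illustrative, and calling `split` on a data.table with `by` assumes a reasonably recent data.table version.

```r
library(data.table)

# Hypothetical stand-in for the tables read from the files
demo <- list(
  tab1 = data.table(V1 = c(10001, 30001), V2 = c(33, 65), V3 = c(-0.05, 0.10)),
  tab2 = data.table(V1 = c(10001, 30001), V2 = c(62, 74), V3 = c(0.09, 0.15))
)

# Stack the tables; idcol records the list name each row came from
dt <- rbindlist(demo, idcol = "id")

# One update by reference covers every table
dt[V2 <= 50, V3 := NA]

# Split back into a list of tables, dropping the id column again
back <- split(dt, by = "id", keep.by = FALSE)
```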

Answer 2

You can use lapply with transform or within. There are three possibilities:

a) ifelse

 lapply(myfiles, transform, V3 = ifelse(V2 > 50, V3, NA))

b) mathematical operators (potentially more efficient)
```
 lapply(myfiles, transform, V3 = NA ^ (V2 <= 50) * V3) 
```

c) is.na<-

 lapply(myfiles, within, is.na(V3) <- V2 <= 50)

The result

[[1]]
     V1 V2        V3
1 10001 33        NA
2 30001 65 0.0991478
3 50001 54 0.1564400

[[2]]
     V1 V2       V3
1 10001 62 0.085526
2 30001 74 0.153664
3 50001 71 0.102096

[[3]]
     V1 V2       V3
1 10001 49       NA
2 30001 65 0.169615
3 50001 61 0.070896
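
To see that the variants agree, including at the boundary value `V2 == 50`, they can be run side by side on a single data frame. This is a sketch on made-up data (`demo` and the `res_*` names are hypothetical); variant (c) is written here with `V2 <= 50`, which is the exact complement of the `V2 > 50` test used in (a).

```r
# Hypothetical data frame with one row exactly at the cutoff (V2 == 50)
demo <- data.frame(V1 = c(10001, 30001, 50001, 70001),
                   V2 = c(33, 65, 50, 54),
                   V3 = c(-0.05, 0.10, 0.20, 0.16))

res_a <- transform(demo, V3 = ifelse(V2 > 50, V3, NA))

# NA ^ 0 is 1 and NA ^ 1 is NA, so V3 is multiplied by 1 or by NA
res_b <- transform(demo, V3 = NA ^ (V2 <= 50) * V3)

res_c <- within(demo, is.na(V3) <- V2 <= 50)

# All three variants should produce the same data frame
stopifnot(isTRUE(all.equal(res_a, res_b)),
          isTRUE(all.equal(res_a, res_c)))
```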

Answer 3

This seems harder than it should be because you are working with a list of data frames rather than a single data frame. You can combine all the data frames into a single one using bind_rows in dplyr (rbind_all, which this answer originally used, has since been replaced by bind_rows).

library(dplyr)
# Some variable renaming for clarity:
# myfiles now refers to the file names; mydata now contains the data
myfiles <- list.files(pattern="tab", full.names=TRUE) 
mydata <- lapply(myfiles, read.table, skip="#")

# Get the number of rows in each data frame
n_rows <- vapply(mydata, nrow, integer(1))
# Combine the list of data frames into a single data frame
all_mydata <- bind_rows(mydata)
# Add an identifier to see which data frame the row came from.
# times (not each) is needed because n_rows holds one count per file
all_mydata$file <- rep(myfiles, times = n_rows)

# Now update column 3: V3 becomes NA wherever V2 is not above 50
is.na(all_mydata$V3) <- all_mydata$V2 <= 50
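
The row-labelling step depends on how `rep` handles a vector of counts: `times` accepts one count per element, whereas `each` expects a single number and uses only the first value if given several. A small base R illustration, with hypothetical file labels and row counts:

```r
file_names <- c("tab1", "tab2", "tab3")   # hypothetical file labels
n_rows <- c(2L, 3L, 1L)                   # hypothetical rows per file

# times = n_rows repeats each label by its own count
labels <- rep(file_names, times = n_rows)
labels
# "tab1" "tab1" "tab2" "tab2" "tab2" "tab3"
```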

Answer 4

Try adding an id column to each data frame and binding them together:

for(i in 1:3) myfiles[[i]]$id = i
ddf = myfiles[[1]]
for(i in 2:3) ddf = rbind(ddf, myfiles[[i]])

Then apply the change to the combined data frame and split it back again:

ddf$V3 = ifelse(ddf$V2>50, ddf$V3, NA)
myfiles = lapply(split(ddf, ddf$id), function(x) x[1:3])

myfiles
$`1`
     V1 V2        V3
1 10001 33        NA
2 30001 65 0.0991478
3 50001 54 0.1564400

$`2`
      V1 V2       V3
11 10001 62 0.085526
21 30001 74 0.153664
31 50001 71 0.102096

$`3`
      V1 V2       V3
12 10001 49       NA
22 30001 65 0.169615
32 50001 61 0.070896
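
The same bind, modify, split pattern can be written without hard-coding the number of data frames. This is a sketch on hypothetical data (`demo`, `ddf` and `back` are illustrative names), using `seq_along` for the ids and a single `do.call(rbind, ...)` in place of the second loop:

```r
# Hypothetical stand-in for `myfiles`
demo <- list(
  data.frame(V1 = c(10001, 30001), V2 = c(33, 65), V3 = c(-0.05, 0.10)),
  data.frame(V1 = c(10001, 30001), V2 = c(62, 74), V3 = c(0.09, 0.15)),
  data.frame(V1 = c(10001, 30001), V2 = c(49, 65), V3 = c(-0.05, 0.17))
)

# Tag every data frame with its position in the list
for (i in seq_along(demo)) demo[[i]]$id <- i

# Bind all data frames in one call, whatever the list length
ddf <- do.call(rbind, demo)

# Apply the substitution once on the combined data frame
ddf$V3 <- ifelse(ddf$V2 > 50, ddf$V3, NA)

# Split back by id and keep only the original three columns
back <- lapply(split(ddf, ddf$id), function(x) x[c("V1", "V2", "V3")])
```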

4 answers

Answer 1
3 2014-11-13 10:24:21

Answer 2
2 accepted 2014-11-13 10:25:06

Answer 3
1 2014-11-13 11:10:39

Answer 4
0 2014-11-13 12:05:06
