
Exclude columns from apply function based on column name

I have some data:

df <- data.frame(v1 = c('word',NA,'word','word',NA,'word','word',NA,'word','word'), 
                 v1_open = c('word',NA,'word','word',NA,'word','word',NA,'word','word'),
                 v2 = c('word','word',NA,'word','word',NA,'word','word',NA,'word'), 
                 v2_open = c('word','word',NA,'word','word',NA,'word','word',NA,'word'))

I am using apply to change observations that are NA to 0 and all other observations to 1:

df <- t(apply(df,1,function(x){
  ifelse(is.na(x) ,0,1)
}))

Which returns:

      v1 v1_open v2 v2_open
 [1,]  1       1  1       1
 [2,]  0       0  1       1
 [3,]  1       1  0       0
 [4,]  1       1  1       1
 [5,]  0       0  1       1
 [6,]  1       1  0       0
 [7,]  1       1  1       1
 [8,]  0       0  1       1
 [9,]  1       1  0       0
[10,]  1       1  1       1

I would like to modify the apply function to exclude columns whose names contain the text '_open', resulting in:

      v1 v1_open v2 v2_open
 [1,]  1    word  1    word  
 [2,]  0    NA    1    word  
 [3,]  1    word  0    NA    
 [4,]  1    word  1    word  
 [5,]  0    NA    1    word  
 [6,]  1    word  0    NA    
 [7,]  1    word  1    word  
 [8,]  0    NA    1    word  
 [9,]  1    word  0    NA    
[10,]  1    word  1    word  

How can I achieve this?

You can do:

library(dplyr)

df %>%
  mutate_at(vars(-contains("_open")),
            ~ +(!is.na(.)))

Output:

   v1 v1_open v2 v2_open
1   1    word  1    word
2   0    <NA>  1    word
3   1    word  0    <NA>
4   1    word  1    word
5   0    <NA>  1    word
6   1    word  0    <NA>
7   1    word  1    word
8   0    <NA>  1    word
9   1    word  0    <NA>
10  1    word  1    word
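
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). A minimal sketch of the same idea, assuming the data from the question and a dplyr version that provides across(); the name res is introduced here only for illustration:

```r
library(dplyr)

# Same data as in the question, built with rep() for brevity
df <- data.frame(
  v1      = rep(c('word', NA, 'word'), length.out = 10),
  v1_open = rep(c('word', NA, 'word'), length.out = 10),
  v2      = rep(c('word', 'word', NA), length.out = 10),
  v2_open = rep(c('word', 'word', NA), length.out = 10)
)

# Recode every column whose name does NOT contain "_open".
# !is.na(.x) is TRUE for non-missing values; unary + converts TRUE/FALSE to 1/0.
res <- df %>%
  mutate(across(!contains("_open"), ~ +(!is.na(.x))))

res
```

The "_open" columns are not selected, so they pass through unchanged.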

We can apply is.na directly to the subset of columns, without any loop, and then update those columns:

nm1 <- grep("_open", names(df), value = TRUE, invert = TRUE)
df[nm1] <- +(!is.na(df[nm1]))
df
#   v1 v1_open v2 v2_open
#1   1    word  1    word
#2   0    <NA>  1    word
#3   1    word  0    <NA>
#4   1    word  1    word
#5   0    <NA>  1    word
#6   1    word  0    <NA>
#7   1    word  1    word
#8   0    <NA>  1    word
#9   1    word  0    <NA>
#10  1    word  1    word
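
The question asked how to change the apply() call itself. A minimal sketch of that, assuming the data from the question: restrict apply() to the columns whose names do not contain "_open" and assign the result back. The names keep, res and res2 are hypothetical and used only for illustration.

```r
# Same data as in the question, built with rep() for brevity
df <- data.frame(
  v1      = rep(c('word', NA, 'word'), length.out = 10),
  v1_open = rep(c('word', NA, 'word'), length.out = 10),
  v2      = rep(c('word', 'word', NA), length.out = 10),
  v2_open = rep(c('word', 'word', NA), length.out = 10)
)

# TRUE for columns to recode, FALSE for the "_open" columns
keep <- !grepl("_open", names(df), fixed = TRUE)

# Variant 1: the original row-wise apply(), run only on the kept columns.
# Assumes at least two columns are kept, so that t(apply(...)) is a
# matrix with one row per observation.
res <- df
res[keep] <- t(apply(df[keep], 1, function(x) ifelse(is.na(x), 0, 1)))

# Variant 2: column-wise with lapply(), which also works for a single column
res2 <- df
res2[keep] <- lapply(df[keep], function(x) as.integer(!is.na(x)))
```

Because only a subset of columns is replaced, the result stays a data frame and the "_open" columns keep their character values. Calling apply() on the whole data frame returns a single matrix, which cannot hold numbers and text side by side.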

If your columns alternate between .* and .*_open, then you can simply subset the columns with a recycled c(TRUE, FALSE), i.e.

df[c(TRUE, FALSE)] <- +(!is.na(df[c(TRUE, FALSE)]))

df
#   v1 v1_open v2 v2_open
#1   1    word  1    word
#2   0    <NA>  1    word
#3   1    word  0    <NA>
#4   1    word  1    word
#5   0    <NA>  1    word
#6   1    word  0    <NA>
#7   1    word  1    word
#8   0    <NA>  1    word
#9   1    word  0    <NA>
#10  1    word  1    word
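
The positional index relies on R recycling a logical vector that is shorter than the number of columns. A small sketch of what it selects, assuming the data from the question; odd_cols, by_name and df2 are hypothetical names used only for illustration:

```r
# Same data as in the question, built with rep() for brevity
df <- data.frame(
  v1      = rep(c('word', NA, 'word'), length.out = 10),
  v1_open = rep(c('word', NA, 'word'), length.out = 10),
  v2      = rep(c('word', 'word', NA), length.out = 10),
  v2_open = rep(c('word', 'word', NA), length.out = 10)
)

# c(TRUE, FALSE) is recycled to TRUE, FALSE, TRUE, FALSE for four columns,
# so it selects columns 1 and 3: "v1" and "v2"
odd_cols <- names(df)[c(TRUE, FALSE)]

# Selecting by name gives the same columns for this layout
by_name <- grep("_open", names(df), value = TRUE, invert = TRUE)
identical(odd_cols, by_name)

# If the columns are reordered, the positional index no longer matches:
# here it selects "v1" and "v1_open"
df2 <- df[c("v1", "v2", "v1_open", "v2_open")]
names(df2)[c(TRUE, FALSE)]
```

The shortcut is therefore only safe when the strict alternation is guaranteed; the name-based selection does not depend on column order.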

