Applying conditional functions to complex data.frame in R

I have a data frame whose single column, Var, is a list of data frames. It looks like this:

> df
                               Var
1           word_1, word_2, word_3
2   word_1, word_2, word_3, word_4

> dput(df)
structure(list(df = list(structure(list(N = c("word_1", "word_2", "word_3")), 
.Names = "N", row.names = c(NA, -3L), class = "data.frame"), structure(list(N 
= c("word_1", "word_2", "word_3", "word_4")), 
.Names = "N", row.names = c(NA, -4L), class = "data.frame"))), .Names = "Var", 
row.names = c(NA, -2L), class = "data.frame") 

I want to apply a function to the data such that if a word matches a condition, it is replaced. I'm trying something like this:

func_1 <- function(dataset, condition){
require(data.table)
setDT(dataset)[, lapply(.SD, function(x) ifelse(x == condition, "A", x))]
}

df <- lapply(df, func_1, condition = "word_2")

But I get the error:

Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = 
nr,  : 
'df' must be of a vector type, was 'NULL'

I also need a function much like func_1, except that it should replace words where the condition occurs anywhere in the word. For example, func_2 would replace any word containing a "_" with some character, say B. Any guidance would be much appreciated! Thanks :)

Here is a dplyr solution to your first problem:

condition <- "word_2"
library(dplyr)
mutate(df, Var = lapply(Var, mutate, N = ifelse(N == condition, "A", N)))
#                         Var
# 1         word_1, A, word_3
# 2 word_1, A, word_3, word_4
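
As a side note on the attempt in the question: lapply() on a data frame iterates over its columns, so func_1 was handed the whole list column Var rather than each inner data frame. A small check, assuming df is rebuilt as in the dput above (outer_classes and inner_classes are illustrative names only):

```r
# Rebuild the example: one list column, Var, holding two data frames.
inner <- list(
  data.frame(N = c("word_1", "word_2", "word_3"), stringsAsFactors = FALSE),
  data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
)
df <- structure(list(Var = inner), row.names = c(NA, -2L), class = "data.frame")

# Iterating over df visits each column: here that is the list column itself.
outer_classes <- vapply(df, function(col) class(col)[1], character(1))

# Iterating over df$Var visits each inner data frame, which is what func_1 expects.
inner_classes <- vapply(df$Var, function(x) class(x)[1], character(1))

print(outer_classes)
print(inner_classes)
```

This is why the solutions here all loop over df$Var (or Var inside mutate) instead of over df.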

A translation in base R:

"$<-"(df, Var, lapply(df$Var, function(x)
  "$<-"(x, N, ifelse(x$N == condition, "A", x$N))
))
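
If the nested "$<-" calls are hard to read, the same idea can be written with ordinary assignment. This is only a sketch of an equivalent, assuming df is built as in the dput above; res is an illustrative name used so that df itself stays untouched:

```r
# Rebuild the example: one list column, Var, holding two data frames.
inner <- list(
  data.frame(N = c("word_1", "word_2", "word_3"), stringsAsFactors = FALSE),
  data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
)
df <- structure(list(Var = inner), row.names = c(NA, -2L), class = "data.frame")

condition <- "word_2"

# Work on a copy; replace the list column with a modified version of itself.
res <- df
res$Var <- lapply(df$Var, function(x) {
  # x is one inner data frame; swap exact matches in its N column for "A".
  x$N <- ifelse(x$N == condition, "A", x$N)
  x
})

print(res$Var[[1]]$N)
print(res$Var[[2]]$N)
```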

Since you seem to use data.table, I tried to put together a data.table equivalent, but I'm not too familiar with the syntax, so it might not be very idiomatic:

library(data.table)
DT <- as.data.table(df)
DT[, .(Var = list(as.data.table(Var)[, ifelse(N == condition, "A", N)])), by = seq_len(nrow(DT))]

For your second problem, simply replace N == condition with grepl(condition, N):

mutate(df, Var = lapply(Var, mutate, N = ifelse(grepl("_", N), "B", N)))
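
To get something shaped like func_1 and func_2 from the question, both replacements could be wrapped in one base R helper. This is a rough sketch under some assumptions: replace_words and its partial argument are hypothetical names, and fixed = TRUE assumes the condition should be matched literally rather than as a regular expression (for "_" it makes no difference, for something like "." it would):

```r
# Rebuild the example: one list column, Var, holding two data frames.
inner <- list(
  data.frame(N = c("word_1", "word_2", "word_3"), stringsAsFactors = FALSE),
  data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
)
df <- structure(list(Var = inner), row.names = c(NA, -2L), class = "data.frame")

# Hypothetical helper: replace words in every inner data frame.
#   partial = FALSE: replace words equal to `condition` (like func_1)
#   partial = TRUE:  replace words containing `condition` (like func_2)
replace_words <- function(data, condition, replacement, partial = FALSE) {
  data$Var <- lapply(data$Var, function(x) {
    hit <- if (partial) {
      grepl(condition, x$N, fixed = TRUE)  # literal substring match
    } else {
      x$N == condition                     # exact match
    }
    x$N[hit] <- replacement
    x
  })
  data
}

exact_res   <- replace_words(df, "word_2", "A")
partial_res <- replace_words(df, "_", "B", partial = TRUE)

print(exact_res$Var[[2]]$N)
print(partial_res$Var[[2]]$N)
```

Since every word in the example contains "_", the partial version turns all of them into B, which matches what the grepl one-liner above does.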
