Applying conditional functions to complex data.frame in R
I have a data frame of data frames (a single list column, Var, whose elements are data frames) that looks like this:
> df
Var
1 word_1, word_2, word_3
2 word_1, word_2, word_3, word_4
> dput(df)
structure(list(df = list(structure(list(N = c("word_1", "word_2", "word_3")),
.Names = "N", row.names = c(NA, -3L), class = "data.frame"), structure(list(N
= c("word_1", "word_2", "word_3", "word_4")),
.Names = "N", row.names = c(NA, -4L), class = "data.frame"))), .Names = "Var",
row.names = c(NA, -2L), class = "data.frame")
I want to apply a function to the data such that if a word matches a condition, it is replaced. I'm trying something like this:
func_1 <- function(dataset, condition){
require(data.table)
setDT(dataset)[, lapply(.SD, function(x) ifelse(x == condition, "A", x))]
}
df <- lapply(df, func_1, condition = "word_2")
But I get the error:
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow =
nr, :
'df' must be of a vector type, was 'NULL'
I also need a function much like func_1, except that it should replace words where the condition occurs anywhere in the word. For example, func_2 would replace any word containing a "_" with some character, say B. Any guidance would be much appreciated! Thanks :)
Here is a dplyr solution to your first problem:
condition <- "word_2"
library(dplyr)
mutate(df, Var = lapply(Var, mutate, N = ifelse(N == condition, "A", N)))
# Var
# 1 word_1, A, word_3
# 2 word_1, A, word_3, word_4
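One likely reason the original attempt fails is that lapply(df, ...) iterates over the columns of df, so func_1 receives the whole list column Var rather than each inner data frame. The sketch below only illustrates that difference; it rebuilds df from the dput() output in the question, and the names outer_classes and inner_classes are mine, not from the question.

```r
# Rebuild the example data: one list column `Var` holding two data frames
df <- structure(
  list(Var = list(
    data.frame(N = c("word_1", "word_2", "word_3"), stringsAsFactors = FALSE),
    data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
  )),
  row.names = c(NA, -2L), class = "data.frame"
)

# Iterating over df visits its columns: the single element seen is a plain list
outer_classes <- vapply(df, function(col) class(col)[1], character(1))
print(outer_classes)

# Iterating over df$Var visits each inner data frame, which is what func_1 expects
inner_classes <- vapply(df$Var, function(x) class(x)[1], character(1))
print(inner_classes)
```

So any function meant for the inner data frames has to be mapped over df$Var (as the mutate/lapply call above does), not over df itself.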
A translation in base R:
"$<-"(df, Var, lapply(df$Var, function(x)
"$<-"(x, N, ifelse(x$N == condition, "A", x$N))
))
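If the functional "$<-" form is hard to read, the same idea can be wrapped in an ordinary function with plain assignments. This is only a sketch: replace_exact and its replacement argument are hypothetical names of mine, and it assumes the list column is called Var and the inner column N, as in the question.

```r
# Example data rebuilt from the dput() output in the question
df <- structure(
  list(Var = list(
    data.frame(N = c("word_1", "word_2", "word_3"), stringsAsFactors = FALSE),
    data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
  )),
  row.names = c(NA, -2L), class = "data.frame"
)

# Hypothetical helper: replace words that are exactly equal to `condition`
replace_exact <- function(data, condition, replacement = "A") {
  data$Var <- lapply(data$Var, function(inner) {
    # Element-wise test on the inner column; non-matching words are kept
    inner$N <- ifelse(inner$N == condition, replacement, inner$N)
    inner
  })
  data
}

out <- replace_exact(df, condition = "word_2")
print(out$Var[[1]]$N)
print(out$Var[[2]]$N)
```

The function returns a modified copy, so the original df is left unchanged unless the result is assigned back to it.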
Since you seem to use data.table, I tried to put together a data.table equivalent, but I'm not too familiar with the syntax, so it might not be very idiomatic:
library(data.table)
DT <- as.data.table(df)
DT[, .(Var = list(as.data.table(Var)[, ifelse(N == condition, "A", N)])), by = seq_len(nrow(DT))]
For your second problem, simply replace N == condition with grepl(condition, N):
mutate(df, Var = lapply(Var, mutate, N = ifelse(grepl("_", N), "B", N)))
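A base R sketch of a func_2-style helper follows. One thing to keep in mind is that grepl() treats its pattern as a regular expression by default, so a condition such as "." would match every non-empty word; passing fixed = TRUE makes the match literal. The name replace_matching, its arguments, and the sample word "plain" are my own assumptions for illustration.

```r
# Example data: like the question's, plus one word without an underscore
df <- structure(
  list(Var = list(
    data.frame(N = c("word_1", "plain", "word_3"), stringsAsFactors = FALSE),
    data.frame(N = c("word_1", "word_2", "word_3", "word_4"), stringsAsFactors = FALSE)
  )),
  row.names = c(NA, -2L), class = "data.frame"
)

# Hypothetical helper: replace words that contain `pattern` anywhere
replace_matching <- function(data, pattern, replacement = "B", fixed = TRUE) {
  data$Var <- lapply(data$Var, function(inner) {
    # fixed = TRUE matches `pattern` as a literal string, not a regex
    hit <- grepl(pattern, inner$N, fixed = fixed)
    inner$N[hit] <- replacement
    inner
  })
  data
}

out <- replace_matching(df, pattern = "_")
print(out$Var[[1]]$N)
print(out$Var[[2]]$N)
```

Setting fixed = FALSE in the call restores regular-expression matching if the condition is meant to be a pattern.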