Applying conditional functions to complex data.frame in R

I have a data frame of data frames that looks like this:

> df
                               Var
1           word_1, word_2, word_3
2   word_1, word_2, word_3, word_4

> dput(df)
structure(list(df = list(structure(list(N = c("word_1", "word_2", "word_3")), 
.Names = "N", row.names = c(NA, -3L), class = "data.frame"), structure(list(N 
= c("word_1", "word_2", "word_3", "word_4")), 
.Names = "N", row.names = c(NA, -4L), class = "data.frame"))), .Names = "Var", 
row.names = c(NA, -2L), class = "data.frame") 

I want to apply a function to the data such that if a word matches a condition, it is replaced. I'm trying something like this:

func_1 <- function(dataset, condition){
require(data.table)
setDT(dataset)[, lapply(.SD, function(x) ifelse(x == condition, "A", x))]
}

df <- lapply(df, func_1, condition = "word_2")

But I get the error:

Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = 
nr,  : 
'df' must be of a vector type, was 'NULL'

I also need a function much like func_1, except that I want to be able to replace words where the condition occurs somewhere in the word. For example, func_2 would be such that any word containing a "_" is replaced by some character, say B. Any guidance would be much appreciated! Thanks :)

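Calling lapply() directly on df iterates over the columns of df, so func_1 receives the whole list column Var rather than each inner data frame. Here is a dplyr solution to your first problem that maps over the elements of Var instead: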

condition <- "word_2"
library(dplyr)
mutate(df, Var = lapply(Var, mutate, N = ifelse(N == condition, "A", N)))
#                         Var
# 1         word_1, A, word_3
# 2 word_1, A, word_3, word_4

A translation in base R:

"$<-"(df, Var, lapply(df$Var, function(x)
  "$<-"(x, N, ifelse(x$N == condition, "A", x$N))
))
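
The same idea can also be written with ordinary assignment, which some readers may find easier to follow than calling "$<-" as a function. This is only a minimal sketch: it rebuilds the example data by hand (with stringsAsFactors = FALSE, an assumption so that N stays character on older R versions), and the name inner is purely illustrative.

```r
# Rebuild the example: one list column `Var` holding two small data frames.
df <- structure(
  list(Var = list(
    data.frame(N = c("word_1", "word_2", "word_3"),
               stringsAsFactors = FALSE),
    data.frame(N = c("word_1", "word_2", "word_3", "word_4"),
               stringsAsFactors = FALSE)
  )),
  row.names = c(NA, -2L),
  class = "data.frame"
)

condition <- "word_2"

# Loop over the elements of df$Var (the inner data frames), replace the
# matching words in column N, and return each modified inner data frame.
df$Var <- lapply(df$Var, function(inner) {
  inner$N <- ifelse(inner$N == condition, "A", inner$N)
  inner
})

df$Var[[1]]$N
df$Var[[2]]$N
```

After running this, each element of df$Var is still a data frame with a single column N, so the structure of the original object is preserved.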

Since you seem to use data.table, I tried to write a data.table equivalent, but I'm not very familiar with the syntax, so it may not be idiomatic:

library(data.table)
DT <- as.data.table(df)
DT[, .(Var = list(as.data.table(Var)[, ifelse(N == condition, "A", N)])), by = seq_len(nrow(DT))]

For your second problem, simply replace N == condition with grepl(condition, N):

mutate(df, Var = lapply(Var, mutate, N = ifelse(grepl("_", N), "B", N)))
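
If both behaviours are wanted as reusable functions, as the question's func_1 and func_2 suggest, they could be folded into one base R helper. The sketch below is an assumption-laden illustration, not part of the answer above: replace_words, its arguments and the exact switch are hypothetical names, and the partial match uses grepl(..., fixed = TRUE) so that the pattern is treated as a literal string rather than a regular expression.

```r
# Example data: one list column `Var` holding two small data frames.
df <- structure(
  list(Var = list(
    data.frame(N = c("word_1", "word_2", "word_3"),
               stringsAsFactors = FALSE),
    data.frame(N = c("word_1", "word_2", "word_3", "word_4"),
               stringsAsFactors = FALSE)
  )),
  row.names = c(NA, -2L),
  class = "data.frame"
)

# Hypothetical helper (name and arguments are illustrative only).
# exact = TRUE  : replace words equal to `pattern`      (like func_1)
# exact = FALSE : replace words containing `pattern`    (like func_2)
replace_words <- function(data, pattern, replacement, exact = TRUE) {
  data$Var <- lapply(data$Var, function(inner) {
    hit <- if (exact) {
      inner$N == pattern
    } else {
      # fixed = TRUE matches `pattern` literally, not as a regex
      grepl(pattern, inner$N, fixed = TRUE)
    }
    inner$N[hit] <- replacement
    inner
  })
  data
}

out_exact   <- replace_words(df, "word_2", "A")
out_partial <- replace_words(df, "_", "B", exact = FALSE)
```

Dropping fixed = TRUE would make the partial match behave like the grepl() call above, where the pattern is interpreted as a regular expression.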
