Filter vector elements containing and not containing multiple strings
Based on the code at this link, we can find the file names that contain multiple strings:
allpatterns <- function(fnames, patterns) {
  i <- sapply(fnames, function(fn) all(sapply(patterns, grepl, fn)))
  fnames[i]
}
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
"foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")
allpatterns(filenames, c("foo", "bar"))
# [1] "foo_bar" "bar.foo.cpp" "foo_bar_quux" "quux_foo.bar"
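Now I want to go one step further and add a condition on strings that must not be present. For example, I would like to keep the file names that contain foo and bar and do not contain cpp or quux, which would give the following result: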
# [1] "foo_bar"
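How can I achieve this by modifying the code above?
Edit: the following is based specifically on the answer from R 大師. Even though it does not give me exactly the expected result, it was inspiring: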
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
"foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar",
"nothing")
keep <- c("foo", "bar")
drop <- c("cpp", "quux")
paste0('', paste0(keep, collapse = ''))
keep_regex <- paste0("\\b(?:", paste(keep, collapse="|"), ")\\b")
drop_regex <- paste0("\\b(?:", paste(drop, collapse="|"), ")\\b")
result <- filenames[grepl(keep_regex, filenames) &
!grepl(drop_regex, filenames)]
result
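A likely reason the word-boundary version above does not return "foo_bar": `_` counts as a word character in regular expressions, so `\\b` does not match between `foo` and `_`. A minimal check, using two of the file names from the question:

```r
# "_" is a word character, so there is no word boundary between "foo" and "_".
# "." is not a word character, so "foo.txt" does have a boundary after "foo".
grepl("\\bfoo\\b", c("foo.txt", "foo_bar"))
# [1]  TRUE FALSE
```

Dropping the `\\b` anchors treats the strings as plain substrings, which seems to be what the question asks for.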
"foo" or "bar", without "cpp" and "quux":
filenames[grepl("foo|bar",filenames)&!grepl("cpp|quux",filenames)]
# [1] "foo.txt" "bar.R" "foo_bar"
"foo" and "bar", without "cpp" and "quux":
filenames[grepl("(?=.*foo)(?=.*bar)", filenames, perl = TRUE) & !grepl("cpp|quux", filenames)]
# [1] "foo_bar"
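To see how the lookahead pattern behaves on its own, here is a small sketch. Each `(?=.*x)` asserts that `x` occurs somewhere in the string without consuming characters, so both strings must be present, in either order. Lookaheads need `perl = TRUE`.

```r
# Both "foo" and "bar" must occur; the order does not matter.
grepl("(?=.*foo)(?=.*bar)", c("foo_bar", "bar.foo.cpp", "foo.txt"), perl = TRUE)
# [1]  TRUE  TRUE FALSE
```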
Maybe this function will help:
allpatterns <- function(fnames, keep, remove) {
  # Include if it contains all the `keep` variables
  i <- Reduce(`&`, lapply(keep, function(x) grepl(x, fnames)))
  # Drop if any of `remove` variable is present.
  j <- !Reduce(`|`, lapply(remove, function(x) grepl(x, fnames)))
  fnames[i & j]
}
allpatterns(filenames, c("foo", "bar"), c("cpp", "quux"))
#[1] "foo_bar"
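A possible variant of the same idea, as a sketch only: the name `filter_names` and its argument names are hypothetical, not taken from the answers above. It assumes the patterns are meant as literal substrings (hence `fixed = TRUE`, so a `.` in a pattern is not treated as a regex wildcard) and gives `Reduce` an initial value so that an empty `keep` or `drop` vector means "no constraint".

```r
filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
               "foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")

# Hypothetical helper: keep names containing every string in `keep`
# and none of the strings in `drop` (literal matching, not regex).
filter_names <- function(fnames, keep, drop) {
  # TRUE for each name that contains the literal substring `p`
  has <- function(p) grepl(p, fnames, fixed = TRUE)
  # Start from all TRUE, so an empty `keep` keeps everything
  has_all_keep <- Reduce(`&`, lapply(keep, has), rep(TRUE, length(fnames)))
  # Start from all FALSE, so an empty `drop` drops nothing
  has_any_drop <- Reduce(`|`, lapply(drop, has), rep(FALSE, length(fnames)))
  fnames[has_all_keep & !has_any_drop]
}

filter_names(filenames, c("foo", "bar"), c("cpp", "quux"))
# [1] "foo_bar"
```

With an empty `drop`, for example `filter_names(filenames, c("foo", "bar"), character(0))`, this returns all four names that contain both strings.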