
Remove words using purrr from a data frame in R

I have seen examples that remove words from a data frame using gsub with sapply.

Is there a solution that uses map from the purrr library?

library(purrr)
library(tibble)  # tibble() comes from the tibble package, not purrr
ID<-c(1,2)
Text_W<-c("I love vegetables, and some fruits","Can we meet tomorrow for a movies, and other fun activities") 
new_tab<-tibble(ID,Text_W)
remove_words<-c("love", "and")

I tried these, without success:

#gsub from base
map_chr(new_tab$Text_W,~paste(gsub(remove_words,"")))


library(stringr)
#
map(new_tab$Text_W,~paste(str_replace_all(remove_words,"")))

Any help would be greatly appreciated.

You don't need map here. Just try:

new_tab %>% 
  mutate(Text_New=str_replace_all(Text_W, paste(remove_words,collapse = "|"),""))
# A tibble: 2 x 3
     ID Text_W                                                      Text_New                                                
  <dbl> <chr>                                                       <chr>                                                   
1    1. I love vegetables, and some fruits                          I  vegetables,  some fruits                             
2    2. Can we meet tomorrow for a movies, and other fun activities Can we meet tomorrow for a movies,  other fun activities

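Note that I collapsed remove_words into a single pattern joined by the regex OR operator (|), using paste(remove_words, collapse = "|").

You mainly missed the . that refers to the function argument: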
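As a possible refinement of the collapsed pattern, the sketch below wraps the alternatives in word boundaries so that "and" is not also stripped out of a word like "sandwich", and then tidies the double spaces left behind. The `pattern` and `cleaned` names and the use of `str_squish` are illustrative additions, not part of the original answer.

```r
library(stringr)

Text_W <- c("I love vegetables, and some fruits",
            "Can we meet tomorrow for a movies, and other fun activities")
remove_words <- c("love", "and")

# Build \b(?:love|and)\b so that only whole words match
pattern <- paste0("\\b(?:", paste(remove_words, collapse = "|"), ")\\b")

# Remove the words, then collapse the leftover runs of whitespace
cleaned <- str_squish(str_replace_all(Text_W, pattern, ""))
cleaned

# "and" inside another word is left alone
untouched <- str_replace_all("sandwich", pattern, "")
```

Without the word boundaries, the plain "love|and" pattern would turn "sandwich" into "swich", which may or may not be what you want.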

> map_chr(new_tab$Text_W, ~gsub("love|and", "", .))
[1] "I  vegetables,  some fruits"                             
[2] "Can we meet tomorrow for a movies,  other fun activities"

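Also note that the pattern is gsub("love|and", ...) rather than gsub(c("love","and"), ...): gsub expects a single pattern, and if it is given a vector it uses only the first element.

Edit

If you want to use the vector of words to remove, rather than typing love|and by hand, do: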

map_chr(new_tab$Text_W, ~gsub(paste(remove_words, collapse="|"), "", .))
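If the goal is specifically a purrr solution that works from the word vector, one alternative sketch is to fold over remove_words with `reduce`, stripping one word per step. This is a different technique from the `map_chr` call above; the `cleaned` name and the choice of `fixed = TRUE` (treat each word literally, not as a regex) are assumptions made for illustration.

```r
library(purrr)

Text_W <- c("I love vegetables, and some fruits",
            "Can we meet tomorrow for a movies, and other fun activities")
remove_words <- c("love", "and")

# Start from Text_W and remove one word per step; gsub() is vectorised
# over its text argument, so each step handles every string at once.
cleaned <- reduce(
  remove_words,
  function(txt, word) gsub(word, "", txt, fixed = TRUE),
  .init = Text_W
)
cleaned
```

Because each word is matched literally, this variant stays safe if a word to remove contains regex metacharacters such as "." or "(".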

I would do it in one of the following ways, without using purrr for this:

library(purrr)
library(dplyr)
library(stringr)

ID<-c(1,2)
Text_W<-c("I love vegetables, and some fruits","Can we meet tomorrow for a movies, and other fun activities") 
new_tab<-tibble(ID,Text_W)
remove_words<-c("love", "and")

# This is basic, if you are only doing it for one column, see Jimbou's note on collapse
new_tab %>% 
  mutate(Text_W = str_replace_all(Text_W, paste(remove_words,collapse = "|"),""))

# This is more scalable, as you can put other columns in the `vars()` method
new_tab %>% 
  mutate_at(vars(Text_W), str_replace_all, paste(remove_words, collapse = "|"), "")

# This is also scalable, but uses base R in case you don't want to load stringr.
# gsub (not sub) is needed so that every match is removed, not just the first.
new_tab %>% 
  mutate_at(vars(Text_W), gsub, 
            pattern = paste(remove_words, collapse = "|"), 
            replacement = "")
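In dplyr 1.0.0 and later, `mutate_at` is superseded by `across`. A rough equivalent of the scalable versions above might look like the sketch below; it assumes a recent dplyr, and the `pattern` and `result` names are illustrative.

```r
library(dplyr)
library(stringr)

new_tab <- tibble(
  ID = c(1, 2),
  Text_W = c("I love vegetables, and some fruits",
             "Can we meet tomorrow for a movies, and other fun activities")
)
remove_words <- c("love", "and")
pattern <- paste(remove_words, collapse = "|")

# across() selects the columns to transform; add more names or use a
# selector such as where(is.character) to cover several columns.
result <- new_tab %>%
  mutate(across(Text_W, ~ str_replace_all(.x, pattern, "")))
result
```

Swapping `Text_W` for `where(is.character)` would apply the same replacement to every character column.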

