How do I delete rows from a data frame that contain certain words in R?

Question

I am trying to delete the rows of a data frame that contain a certain word or sequence of words. For example:

library(xlsx)  # assumed: read.xlsx() with a header argument comes from the xlsx package
mydf <- as.data.frame(read.xlsx("C:\\data.xlsx", 1, header = TRUE))
head(mydf)
#     NO    ARTICLE    
# 1   34    New York Times reports blabla
# 2   42    Financial Times reports blabla
# 3   21    Greenwire reports blabla
# 4    3    New York Times reports blabla
# 5   46    Newswire reports blabla

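I want to remove the rows that contain the strings "New York Times" and "Newswire" from my data.frame. I have tried different approaches with %in% and grep, but I can't quite figure out how to use them!

How can I do this?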

Answer 1

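As per my comment, use grepl, which returns one logical value per element of a vector, TRUE where the specified string is found. In your case, something like: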

mydf[!grepl('New York Times', mydf$ARTICLE), ]

should do the trick.
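To make the intermediate step visible, here is a minimal sketch that rebuilds the five rows shown in the question's head() output (the data frame is typed in by hand here rather than read from the Excel file, which is an assumption for illustration) and inspects the logical vector before using it as a row filter. The names hits and kept are hypothetical.

```r
# Sample data retyped from the head() output in the question (illustrative only)
mydf <- data.frame(
  NO = c(34, 42, 21, 3, 46),
  ARTICLE = c("New York Times reports blabla",
              "Financial Times reports blabla",
              "Greenwire reports blabla",
              "New York Times reports blabla",
              "Newswire reports blabla"),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per element of ARTICLE;
# fixed = TRUE matches the phrase literally instead of as a regular expression
hits <- grepl("New York Times", mydf$ARTICLE, fixed = TRUE)

# Negate the vector to keep only the rows that do NOT contain the phrase
kept <- mydf[!hits, ]
```

Column names are case-sensitive in R, so the filter has to refer to ARTICLE exactly as it appears in the data frame.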

Answer 2

# Sample Data
NO <- c(34, 42, 21, 3)
ARTICLE <- c('New York Times reports blah blah fake news',
             'Financial Times blah blah',
             'Fox News has been very nice to me',
             'Newswire reports blah blah')
df <- data.frame(NO, ARTICLE)

# Create List of Exclusion Phrases
fakenews <- c('New York Times', 'Newswire')

# Exclude
very.nice.to.me <- df[ !grepl(paste(fakenews, collapse="|"), df$ARTICLE),]
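Collapsing the phrases with "|" builds a regular expression, so a phrase that itself contains characters such as "." or "|" would be interpreted as a pattern. As a hedged alternative, the sketch below matches each phrase literally and combines the results with a logical OR. The helper drop_matching_rows and the sample data frame articles are hypothetical names introduced for illustration, not part of the answer above.

```r
# Hypothetical helper: drop rows whose text column contains any of the phrases.
# Phrases are matched literally (fixed = TRUE), so "." or "|" inside a phrase
# is not treated as regular-expression syntax.
drop_matching_rows <- function(data, column, phrases) {
  # One logical vector per phrase, each as long as the column
  hits <- lapply(phrases, function(p) grepl(p, data[[column]], fixed = TRUE))
  # Combine with OR: a row is flagged if it contains at least one phrase
  flagged <- Reduce(`|`, hits, rep(FALSE, nrow(data)))
  data[!flagged, , drop = FALSE]
}

# Illustrative data, not taken from the original question
articles <- data.frame(
  NO = c(34, 42, 21, 3),
  ARTICLE = c("New York Times reports blah blah",
              "Financial Times blah blah",
              "U.S. News reports blah blah",
              "Newswire reports blah blah"),
  stringsAsFactors = FALSE
)

result <- drop_matching_rows(articles, "ARTICLE", c("New York Times", "Newswire"))
```

With this sample data, result keeps the "Financial Times" and "U.S. News" rows. For plain phrases without special characters, the paste(..., collapse="|") approach gives the same rows and is shorter.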

Answer 1
3 2014-03-02 17:23:36

Answer 2
0 2017-07-25 18:57:42
