
Removing all rows that do not meet criteria in R?

I can't seem to find a solution here that matches my scenario. Here is one column from my sample dataset:

How_do_you_feel

Excited, Hopeful, Prepared, good    
Unsure, confused, anxious, curious  
Co operations, Teamwork, communication, critical thinking   
a   
First, team work, nervous, curious  
Interesting. New. Exciting. Develop 
perplexed,anxious,embarrassed,bit excited 
Novel, Unknown, Challenging, Useful 
Worried, excited, self-doubt, motivated 
Excited,curious,nervous,worried

The correct format is 4 words separated by commas, e.g. "Excited, Hopeful, Prepared, good".

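How can I clean my data to remove all rows with the wrong format, such as "Interesting. New. Exciting. Develop" or "perplexed,anxious,embarrassed,bit excited"?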

So that the result looks like this:

How_do_you_feel

Excited, Hopeful, Prepared, good    
Unsure, confused, anxious, curious  
Co operations, Teamwork, communication, critical thinking 
First, team work, nervous, curious
Novel, Unknown, Challenging, Useful 
Worried, excited, self-doubt, motivated 

Thanks!

Here is one potential solution:

library(tidyverse)

lines <- c("Excited, Hopeful, Prepared, good",
"Unsure, confused, anxious, curious",
"Co operations, Teamwork, communication, critical thinking",
"a",
"First, team work, nervous, curious",
"Interesting. New. Exciting. Develop",
"perplexed,anxious,embarrassed,bit excited",
"Novel, Unknown, Challenging, Useful",
"Worried, excited, self-doubt, motivated",
"Excited,curious,nervous,worried")

df <- data.frame(How_do_you_feel = lines)
df
#>                                              How_do_you_feel
#> 1                           Excited, Hopeful, Prepared, good
#> 2                         Unsure, confused, anxious, curious
#> 3  Co operations, Teamwork, communication, critical thinking
#> 4                                                          a
#> 5                         First, team work, nervous, curious
#> 6                        Interesting. New. Exciting. Develop
#> 7            perplexed,anxious,embarrassed,bit excited
#> 8                        Novel, Unknown, Challenging, Useful
#> 9                    Worried, excited, self-doubt, motivated
#> 10                           Excited,curious,nervous,worried

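# Extract the part of each row that looks like four ", "-separated items;
# rows without such a match become NA and are then dropped.
df %>%
  mutate(How_do_you_feel = str_extract(
    How_do_you_feel,
    "[[:alpha:][:punct:] ]+, [[:alpha:][:punct:] ]+, [[:alpha:][:punct:] ]+, [[:alpha:][:punct:] ]+"
    )) %>%
  filter(!is.na(How_do_you_feel))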
#>                                             How_do_you_feel
#> 1                          Excited, Hopeful, Prepared, good
#> 2                        Unsure, confused, anxious, curious
#> 3 Co operations, Teamwork, communication, critical thinking
#> 4                        First, team work, nervous, curious
#> 5                       Novel, Unknown, Challenging, Useful
#> 6                   Worried, excited, self-doubt, motivated

Created on 2022-07-22 by the reprex package (v2.0.1)
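
One caveat with the pattern above: it is not anchored, and `[:punct:]` also matches commas, so a row with more than four items would pass as well. The base R sketch below illustrates this with a made-up five-item string (`five_items` is hypothetical, not from the question) and compares it with a stricter, anchored pattern. The stricter pattern assumes items contain only letters, spaces, and hyphens, which is read off the sample data rather than stated in the question.

```r
# Pattern from the answer above, rebuilt with paste() for readability
loose <- paste(rep("[[:alpha:][:punct:] ]+", 4), collapse = ", ")

# Assumed stricter variant: anchored, and items may contain only
# letters, spaces, and hyphens (no commas or other punctuation)
strict <- paste0("^", paste(rep("[[:alpha:] -]+", 4), collapse = ", "), "$")

# Hypothetical malformed row with five items
five_items <- "happy, sad, calm, tired, bored"

grepl(loose, five_items)   # the loose pattern accepts the row
grepl(strict, five_items)  # the strict pattern rejects it
grepl(strict, "Excited, Hopeful, Prepared, good")  # a valid row still passes
```

Whether the stricter rule is appropriate depends on what the real data contains; items with apostrophes or digits, for example, would be rejected by it.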

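A general rule that seems to work for your case is that three commas, each followed by a space (and not just a bare comma, as in the previous answer), indicate a good match. Try this: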

library(tidyverse)

read_delim("How_do_you_feel

Excited, Hopeful, Prepared, good    
Unsure, confused, anxious, curious  
Co operations, Teamwork, communication, critical thinking   
a   
First, team work, nervous, curious
Interesting. New. Exciting. Develop 
perplexed,anxious,embarrassed,bit excited 
Novel, Unknown, Challenging, Useful 
Worried, excited, self-doubt, motivated 
Excited,curious,nervous,worried", delim = "\\n") %>%
  mutate(How_do_you_feel = str_trim(How_do_you_feel)) %>%
  # Keep only rows that consist of exactly four items separated by ", "
  filter(str_detect(
    How_do_you_feel,
    paste0("^", paste(rep("[[:alpha:]- ]+", times = 4), collapse = ", "), "$")
  ))

#   How_do_you_feel
#   <chr>
# 1 Excited, Hopeful, Prepared, good
# 2 Unsure, confused, anxious, curious
# 3 Co operations, Teamwork, communication, critical thinking
# 4 First, team work, nervous, curious
# 5 Novel, Unknown, Challenging, Useful
# 6 Worried, excited, self-doubt, motivated
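
The rule described in this answer (three occurrences of a comma followed by a space) can also be checked without a regular expression, by splitting each row on ", " and counting the pieces. The following is a minimal base R sketch of that idea, not the answer author's method; the helper name `has_four_items` is made up for illustration. Unlike the regex versions, it does not restrict which characters an item may contain.

```r
# Sample data from the question
feelings <- c("Excited, Hopeful, Prepared, good",
              "Unsure, confused, anxious, curious",
              "Co operations, Teamwork, communication, critical thinking",
              "a",
              "First, team work, nervous, curious",
              "Interesting. New. Exciting. Develop",
              "perplexed,anxious,embarrassed,bit excited",
              "Novel, Unknown, Challenging, Useful",
              "Worried, excited, self-doubt, motivated",
              "Excited,curious,nervous,worried")
df <- data.frame(How_do_you_feel = feelings)

# Hypothetical helper: TRUE when a row splits into exactly four
# non-empty pieces on ", " and no stray comma is left inside a piece
has_four_items <- function(x) {
  parts <- strsplit(trimws(x), ", ", fixed = TRUE)
  vapply(parts, function(p) {
    length(p) == 4 && all(nzchar(p)) && !any(grepl(",", p, fixed = TRUE))
  }, logical(1))
}

# Keep only the well-formed rows
clean <- df[has_four_items(df$How_do_you_feel), , drop = FALSE]
clean
```

On the sample data this keeps the same six rows as the answers above. `drop = FALSE` keeps the result a data frame even though it has a single column.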

