
How to remove useless characters in R?

I have a dataset like the one shown below. How can I remove the "#number" parts?

df>
terms                             year
5;#Remote Production;#10;         2021
53;#=Product-Category:Routing     2021
30;#HDR;#5;#Remote Production     2020
...

I need it to look like this:

df>
terms                          year
#Remote Production             2021
#Product-Category:Routing      2021
#HDR;#Remote Production        2020
...

The numbers at the start, which have no leading #, also need to be removed.

One option is str_remove_all from stringr:

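library(stringr)
library(dplyr)

df %>%
   # "^\\d+;#\\=?" drops the leading "number;#" (plus an optional "=");
   # "#\\d+;" drops every "#number;" token;
   # str_c() then puts a single "#" back at the front
   mutate(terms = str_c('#', str_remove_all(terms, "^\\d+;#\\=?|#\\d+;")))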

-output

#                     terms year
#1       #Remote Production; 2021
#2 #Product-Category:Routing 2021
#3   #HDR;#Remote Production 2020
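
Note that the first row of this output still ends in ";", whereas the requested result has "#Remote Production" without it. A minimal sketch of one way to close that gap, assuming a single trailing semicolon is never wanted: run the same pattern, then strip a ";" at the end of the string before re-adding the "#". The name `cleaned` is only illustrative.

```r
library(stringr)
library(dplyr)

# Same example data as in the question
df <- data.frame(
  terms = c("5;#Remote Production;#10;",
            "53;#=Product-Category:Routing",
            "30;#HDR;#5;#Remote Production"),
  year = c(2021L, 2021L, 2020L)
)

cleaned <- df %>%
  mutate(
    # remove the leading "number;#" (optionally followed by "=") and every "#number;"
    terms = str_remove_all(terms, "^\\d+;#=?|#\\d+;"),
    # assumption: a trailing ";" left behind is not wanted
    terms = str_remove(terms, ";$"),
    # restore one leading "#"
    terms = str_c("#", terms)
  )

print(cleaned)
```

The three steps are kept as separate mutate() assignments so each one can be checked or dropped on its own.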

data

df <- structure(list(terms = c("5;#Remote Production;#10;", "53;#=Product-Category:Routing", 
"30;#HDR;#5;#Remote Production"), year = c(2021L, 2021L, 2020L
)), class = "data.frame", row.names = c(NA, -3L))
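
If loading stringr and dplyr is not desirable, roughly the same cleanup can be sketched in base R with gsub() and sub(). This is an untested-on-real-data sketch that assumes the terms always follow the shapes shown in the question; it also strips a trailing ";" so the first row matches the requested result. The names `df2` and `out` are illustrative.

```r
# Rebuild the example data so the snippet runs on its own
df2 <- data.frame(
  terms = c("5;#Remote Production;#10;",
            "53;#=Product-Category:Routing",
            "30;#HDR;#5;#Remote Production"),
  year = c(2021L, 2021L, 2020L)
)

# Drop the leading "number;#" (optionally followed by "=") and every "#number;"
out <- gsub("^[0-9]+;#=?|#[0-9]+;", "", df2$terms)

# Assumption: a trailing ";" is not wanted either
out <- sub(";$", "", out)

# Put a single "#" back at the front
df2$terms <- paste0("#", out)

print(df2)
```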

