
Remove all words that start with "@" from a string

How can I remove all words that start with "@" from a string?

For example, "@AgnezMo On @AirAsia Airbus A320-216 Fleet with @NinetologyMY Livery -- 9M-AHG cc: @AgnesMonicaEnt @agnezone http://t.co/hfXwUQq2Oq "

I would like the resulting string to be "On Airbus A320-216 Fleet with Livery -- 9M-AHG cc: http://t.co/hfXwUQq2Oq "

Try this, where s is the input:

gsub("@\\w+ *", "", s)

giving:

"On Airbus A320-216 Fleet with Livery -- 9M-AHG cc: http://t.co/hfXwUQq2Oq"

You can use regular expressions in R via the sub function. Note that sub replaces only the first match; use gsub to replace every match.

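The regular expression to match those would be: @\\w+\\s+ . Because it requires at least one whitespace character after the word, a mention at the very end of the string is not matched.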
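To illustrate how sub, gsub, and the trailing \\s+ behave, here is a minimal sketch on a shortened version of the example string. The variable names (s, first_only, all_ws, all_opt) are illustrative and not part of the original answer.

```r
# Shortened example input; the last mention has no whitespace after it.
s <- "@AgnezMo On @AirAsia Airbus cc: @agnezone"

# sub() replaces only the first match.
first_only <- sub("@\\w+\\s+", "", s)
print(first_only)  # "On @AirAsia Airbus cc: @agnezone"

# gsub() replaces every match, but \\s+ needs trailing whitespace,
# so the mention at the end of the string is left in place.
all_ws <- gsub("@\\w+\\s+", "", s)
print(all_ws)      # "On Airbus cc: @agnezone"

# Making the whitespace optional (\\s*) also removes the final mention.
all_opt <- gsub("@\\w+\\s*", "", s)
print(all_opt)     # "On Airbus cc: "
```

If mentions can appear at the end of the input, the \\s* variant is probably the safer choice.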

You can do it like this:

xx <-  "@AgnezMo On @AirAsia Airbus A320-216 Fleet with @NinetologyMY Livery -- 9M-AHG cc: @AgnesMonicaEnt @agnezone http://t.co/hfXwUQq2Oq"
gsub("@([a-zA-Z0-9]|[_])*", "", xx)

## [1] " On  Airbus A320-216 Fleet with  Livery -- 9M-AHG cc:   http://t.co/hfXwUQq2Oq"
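The result above still contains the spaces that surrounded each removed word. One possible way to tidy this up, sketched here on a shortened input, is to collapse runs of whitespace and trim the ends. The variable names stripped and cleaned are assumptions made for this illustration.

```r
# Shortened version of the example input.
xx <- "@AgnezMo On @AirAsia Airbus A320-216 Fleet with @NinetologyMY Livery"

# Remove the @-words only; surrounding spaces are left behind.
stripped <- gsub("@[A-Za-z0-9_]*", "", xx)
print(stripped)  # " On  Airbus A320-216 Fleet with  Livery"

# Collapse each run of whitespace into one space, then trim both ends.
cleaned <- trimws(gsub("\\s+", " ", stripped))
print(cleaned)   # "On Airbus A320-216 Fleet with Livery"
```

Note that collapsing whitespace also changes any intentional double spaces elsewhere in the text, so this step only fits when exact spacing does not matter.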

Assuming str is the string,

> gsub("@[A-Za-z]+ ", "", str)
# [1] "On Airbus A320-216 Fleet with Livery -- 9M-AHG cc: http://t.co/hfXwUQq2Oq"

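I would use character classes with the str_replace_all function from the stringr package. Note that the pattern below strips every punctuation character (including the "@" itself) rather than removing whole words that start with "@":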

usercomment <- c("@AgnezMo On @AirAsia Airbus A320-216 Fleet with @NinetologyMY Livery -- 9M-AHG cc: @AgnesMonicaEnt @agnezone")

library(stringr)
test <- str_replace_all(usercomment,"[:punct:]","")
test

You can also combine character classes with the alternation operator (|), so punctuation and whitespace can be removed in a single call. For example, the following cleans up the column names of a data frame:

> colnames(order_table) <- str_replace_all(colnames(order_table),"[:punct:]|[:space:]","")
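For readers without stringr, roughly the same cleanup can be sketched in base R. In base R the POSIX classes must sit inside a bracket expression, so the pattern is written "[[:punct:]]|[[:space:]]". The column names below are hypothetical and stand in for colnames(order_table).

```r
# Hypothetical column names, used only for illustration.
cols <- c("order id", "unit-price ($)", "ship.date")

# Remove every punctuation or whitespace character in one call.
clean_cols <- gsub("[[:punct:]]|[[:space:]]", "", cols)
print(clean_cols)  # "orderid" "unitprice" "shipdate"
```

The two classes could equally be merged into a single bracket expression, "[[:punct:][:space:]]", which matches the same characters.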
