
The difference between `\\s|*` and `\\s|[*]` in regular expressions in R?

What is the difference between \\s|* and \\s|[*] in a regular expression in R?

> gsub('\\s|*','','Aug 2013*')
[1] "Aug2013*"
> gsub('\\s|[*]','','Aug 2013*')
[1] "Aug2013"

What is the function of [ ] here?

The first expression is invalid as written, because * is a special character (a quantifier) in a regular expression. If you want sub or gsub to treat special characters literally, set the fixed = TRUE parameter.

This takes the pattern string exactly as it is and ignores the special meaning of any characters in it.

See Pattern Matching and Replacement in the R documentation.

x <- 'Aug 2013****'
gsub('*', '', x, fixed=TRUE)
#[1] "Aug 2013"
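One consequence worth spelling out: with fixed = TRUE the whole pattern is literal, so \\s and | lose their meaning as well. A minimal sketch, assuming you want to strip both spaces and asterisks using only fixed matching (the variable names are illustrative):

```r
x <- 'Aug 2013****'

# With fixed = TRUE, '\\s' would mean a backslash followed by "s",
# not whitespace, so each literal is removed in its own call.
no_stars <- gsub('*', '', x, fixed = TRUE)
no_stars_or_spaces <- gsub(' ', '', no_stars, fixed = TRUE)

print(no_stars_or_spaces)
```

If you need the whitespace class and a literal asterisk in a single pattern, fixed = TRUE is not the right tool and the asterisk has to be escaped instead.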

Your second expression just uses a character class [] around * to avoid escaping it, the same as:

x <- 'Aug 2013*'
gsub('\\s|\\*', '', x)
#[1] "Aug2013"
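To make the equivalence concrete, here is a small sketch comparing the bracketed form, the backslash-escaped form, and a single bracket expression that holds both the whitespace class and the asterisk. The last variant is my own addition, not something from the question:

```r
x <- 'Aug 2013*'

bracketed <- gsub('\\s|[*]', '', x)       # * made literal by a character class
escaped   <- gsub('\\s|\\*', '', x)       # * made literal by a backslash
one_class <- gsub('[[:space:]*]', '', x)  # whitespace and * in one bracket expression

print(c(bracketed, escaped, one_class))
```

All three patterns remove the same characters from this input; which one to use is mostly a matter of readability.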

As for the explanation of your first expression, \\s|*:

\s      whitespace (\n, \r, \t, \f, and " ")
|       OR
*       quantifier (0 or more times), with nothing before it to repeat

And the second expression, \\s|[*]:

\s      whitespace (\n, \r, \t, \f, and " ")
|       OR
[*]     any character of: '*'

The only purpose of [] here is to turn the * into a literal asterisk.
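The same trick works for other metacharacters. A short sketch with . and +, using a made-up input string:

```r
y <- 'a.b+c'

all_gone <- gsub('.', '', y)    # unescaped . matches any character
no_dot   <- gsub('[.]', '', y)  # inside [] the . is a literal dot
no_plus  <- gsub('[+]', '', y)  # inside [] the + is a literal plus

print(c(all_gone, no_dot, no_plus))
```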

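The first regex is invalid: * is a special character meaning "zero or more", and here it has nothing to repeat. R's default engine does not raise an error for it, but as the output in the question shows, the * is not treated as a literal asterisk.

The second regex is equivalent to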

'\\s|\\*'
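One way to see the problem with the first pattern is to hand it to the PCRE engine with perl = TRUE, which is stricter about a quantifier that has nothing to repeat. This is a sketch under the assumption that your R build's PCRE rejects the pattern at compile time, which is what I would expect; the tryCatch wrapper is only there to capture the condition instead of stopping the script:

```r
# Try the malformed pattern under PCRE and capture any error it raises.
res <- tryCatch(
  suppressWarnings(gsub('\\s|*', '', 'Aug 2013*', perl = TRUE)),
  error = function(e) e
)
rejected <- inherits(res, "error")
print(rejected)

# The escaped form compiles under PCRE as well.
ok <- gsub('\\s|\\*', '', 'Aug 2013*', perl = TRUE)
print(ok)
```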

