Using grep to filter rows with two or more patterns in the string in R
I need to index all the rows that have a string beginning with either "B-" or "B^" in one of the columns. I tried a bunch of combinations, but I suspect they fail because "-" and "^" also have a special meaning in the regular expressions that grep uses.
dataset[grep('^(B-|B^)[^B-|B^]*$', dataset$Col1),]
With the above script, rows beginning with "B^" are not being extracted. Please suggest a smart way to handle this.
You can escape these characters with \\ in the grep pattern:
dataset[grep('^(B\\-|B\\^)[^B\\-|B\\^]*$', dataset$Col1),]
For further explanation, ^ is an anchor that matches the beginning of a string, so you have to escape it when you mean a literal caret in the middle of the pattern. The [] delimit a character class, so [^B-|B^]* matches any run of characters that are not B, -, | or ^ (strictly, B-| inside the brackets is read as a range from B to |, so the class excludes even more than that). That part of the pattern is unnecessary here.
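To make the effect of the escape visible, here is a minimal sketch comparing the unescaped and escaped patterns on two made-up strings (the vector x is hypothetical, not from the question). It also shows fixed = TRUE, which treats the whole pattern as a literal string and so needs no escaping, though it cannot anchor to the start of the string.

```r
# Hypothetical test strings, one for each prefix from the question.
x <- c("B-101", "B^202")

# Unescaped: the second "^" is treated as an anchor, not a literal caret.
# The question reports that "B^" rows are not matched by this form.
unescaped <- grepl("^(B-|B^)", x)

# Escaped: "\\^" in the R string becomes "\^" in the regex, a literal caret.
escaped <- grepl("^(B-|B\\^)", x)

# fixed = TRUE disables regex syntax entirely, so "B^" is matched literally
# (anywhere in the string, since anchors are unavailable in this mode).
literal <- grepl("B^", x, fixed = TRUE)

print(unescaped)
print(escaped)
print(literal)
```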
The simplified regex is:
dataset[grep('^(B-|B\\^)', dataset$Col1),]
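As a rough end-to-end check, the sketch below applies the simplified pattern to a small made-up data frame. Only the column name Col1 comes from the question; the values and the value column are assumptions. It also shows two alternatives that are not part of the original answer: a bracket form, and base R's startsWith(), which avoids regular expressions altogether.

```r
# Hypothetical sample data; only the column name Col1 comes from the question.
dataset <- data.frame(
  Col1 = c("B-101", "B^202", "A-303", "XB-404", "B505"),
  value = 1:5,
  stringsAsFactors = FALSE
)

# Alternation with the caret escaped, as in the simplified regex.
idx_alt <- grep("^(B-|B\\^)", dataset$Col1)

# Bracket form: "-" in first position and "^" in a non-first position
# are both literal inside a character class.
idx_class <- grep("^B[-^]", dataset$Col1)

# Regex-free alternative: startsWith() compares literal prefixes.
idx_plain <- which(startsWith(dataset$Col1, "B-") |
                   startsWith(dataset$Col1, "B^"))

# Keep only the rows whose Col1 begins with "B-" or "B^".
matched <- dataset[idx_alt, ]
print(matched)
```

All three index vectors select the same rows here. The startsWith() form may be the easiest to read when the prefixes are fixed strings, while the regex forms extend more naturally to longer lists of patterns.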