
Using grep to filter rows with two or more patterns in the string in R

I need to select all the rows where one of the columns holds a string beginning with either "B-" or "B^". I tried a number of combinations, but I suspect they fail because "-" and "^" also have a special meaning in grep patterns.

dataset[grep('^(B-|B^)[^B-|B^]*$', dataset$Col1),]

With the above script, rows beginning with "B^" are not being extracted. Please suggest a smart way to handle this.

You can escape the special characters in the grep pattern with \\:

dataset[grep('^(B\\-|B\\^)[^B\\-|B\\^]*$', dataset$Col1),]

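For further explanation: ^ is an anchor that matches the beginning of a string, so it has to be escaped when it is meant literally in the middle of a pattern. The [] form a character class, so [^B-|B^]* was meant to match any run of characters other than B, -, | or ^ (note that inside brackets B-| is actually read as a range from B to |, not as three separate characters). That trailing part is unnecessary here.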

The simplified regex is: dataset[grep('^(B-|B\\^)', dataset$Col1),]
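
To check the behaviour, here is a minimal sketch on made-up data. The sample values are hypothetical; only the names dataset and Col1 come from the question. It compares the escaped caret with two alternatives that may be easier to read: a character class, where "-" placed first and "^" placed anywhere but first are taken literally, and base R's startsWith(), which avoids regular expressions altogether.

```r
# Hypothetical sample data; only the names dataset and Col1 come from the question.
dataset <- data.frame(
  Col1 = c("B-123", "B^456", "A-789", "BB-1", "xB^2"),
  stringsAsFactors = FALSE
)

# Option 1: escape the caret so it is a literal character, not an anchor.
hits_escaped <- dataset[grepl("^(B-|B\\^)", dataset$Col1), , drop = FALSE]

# Option 2: character class. "-" is literal when it comes first,
# and "^" is literal when it is not the first character in the class.
hits_class <- dataset[grepl("^B[-^]", dataset$Col1), , drop = FALSE]

# Option 3: no regex at all (startsWith is in base R from version 3.3.0).
keep <- startsWith(dataset$Col1, "B-") | startsWith(dataset$Col1, "B^")
hits_prefix <- dataset[keep, , drop = FALSE]

print(hits_escaped)
```

All three variants should keep only the rows whose Col1 starts with "B-" or "B^". Using grepl() rather than grep() returns a logical vector, which stays safe for subsetting even when nothing matches.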


 