Finding a string pattern using grep

I'm trying to find a certain sequence in the text of several .txt files. I am looking for a string that is joined to a 4-digit number, e.g. Watson1990. I tested the regex using an online tester and it appeared to work; however, the expression (and variations of it) produced no output on my files.

My regular expression is as follows:

egrep '\w*\d{4}' *.txt

However it does not produce any output. Can you tell me what is wrong with this? I'm using OSX (Snow Leopard).

Thanks.

Your regular expression doesn't work because, in extended regular expression syntax, the token \d matches the letter d, not a digit. Use the character class [0-9] instead.

Also, \w matches digits and the underscore as well as letters, so you probably don't want to use it here. Use the character class [A-Za-z] to match letters in the ranges A-Z and a-z.

I changed the * to a + because presumably you want at least one letter before the number. The + means "one or more", whereas * means "zero or more".
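The difference between * and + can be seen with a small sketch. The two sample lines are made up for illustration, and the patterns use the [A-Za-z] and [0-9] classes suggested above:

```shell
# Hypothetical sample input: one name joined to a year, one bare year.
sample=$(printf '%s\n' 'Watson1990' '1990')

# "*" allows zero letters, so the bare number also matches.
star=$(printf '%s\n' "$sample" | grep -E '[a-zA-Z]*[0-9]{4}')

# "+" requires at least one letter directly before the digits.
plus=$(printf '%s\n' "$sample" | grep -E '[a-zA-Z]+[0-9]{4}')

printf 'with *:\n%s\n\nwith +:\n%s\n' "$star" "$plus"
```

With *, both lines are printed; with +, only the Watson1990 line is.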

Finally, you may wish to consider what should happen if you see a 5-digit number. Your regular expression currently accepts it because a 5-digit number starts with a 4-digit number.

In conclusion, try this:

egrep '[a-zA-Z]+[0-9]{4}' *.txt
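If 5-digit numbers should be rejected, one possible approach (a sketch, not the only way) is to require that the four digits are followed by a non-digit or the end of the line. The file name and its contents below are invented for the example:

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'see Watson1990 for details' \
  'id Smith19905 is too long' \
  'no match here 1990' > "$tmpdir/sample.txt"

# Pattern from above: also accepts the line containing Smith19905.
loose=$(grep -E '[a-zA-Z]+[0-9]{4}' "$tmpdir/sample.txt")

# Stricter variant: the 4 digits must be followed by a non-digit or end of line.
strict=$(grep -E '[a-zA-Z]+[0-9]{4}([^0-9]|$)' "$tmpdir/sample.txt")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -rf "$tmpdir"
```

The loose pattern prints the first two lines, while the strict one prints only the Watson1990 line.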

Your regular expression uses Perl, not extended, regex components. Try

grep -P '\w\d{4}' *.txt

if your version of grep has that option. I'm using GNU grep 2.5.1 and the -P option is listed as "highly experimental".

With GNU grep, -P enables Perl-style syntax and -o prints only the matched text:

grep -Po "(\w+\d{4})" file
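Where -P is not available, a similar extraction can be sketched with -E and -o. This assumes a grep that supports -o (GNU and BSD grep both do, though it is not required by POSIX), and the input line is made up for illustration:

```shell
# Hypothetical input line containing two name+year tokens.
line='see Watson1990 and Crick1953 for details'

# -o prints each match on its own line instead of the whole input line.
found=$(printf '%s\n' "$line" | grep -Eo '[A-Za-z]+[0-9]{4}')

printf '%s\n' "$found"
```

This prints Watson1990 and Crick1953 on separate lines.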
