
Bash shell(grep) equivalent of this python regular expression?

I have written a regular expression to match hyphenated words in Python:

regexp = r"[a-z]+(?:-[a-z]+)*"

It matches words with zero or more hyphens, e.g. abc, acd-def, xyy. However, I can't find the grouping operator ?: for the shell (for instance, for use with grep). It seems to me that this is a feature of Python regex only, not of standard regex.

Can anyone please tell me how to write the same regex in shell?

(?:pattern) matches pattern without capturing the contents of the match. Here it is followed by * so that you can specify zero or more repetitions of the contents of the ( ) without creating a capture group. This affects the result in Python if you use something like re.search(), as the match object would not contain a group for the part inside (?: ). In grep, the result isn't returned in the same way, so you can just remove the ?: and use a normal group:

grep -E '[a-z]+(-[a-z]+)*' file

Here I'm using the -E switch to enable extended regular expression support. This will output each line matching the pattern - you can add the -o switch to only print the matching parts.
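To see what the ?: actually changes on the Python side, here is a minimal sketch. The sample string is made up for illustration; both patterns match the same text and differ only in what ends up in the match object's groups.

```python
import re

word = "acd-def"  # illustrative input, not from the question

# Non-capturing group: the match object has no numbered groups.
non_capturing = re.search(r"[a-z]+(?:-[a-z]+)*", word)

# Plain (capturing) group: the last repetition is stored as group 1.
capturing = re.search(r"[a-z]+(-[a-z]+)*", word)

print(non_capturing.group(0))  # acd-def
print(non_capturing.groups())  # ()
print(capturing.group(0))      # acd-def
print(capturing.groups())      # ('-def',)
```

The overall match (group 0) is identical in both cases, which is why dropping the ?: makes no visible difference to what grep prints.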

As mentioned in the comments (thanks), it is possible to use back-references (like \1) with grep to refer to previous capture groups inside the pattern, so technically removing the ?: changes the behaviour slightly. Since you aren't using back-references at the moment, it doesn't really matter.
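The back-reference point can be illustrated in Python, where the same distinction applies. This is only a sketch: the pattern ([a-z]+)-\1 is a hypothetical example, not something from the question. A back-reference needs a capturing group to point at, so it stops working once the group is made non-capturing.

```python
import re

# A capturing group can be referenced later in the pattern as \1.
repeated = re.compile(r"([a-z]+)-\1")
print(bool(repeated.search("abc-abc")))  # True
print(bool(repeated.search("abc-def")))  # False

# With a non-capturing group there is no group 1 to refer to,
# so Python rejects the pattern when compiling it.
try:
    re.compile(r"(?:[a-z]+)-\1")
    compiled = True
except re.error:
    compiled = False
print(compiled)  # False
```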

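Your regular expression doesn't "match hyphenated word": it matches words made up of [-a-z] where the first and last character must be in [a-z]. I.e. it matches [a-z] (one-letter words) or [a-z][-a-z]*[a-z]. (Strictly, your original also rejects consecutive hyphens such as a--b, which this simplified form accepts.)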
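A quick way to check this characterisation is to run both patterns against a few sample words. The sketch below uses re.fullmatch so that the whole word must match; the sample list is made up for illustration. The two patterns agree on ordinary words and on leading or trailing hyphens, and differ only on consecutive hyphens.

```python
import re

original = re.compile(r"[a-z]+(?:-[a-z]+)*")
rewritten = re.compile(r"[a-z]|[a-z][-a-z]*[a-z]")

samples = ["a", "abc", "acd-def", "-abc", "abc-", "a--b"]

# Map each word to (matches original, matches rewritten).
results = {
    word: (bool(original.fullmatch(word)), bool(rewritten.fullmatch(word)))
    for word in samples
}

for word, (orig_ok, rewritten_ok) in results.items():
    print(f"{word!r}: original={orig_ok} rewritten={rewritten_ok}")
```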

Your question is ambiguous: bash normally deals with wildcard expressions; grep can process regular expressions.

  • Bash

    This cannot be done with wildcards. You may use the =~ operator inside [[ ]] brackets: [[ $string =~ [a-z]|[a-z][-a-z]*[a-z] ]].

  • Grep

    You can combine two regexes with | (using grep -E) like so: [a-z]|[a-z][-a-z]*[a-z].

Reading between the lines of your question, "to match hyphenated word" sounds more like you want a regexp like [a-z]+(-[a-z]+)+ so that there's at least one - in your match.
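As a rough sketch of that "at least one hyphen" variant in Python, re.findall can play the role of grep -o and list only the matching parts. The sample text is made up for illustration. Note that findall returns the group contents rather than the whole match when the pattern contains a capturing group, so this is one place where keeping the ?: matters in Python.

```python
import re

text = "abc acd-def xyy well-known-fact"  # illustrative input

# Non-capturing group: findall returns the whole matches.
whole = re.findall(r"[a-z]+(?:-[a-z]+)+", text)
print(whole)  # ['acd-def', 'well-known-fact']

# Capturing group: findall returns only the last repetition of the group.
captured = re.findall(r"[a-z]+(-[a-z]+)+", text)
print(captured)  # ['-def', '-fact']
```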


 