
Regex match is not what I expect

In the matches given below, why does the first match give that output, and why is '-' not matched against the target in the second one?

Index: 01234567890123456789012345678
Target: whois who? Who me? Who else?
Pattern: (^[a-z])|(\?$)
Match: (0,0:w)(28,28:?)

Index: 01234567890
Target: \-^$.?*+()|
Pattern: [\\-^$.?*+()|]
Match: (0,0:\)(2,2:^)(3,3:$)(4,4:.)(5,5:?)(6,6:*)(7,7:+)(8,8:()(9,9:))(10,10:|)

Edit:

Thanks for asking for the code. You can find it here: http://paste.ubuntu.com/11831819/

The first one matches a lowercase letter at the start of the line/string, ^[a-z], or a question mark at the end of the line/string, \?$ . That is because

  • ^ means start of the line
  • $ means end of the line
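To see the two anchors at work, here is a minimal sketch in Python's re module. This assumes Python behaves like your engine for this pattern, which I have not checked against your code; the variable names are only for illustration.

```python
import re

target = "whois who? Who me? Who else?"
pattern = re.compile(r"(^[a-z])|(\?$)")

# Without re.MULTILINE, ^ and $ anchor to the start and end of the whole string,
# so only the first letter and the final question mark can match.
matches = [m.group(0) for m in pattern.finditer(target)]
print(matches)

# A string that starts with a capital letter and has no trailing ? gives no match.
no_match = pattern.search("Whois who")
print(no_match)
```

The inner "?" after "who" and "me" is skipped because $ only allows a match at the very end.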

In the second one, [] means match any one character from the set, and a - inside the set means "between". The pattern therefore matches characters whose ASCII value lies between \ (written \\ in the pattern, ASCII 92) and ^ (ASCII 94), or one of $.?*+()| . Since the ASCII value of - is 45, which lies outside that range, it is not matched.
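The range behaviour can be checked with a short sketch, again in Python's re module as an assumed stand-in for your engine. It also shows a side effect of the accidental range: ] (ASCII 93) sits between \ and ^, so it matches even though it is not listed.

```python
import re

target = r"\-^$.?*+()|"
pattern = re.compile(r"[\\-^$.?*+()|]")

# Every character of the target that the class accepts, in order.
matched = pattern.findall(target)

# Characters of the target that the class rejects.
missed = [ch for ch in target if not pattern.fullmatch(ch)]
print(missed)

# Code points of the range endpoints and of the hyphen.
codes = (ord("\\"), ord("-"), ord("^"))
print(codes)

# ']' falls inside the accidental range \ .. ^
range_side_effect = pattern.fullmatch("]") is not None
print(range_side_effect)
```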

To solve your problem you should quote the -

[\\\-^$.?*+()|]

or put it at the end

[\\^$.?*+()|-]
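Both variants can be tried out with a small sketch like the following, written in Python's re module on the assumption that it treats these character classes the same way as your engine.

```python
import re

target = r"\-^$.?*+()|"

escaped = re.compile(r"[\\\-^$.?*+()|]")  # hyphen escaped with a backslash
at_end = re.compile(r"[\\^$.?*+()|-]")    # hyphen moved to the end of the set

escaped_hits = escaped.findall(target)
at_end_hits = at_end.findall(target)

# With either fix, all eleven characters of the target match, including '-'.
print(escaped_hits == list(target), at_end_hits == list(target))
```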

This example uses sed in bash rather than your code, but it shows the same thing:

echo 'begin []\^$.?*+()|- end' | sed -e 's/[][\\^$.?*+()|-]/x/g'
begin xxxxxxxxxxxxx end

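All special characters have been replaced by x even though I only quoted the backslash (\\), because all the other characters are placed where they are taken literally: the ] comes first and the - comes last. If I moved the -, the [ or the ], I would have to quote them too.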
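For comparison, here is a rough Python counterpart of the sed command. It is only a sketch: it assumes Python's re module rather than sed, and it takes the opposite approach of escaping the brackets instead of relying on where they sit in the set.

```python
import re

text = r"begin []\^$.?*+()|- end"

# ] and [ are escaped, the backslash is doubled, and - stays last so it is literal.
specials = re.compile(r"[\]\[\\^$.?*+()|-]")

replaced = specials.sub("x", text)
print(replaced)
```

Escaping costs a few extra backslashes, but the order of the characters inside the set then no longer matters for the brackets.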
