
^[:blank:] does not match dot in sed

I have an input as follows:

INa.aa................... October 2010 after its previous US-based owners failed to pay debts

My goal is to put brackets around every word starting with the letter i or I. So I ran this command:

sed 's/\<i[^[:blank:]]*\>/(&)/gi' input_data

Which returned this output:

(INa.aa)................... October 2010 after (its) previous US-based owners failed to pay debts

What I don't get is why [^[:blank:]]* doesn't also include the dots after INa.aa.

Thank you for any suggestions.

You use the \> "end of word" escape. A word boundary is defined as

the character to the left is a "word" character and the character to the right is a "non-word" character, or vice-versa

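in the manual (referring to \b). In the case of \>, the "vice-versa" does not apply: it matches only where the character to the left is a word character and the character to the right is not.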

What is a "word" character?

A "word" character is any letter or digit or the underscore character.

Every other character is a "non-word" character. You expect the boundary between the last period and the following blank to match \>, but it doesn't: the period and the blank are both non-word characters. The end-of-word position that the regex finds is between the last a and the first of the trailing periods.

The period between the two a's also has a word boundary on each side, but since it is not a blank, [^[:blank:]]* consumes it. The greedy match then runs on to the last position where \> holds, so that period ends up inside the match.
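To see where \> can and cannot match, here is a small check. It assumes GNU sed, since \> is a GNU extension; the variable names are only for illustration.

```shell
line='INa.aa................... October'

# An "a", then an end-of-word position, then a period: this occurs in the line,
# so sed prints the line.
after_letter=$(printf '%s\n' "$line" | sed -n '/a\>\./p')

# A period, then an end-of-word position: this can never occur, because the
# character to the left of \> has to be a word character. Nothing is printed.
after_period=$(printf '%s\n' "$line" | sed -n '/\.\>/p')

printf 'after letter: [%s]\n' "$after_letter"
printf 'after period: [%s]\n' "$after_period"
```

The first pattern finds the line and the second finds nothing, which is why the match in the question cannot end after the run of periods.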

If you want to match everything up to the next blank, just drop the \> from your regex.
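As a sketch of the difference, the two commands below run on the sample line from the question, once with \> and once without. This again assumes GNU sed (for \<, \> and the i flag); the variable names are only for illustration.

```shell
input='INa.aa................... October 2010 after its previous US-based owners failed to pay debts'

# Original command: the match has to end at an end-of-word position.
with_end=$(printf '%s\n' "$input" | sed 's/\<i[^[:blank:]]*\>/(&)/gi')

# Without \>: the match runs up to the next blank.
without_end=$(printf '%s\n' "$input" | sed 's/\<i[^[:blank:]]*/(&)/gi')

printf '%s\n' "$with_end"
printf '%s\n' "$without_end"
```

The second command wraps INa.aa together with its trailing periods, while its is bracketed the same way in both cases.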
