I have an input as follows:
INa.aa................... October 2010 after its previous US-based owners failed to pay debts
My goal is to put brackets around every word starting with the letter i / I. So I issued this command:
sed 's/\<i[^[:blank:]]*\>/(&)/gi' input_data
Which returned this output:
(INa.aa)................... October 2010 after (its) previous US-based owners failed to pay debts
What I don't get is: why doesn't [^[:blank:]]* also include the dots after INa.aa?
Thank you for any suggestions.
You use the \> "end of word" escape. The manual (referring to \b) defines a word boundary as a position where "the character to the left is a "word" character and the character to the right is a "non-word" character, or vice-versa". In the case of \>, the "vice-versa" does not apply.
What is a "word" character?
A "word" character is any letter or digit or the underscore character.
And "non-word" characters are all the others. You expect the boundary between your periods and a blank to match \>, but it doesn't: both the period and the blank are non-word characters. The last position where \> matches is between the last a and the first . of the run of dots.
The period between the a's is also surrounded by word boundaries, but it is not a blank, so the greedy [^[:blank:]]* runs straight through it and it ends up as part of the match.
If you want to match everything up to the next blank, you can just drop the \> from your regex.
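To see the difference side by side, here is a small sketch that runs both variants on the input from the question. It assumes GNU sed, since \<, \> and the case-insensitive flag on s are GNU extensions; the variable names are only for illustration.

```shell
# Assumption: GNU sed (\<, \> and the i/I flag on s/// are GNU extensions).
# The run of dots is kept in its own variable so it is easy to see in the output.
dots='...................'
input="INa.aa${dots} October 2010 after its previous US-based owners failed to pay debts"

# Original command: \> forces the match to end where a word character
# is followed by a non-word character, i.e. right after the last "a".
with_boundary=$(printf '%s\n' "$input" | sed 's/\<i[^[:blank:]]*\>/(&)/gi')

# Without \>: the match simply runs on until the next blank,
# so the dots end up inside the brackets.
without_boundary=$(printf '%s\n' "$input" | sed 's/\<i[^[:blank:]]*/(&)/gi')

printf '%s\n' "$with_boundary" "$without_boundary"
```

The first line printed should match the output shown in the question, with the dots outside the brackets; the second should have them inside. The leading \< is kept in both variants so that an i in the middle of a word (as in "previous") does not start a match.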