
sed not working as expected, removing special character from middle of string

I have a file 'test' with the contents:

sa!ve
hel!lo
te!st
te!ve
help!
please!

I'd like to remove any exclamation mark which is in between two lowercase letters. So the results should be:

save
hello
test
teve
help!
please!

I've tried cat test | sed 's/\([:lower:]\)\!\([:lower:]\)/\1\2/g' and the same with alpha/alphanum, but strangely it only works for the word 'hel!lo' and nothing else. My results have been:

sa!ve
hello
te!st
te!ve
help!
please!

Not sure why it's not working for the other words.

The problem is that you're using the character class incorrectly. [:lower:] is the name of the character class, and it only has that meaning inside a bracket expression, so you'd actually write it as [[:lower:]].

Therefore the correct sed expression is:

cat test | sed 's/\([[:lower:]]\)\!\([[:lower:]]\)/\1\2/g'

Which works as expected.

Here's the output I get:

save
hello
test
teve
help!
please!

So you can think of [:lower:] as shorthand for a-z. It names the set but still has to sit inside a bracket expression, which is why it becomes [[:lower:]]. It's a tricky one that a lot of people get bitten by the first couple of times around.
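
As a minimal sketch of the corrected expression, the following feeds the sample lines in directly instead of reading the file 'test'. The function name strip_inner_bang is a hypothetical helper, not something from the question. The backslash before the exclamation mark is dropped here on the assumption that the command sits inside single quotes, where the shell leaves '!' alone.

```shell
# Remove every "!" that sits directly between two lowercase letters.
# strip_inner_bang is a hypothetical helper name used only for illustration.
strip_inner_bang() {
  # [[:lower:]] = bracket expression containing the lower-case character class
  sed 's/\([[:lower:]]\)!\([[:lower:]]\)/\1\2/g'
}

# Sample data from the question, supplied inline rather than via a file.
printf '%s\n' 'sa!ve' 'hel!lo' 'te!st' 'te!ve' 'help!' 'please!' | strip_inner_bang
```

A trailing '!' is left alone because nothing follows it, so the second group has no letter to match.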

Written as [:lower:], the expression is an ordinary bracket expression, so it matches any single one of the characters listed between the square brackets: ':', 'l', 'o', 'w', 'e' and 'r'. In your input, only hel!lo has one of those characters on both sides of the '!' (an l on each side), so it is the only line that gets changed.
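
The partial result from the question can be reproduced with a small sketch. It spells the set out as [lower:] rather than [:lower:], on the assumption that the two denote the same characters; some sed builds may refuse a lone [:lower:] with a character class syntax error, so the spelled-out form is the safer thing to run. The function name literal_set_attempt is hypothetical.

```shell
# [lower:] lists the same characters that a lone [:lower:] is read as:
# ':', 'l', 'o', 'w', 'e', 'r'. Only those can surround the "!" for a match.
# literal_set_attempt is a hypothetical name used only for illustration.
literal_set_attempt() {
  sed 's/\([lower:]\)!\([lower:]\)/\1\2/g'
}

# Of the sample lines, only hel!lo has a listed character on both sides of "!".
printf '%s\n' 'sa!ve' 'hel!lo' 'te!st' 'te!ve' 'help!' 'please!' | literal_set_attempt
```

In te!st and te!ve the 'e' before the '!' is in the set, but the 's' or 'v' after it is not, so those lines stay unchanged.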

Change it to the character range [a-z] to match any lowercase letter in that range.

cat test | sed 's/\([a-z]\)\!\([a-z]\)/\1\2/g'
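
One caveat with the range form: what a-z covers can depend on the locale's collation order. A hedged sketch that pins the locale to C, so the range is just the ASCII lowercase letters, might look like this. The function name strip_inner_bang_ascii is hypothetical, and setting LC_ALL=C is an assumption about what you want, not part of the original answer.

```shell
# Same substitution with an explicit range, forcing the C locale so that
# [a-z] covers exactly the ASCII letters a through z.
# strip_inner_bang_ascii is a hypothetical name used only for illustration.
strip_inner_bang_ascii() {
  LC_ALL=C sed 's/\([a-z]\)!\([a-z]\)/\1\2/g'
}

printf '%s\n' 'te!st' 'A!b' 'help!' | strip_inner_bang_ascii
```

Here 'A!b' is left untouched because the uppercase 'A' is outside the range.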
