I would like to do a string substitution using non-greedy matching that:
Removes all leading and trailing dashes and apostrophes from each word (when these symbols occur in the middle of a word, they must be preserved)
Collapses multiple spaces into 1 space
Example:
--ONE Tw'o-- -333- -'FO-UR'
must become
ONE Tw'o 333 FO-UR
I cannot get exactly this result. Can you please help me correct my perl and sed syntax below?
$ echo "--ONE Tw'o-- -333- -'FO-UR'" \
| perl -pe "s/[-']+(.+?)/\1/g" \
| perl -pe "s/(.+?)[-']+/\1/g" \
| perl -pe "s/\s+/ /g"
Result (perl): "ONE Two 333 FOUR"
$ echo "--ONE Tw'o-- -333- -'FO-UR'" \
| sed -r -e "s/[-']+(.+?)/\1/g" \
-e "s/(.+)[-']+/\1/g" \
-e "s/\s+/ /g"
Result (sed): "ONE Tw'o-- -333- -'FO-UR"
Here's the perl version:
echo "--ONE Tw'o-- -333- -'FO-UR'" | perl -ne "s|-'||g; s|'-||g; s|^'||; s|'$||; s|^-+||; s|-+$||; s|-+\s+| |g; s|\s+-+| |g; s|\s+| |g; s|\s+$||; print;"
ONE Tw'o 333 FO-UR
The sed version is basically identical:
echo "--ONE Tw'o-- -333- -'FO-UR'" | sed -r -e "s|-'||g; s|'-||g; s|^'||; s|'$||; s|^-+||; s|-+$||; s|-+\s+| |g; s|\s+-+| |g; s|\s+| |g; s|\s+$||;"
ONE Tw'o 333 FO-UR
Annotations for the regular expressions used:
s|-'||g; # Remove dash followed by quote everywhere
s|'-||g; # Remove quote followed by dash everywhere
s|^'||; # Remove leading quote
s|'$||; # Remove trailing quote
s|^-+||; # Remove leading dash characters
s|-+$||; # Remove trailing dash characters
s|-+\s+| |g; # Replace dash characters followed by whitespace with 1 space everywhere
s|\s+-+| |g; # Replace whitespace followed by dash characters with 1 space everywhere
s|\s+| |g; # Replace multiple spaces with 1 space
s|\s+$||; # Remove trailing spaces
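The substitutions above are tailored to the sample input. As a rough sketch of a more general variant, the same idea can be expressed per word, assuming a "word" is any whitespace-delimited token. The function name strip_edges is made up for illustration, and POSIX character classes are used instead of \s so that it should also work with non-GNU sed implementations that accept -E:

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the answer above):
# strip runs of dashes and apostrophes from the edges of every
# whitespace-delimited word, then squeeze and trim the whitespace.
strip_edges() {
    printf '%s\n' "$1" | sed -E \
        -e "s/(^|[[:space:]])[-']+/\1/g" \
        -e "s/[-']+([[:space:]]|\$)/\1/g" \
        -e "s/[[:space:]]+/ /g" \
        -e "s/^ //; s/ \$//"
}

strip_edges "--ONE Tw'o-- -333- -'FO-UR'"
```

The first expression handles leading runs (start of line or after whitespace), the second handles trailing runs (before whitespace or end of line); dashes and apostrophes inside a word match neither, so they are kept.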
The dashes are easy to handle using lookarounds in perl (note that the output still contains the apostrophes around FO-UR):
s="--ONE Tw'o-- -333- -'FO-UR'"
perl -pe 's/(?<!\w)-|-(?!\w)//g' <<< "$s"
ONE Tw'o 333 'FO-UR'
(?<!\w)- # Negative lookbehind: match - if not preceded by a word character
| # regex alternation
-(?!\w) # Negative lookahead: match - if not followed by a word character