
Perl & Sed string substitution in several expressions

I would like to do a string substitution using non-greedy matching:

  • Remove all leading and trailing dashes and apostrophes from each word (when these symbols occur in the middle of a word, they must be preserved)

  • Collapse multiple spaces into a single space

Example:

--ONE   Tw'o--   -333-   -'FO-UR'

must become

ONE Tw'o 333 FO-UR

I cannot get exactly this result. Can you please help me correct my perl and sed syntax below?

$ echo "--ONE   Tw'o--   -333-   -'FO-UR'" \
  | perl -pe "s/[-']+(.+?)/\1/g"           \
  | perl -pe "s/(.+?)[-']+/\1/g"           \
  | perl -pe "s/\s+/ /g"

Result (perl): "ONE Two 333 FOUR"

$ echo "--ONE   Tw'o--   -333-   -'FO-UR'" \
  | sed -r -e "s/[-']+(.+?)/\1/g"          \
    -e "s/(.+)[-']+/\1/g"                  \
    -e "s/\s+/ /g"

Result (sed): "ONE Tw'o-- -333- -'FO-UR"

Here's the perl version:

echo "--ONE   Tw'o--   -333-   -'FO-UR'" | perl -ne "s|-'||g; s|'-||g; s|^'||; s|'$||; s|^-+||; s|-+$||; s|-+\s+| |g; s|\s+-+| |g; s|\s+| |g; s|\s+$||; print;"

ONE Tw'o 333 FO-UR

The sed version is basically identical:

echo "--ONE   Tw'o--   -333-   -'FO-UR'" | sed -r -e "s|-'||g; s|'-||g; s|^'||; s|'$||; s|^-+||; s|-+$||; s|-+\s+| |g; s|\s+-+| |g; s|\s+| |g; s|\s+$||;"

ONE Tw'o 333 FO-UR

Annotations for the regular expressions used:

s|-'||g;     # Remove dash followed by quote everywhere
s|'-||g;     # Remove quote followed by dash everywhere
s|^'||;      # Remove leading quote
s|'$||;      # Remove trailing quote
s|^-+||;     # Remove leading dash characters
s|-+$||;     # Remove trailing dash characters
s|-+\s+| |g; # Replace dash characters followed by whitespace with 1 space everywhere
s|\s+-+| |g; # Replace whitespace followed by dash characters with 1 space everywhere
s|\s+| |g;   # Replace multiple spaces with 1 space
s|\s+$||;    # Remove trailing spaces
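
The rules above strip apostrophes only at the very start or end of the line, or next to a dash, which is enough for the sample input. As a more general sketch (not the answer's own method), the same clean-up can be phrased per word with POSIX character classes. The function name `clean` is made up for illustration, and `sed -E` is assumed to be available (GNU and BSD sed both accept it):

```shell
# Hypothetical helper: trim dashes/apostrophes at word edges, then squeeze spaces.
clean() {
  sed -E \
    -e "s/(^|[[:space:]])[-']+/\1/g" \
    -e "s/[-']+([[:space:]]|\$)/\1/g" \
    -e "s/[[:space:]]+/ /g" \
    -e "s/^ //; s/ \$//"
}
# 1st rule: drop a run of - or ' at the start of the line or right after whitespace
# 2nd rule: drop a run of - or ' right before whitespace or the end of the line
# 3rd rule: collapse each run of whitespace into one space
# 4th rule: trim a leftover leading or trailing space

printf '%s\n' "--ONE   Tw'o--   -333-   -'FO-UR'" | clean
```

Symbols inside a word (the apostrophe in Tw'o, the dash in FO-UR) are left alone because they are neither preceded nor followed by whitespace or a line boundary.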

This is easy using lookarounds in perl:

s="--ONE   Tw'o--   -333-   -'FO-UR'"
perl -pe 's/(?<!\w)-|-(?!\w)//g' <<< "$s"
ONE   Tw'o   333   'FO-UR'

(?<!\w)- # Lookbehind: match - only if it is not preceded by a word character
|        # Alternation
-(?!\w)  # Lookahead: match - only if it is not followed by a word character
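
The expression above targets dashes only, while the question also asks for apostrophes and runs of spaces to be handled. A minimal sketch of how the lookaround idea might be extended follows. It swaps `\w` for `\S`, so a run of symbols is removed only when it touches whitespace or a line boundary. The function name `clean_perl` is hypothetical, and `\x27` stands for the apostrophe so the program can stay inside single quotes:

```shell
# Hypothetical helper built on the lookaround approach.
clean_perl() {
  perl -pe '
    s/(?<!\S)[-\x27]+|[-\x27]+(?!\S)//g;  # strip - and apostrophe runs at word edges
    s/[ \t]+/ /g;                          # collapse spaces/tabs, keep the newline
    s/^ //; s/ $//;                        # trim a leftover edge space
  '
}

clean_perl <<< "--ONE   Tw'o--   -333-   -'FO-UR'"
```

Using `[ \t]+` instead of `\s+` in the second substitution is a deliberate choice: with `perl -p` the line still carries its trailing newline, and `\s+` would turn that newline into a space.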
