Replace a specific character at the beginning and end of any word in bash
I need to remove the hyphen '-' character only when it matches the pattern 'space-[A-Z]' or '[A-Z]-space'. (Assume all letters are uppercase, and that "space" can be either a space or a newline.)
sample.txt
I AM EMPTY-HANDED AND I- WA-
-ANT SOME COO- COOKIES
I want the output to be:
I AM EMPTY-HANDED AND I WA
ANT SOME COO COOKIES
I've looked around for answers using sed, awk and perl, but I could only find answers about removing all characters between two patterns or specific strings, not a specific character between [A-Z] and a space.
Thanks heaps!!
If perl is an option for you, would you try the following:
perl -pe 's/(^|(?<=\s))-(?=[A-Z])//g; s/(?<=[A-Z])-((?=\s)|$)//g' sample.txt
(?<=\s) is a zero-width lookbehind assertion which matches leading whitespace without including it in the matched substring.
(?=[A-Z]) is a zero-width lookahead assertion which matches a trailing character between A and Z without including it in the matched substring.
The second s/..//g is the flipped version of the first one.
Could you please try the following.
awk '{for(i=1;i<=NF;i++){if($i ~ /^-[a-zA-Z]+$|^[a-zA-Z]+-$/){sub(/-/,"",$i)}}} 1' Input_file
Adding a non-one-liner form of the solution:
awk '
{
  for(i=1;i<=NF;i++){
    if($i ~ /^-[a-zA-Z]+$|^[a-zA-Z]+-$/){
      sub(/-/,"",$i)
    }
  }
}
1
' Input_file
Output will be as follows.
I AM EMPTY-HANDED AND I WA
ANT SOME COO COOKIES
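One side effect of the field loop is worth knowing: when awk assigns to a field, it rebuilds the line with single spaces between fields (the default output separator). The sketch below shows this on a line with extra spacing; strip_with_awk is a hypothetical wrapper name around the awk program above.

```shell
#!/bin/sh
# strip_with_awk is a hypothetical wrapper around the awk program from the answer.
strip_with_awk() {
    awk '{for(i=1;i<=NF;i++){if($i ~ /^-[a-zA-Z]+$|^[a-zA-Z]+-$/){sub(/-/,"",$i)}}} 1'
}

# Three spaces between the fields on input, one space on output,
# because changing a field makes awk rebuild the record.
printf '%s\n' 'I-   WA-' | strip_with_awk
# prints: I WA
```

If the original spacing matters, the sed or perl approaches leave whitespace untouched.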
If you can provide Extended Regular Expressions to sed (generally with the -E or -r option), then you can shorten your sed expression to:
sed -E 's/(^|\s)-(\w)/\1\2/g;s/(\w)-(\s|$)/\1\2/g' file
Where the basic form is sed -E 's/find1/replace1/g;s/find2/replace2/g' file, which can also be written as separate expressions, sed -E -e 's/find1/replace1/g' -e 's/find2/replace2/g' file (your choice).
The details of s/find1/replace1/g are:
find1 is
(^|\s) locate and capture the beginning of the line or a whitespace character,
'-' the hyphen, then
(\w) capture the next word-character; and
replace1 is simply \1\2, which reinserts both captures using the first two backreferences.
The next substitution expression is similar, except now you are looking for the hyphen followed by whitespace or by the end of the line. So you have:
find2 being
(\w) a capture of a word-character,
'-' the hyphen, then
(\s|$) a capture of either the following whitespace or the end of the line; and
replace2 is the same as before: just reinsert the captured characters using backreferences.
In each case the g indicates a global replace of all occurrences.
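To see what each capture group holds, a small sketch can wrap the two backreferences in brackets instead of reinserting them bare. This assumes GNU sed, since \s and \w are GNU extensions; the function name show_captures is made up for illustration.

```shell
#!/bin/sh
# Assumes GNU sed (\s and \w are GNU extensions).
# show_captures is a hypothetical helper: it marks \1 and \2 with brackets
# so the captured text is visible in the output.
show_captures() {
    sed -E 's/(^|\s)-(\w)/[\1][\2]/g'
}

printf '%s\n' '-ANT COO -B' | show_captures
# prints: [][A]NT COO[ ][B]
```

The first match captures nothing in \1 (start of line), the second captures the space, which is why putting \1 back keeps the word spacing intact.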
(note: the \w word-character class also includes the '_' (underscore), so while it is unlikely you would have a hyphen and underscore together, if you do, you need to use the [A-Za-z] list instead of \w)
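The difference between \w and an explicit letter list can be checked on a made-up input where an underscore sits right before the hyphen. This is only a sketch assuming GNU sed; both function names are hypothetical.

```shell
#!/bin/sh
# Assumes GNU sed (\s and \w are GNU extensions).
# Both helper names are hypothetical; each applies only the trailing-hyphen rule.
strip_trailing_w() {
    sed -E 's/(\w)-(\s|$)/\1\2/g'
}
strip_trailing_az() {
    sed -E 's/([A-Z])-(\s|$)/\1\2/g'
}

printf '%s\n' 'FOO_- BAR' | strip_trailing_w    # prints: FOO_ BAR   (underscore counts as \w)
printf '%s\n' 'FOO_- BAR' | strip_trailing_az   # prints: FOO_- BAR  (underscore is not in [A-Z])
```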
Example Use/Output
In your case, the output is:
$ sed -E 's/(^|\s)-(\w)/\1\2/g;s/(\w)-(\s|$)/\1\2/g' file
I AM EMPTY-HANDED AND I WA
ANT SOME COO COOKIES
remove the hyphen '-' character only when it matches the pattern 'space-[A-Z]' or '[A-Z]-space'.
Assuming all letters are uppercase, and space could be a space, or newline
It's:
sed 's/\( \|^\)-\([A-Z]\)/\1\2/g; s/\([A-Z]\)-\( \|$\)/\1\2/g'
s - substitute
/
\( \|^\) - a space or the beginning of the line
- - the hyphen...
\([A-Z]\) - a single upper case character
/
\1\2 - \1 is replaced by whatever the first \(...\) matched, so by a space or nothing. \2 is replaced by the single upper case character found. Effectively the - is removed.
/
g - apply the regex globally
; - separates the two s commands
$ - means end of the line.
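The \| alternation inside a basic regular expression is a GNU extension, so the sed command above may not work with other sed implementations. As a hedged alternative, the same two substitutions can be written with extended regular expressions, which both GNU and BSD sed accept through -E. The function name strip_hyphens_ere is made up for illustration.

```shell
#!/bin/sh
# strip_hyphens_ere is a hypothetical helper name.
# Same logic as the basic-regex version, using -E so that ( ), | need no backslashes.
strip_hyphens_ere() {
    sed -E 's/( |^)-([A-Z])/\1\2/g; s/([A-Z])-( |$)/\1\2/g'
}

printf '%s\n' 'I AM EMPTY-HANDED AND I- WA-' '-ANT SOME COO- COOKIES' | strip_hyphens_ere
# prints:
# I AM EMPTY-HANDED AND I WA
# ANT SOME COO COOKIES
```

Since sed works line by line, the ^ and $ anchors take care of the "space could be a newline" case.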
awk '{gsub(/ -/," ");gsub(/^-|-$/,"");gsub(/- /," ")}1' file
I AM EMPTY-HANDED AND I WA
ANT SOME COO COOKIES