
Replace character in regex match only

I am trying to search a text file for a regex and then, within each match only, replace one character with another. My problem is that I can't find a simple way to do it.

Example source file:

...
 <br>
<a id="some shopitem" ref="#some shop item name 01 a" style="text-decoration:none;"><h3 style="background-color: #ccc;">blah blab hasdk sldk sasdas dasda sd</h3></a>
<table>
 <td width="500">
....

Here I need to match the regexp ref=\"#[[:alnum:] ]*\" (that is, ref="#whatever name with spaces") and replace the spaces inside the match with "-", without changing any spaces outside the regex match.

So the result should look like this:

....
 <br>
<a id="some shopitem" href="#some-shop-item-name-01-a" style="text-decoration:none;"><h3 style="background-color: #ccc;">blah blab hasdk sldk sasdas dasda sd</h3></a>
<table>
 <td width="500">
....

Would it even be possible to do this with a one-line command in bash, without some sort of script? Is there a way to replace spaces within a group, something like sed -r 's/ref=\"#([[:alnum:] ]*\)/(\1s/ /-/g)/g'?

A Perl solution:

perl -pe 's/(ref="#)([\w\s]+)(")/ ($x,$y,$z)=($1,$2,$3); $y =~ s{\s}{-}g; $x.$y.$z /eg'

It's slightly more permissive about what can appear in the ref name: \w and \s also allow underscores, tabs, and some other whitespace characters.

Would it even be possible to do this with a one-line command in bash, without some sort of script?

Your question somehow triggered a burning ambition in me to do this...!

varfile=SOURCEFILE && varsubstfile=RESULTFILE && IFS=' ' read -a repl <<< $(sed -r 's/(.*)(ref="#.*?")( .*)/\2/;tx;d;:x' $varfile | sed -e 's/\ /\-/g' | sed ':a;N;$!ba;s/\s/ /g') && for i in "${!repl[@]}"; do needle["$i"]=$(sed 's/\-/\ /g' <<< "${repl["$i"]}"); done && cp $varfile $varsubstfile && for i in "${!needle[@]}"; do sed -i "s/${needle[i]}/${repl[i]}/g" $varsubstfile; done && unset needle && unset repl && less $varsubstfile && unset varfile && unset varsubstfile

SOURCEFILE is your source file and RESULTFILE is the name of the file the output is written to, so change both according to your needs.

Well... it is kind of a script, but it's a (damn huge) one-liner :)

I assumed that there is more than one occurrence of ref="#.*" in the whole file; otherwise it would have been much shorter (although I don't remember the shorter version anymore).

... and I really hope this works on your *nix system :D


Just in case you want to know what this thing does, here's an explanation:

varfile=SOURCEFILE &&        # set variable for the source file
varsubstfile=RESULTFILE &&   # set variable for the result file
# we're going to read multiple values into an array "repl", delimited by a space
IFS=' ' read -a repl <<< $(
    # grab only the second capture group (ref="#.*?")
    sed -r 's/(.*)(ref="#.*?")( .*)/\2/;tx;d;:x' $varfile |
    sed -e 's/\ /\-/g' |        # replace every space in (ref="#.*?") with a dash
    sed ':a;N;$!ba;s/\s/ /g'    # replace newlines with a space
    # when there is more than one occurrence, sed delimits them with a newline,
    # but I set a space as the delimiter for the read operation,
    # thus the last replacement
) &&
# we now have every needed replacement string in an array called "repl"
for i in "${!repl[@]}"; do      # iterate over every value in the array we just read
    # replace dashes with spaces and store the result in a new variable
    needle["$i"]=$(sed 's/\-/\ /g' <<< "${repl["$i"]}");
done &&
# and now every original string, the needle we are going to search for,
# is stored in another array
cp $varfile $varsubstfile &&    # copy source file to result file
for i in "${!needle[@]}"; do    # for every string we are going to replace
    sed -i "s/${needle[i]}/${repl[i]}/g" $varsubstfile;    # ... we replace it!
done
# technically we're done here,
# but I like to clean up afterwards and show the result with less
unset needle && unset repl && less $varsubstfile && unset varfile && unset varsubstfile
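For comparison, the same effect can be sketched with a single sed call that loops on each line until no space is left inside a ref value. This is only a sketch under stated assumptions, not the method used above: the function name dash_ref_spaces is hypothetical, and the pattern differs slightly from the question's regex in that it also accepts dashes in the part of the name that has already been converted.

```shell
# Hypothetical helper (not from the original answers): replace spaces with
# dashes only inside ref="#..." values, reading from stdin or from files.
# Each pass of the loop converts the first remaining space of one ref value;
# "ta" jumps back to label "a" for as long as a substitution succeeded.
dash_ref_spaces() {
  sed -E -e ':a' \
         -e 's/(ref="#[[:alnum:]-]*) ([[:alnum:] ]*")/\1-\2/' \
         -e 'ta' "$@"
}

# Example call; spaces outside the ref value are left alone
printf '%s\n' '<a id="some shopitem" ref="#some shop item name 01 a" style="x y">t t</a>' | dash_ref_spaces
```

Because the loop restarts after every successful substitution, several ref values on the same line are handled one after another, and text such as id="some shopitem" is never touched since it does not start with ref="#.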
