
Complex sed multiline match and replace

<Placemark id="051314">
<name>HI Hostel</name>
<description><![CDATA[<div style="color: #404040;font-size: 12px"><a "#book"style="color:#295181;font-size: 12px" target="_top" href="http://www.hihostels.com/dba/hostel051314.de.htm?himap=Y#book" >Girona - Equity Point Girona</a><img style="margin: 5px 0px 5px 0px; border-color:#909090; padding:2px; display:block; clear:both;" src="http://www.hihostels.com/pics/ES/051314_pic_main.jpeg" width="96" height="72" border="1">Plaça Catalunya, 23<br>Girona<br>17002<br><b>Spanien</b><br><div style="margin-top:3px;"><img style="vertical-align:middle;margin-right:5px;" src="http://www.hihostels.com/imgfront/pegsmall.png" /><a style="color:#295181;font-size: 12px;" href="http://www.hihostels.com/openSVwindow(41.981658,2.823057)">Street View</a></div></div> ]]></description>

My source files look like the one above (they basically come from http://www.hihostels.com/mapcoord/ES.en.kml). I want to replace the (useless) name tag "HI Hostel", which is the same for every placemark, with the hostel's real name. The real name appears in the description tag one line below; in the case above it would be "Girona - Equity Point Girona".

Any clever ideas on how to do this? Thanks for reading.

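Something like this, using awk? It drops the original name line, then replaces every tag on the description line with a comma so that the link text becomes field 4. The output for the sample is shown below the command.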

awk -F, '/^<name>/ {next} /^<description/ {s=$0;gsub(/<[^>]*>/, ",");$0="<name>" $4 "</name>\n" s} 1' file
<Placemark id="051314">
<name>Girona - Equity Point Girona</name>
<description><![CDATA[<div style="color: #404040;font-size: 12px"><a "#book"style="color:#295181;font-size: 12px" target="_top" href="http://www.hihostels.com/dba/hostel051314.de.htm?himap=Y#book" >Girona - Equity Point Girona</a><img style="margin: 5px 0px 5px 0px; border-color:#909090; padding:2px; display:block; clear:both;" src="http://www.hihostels.com/pics/ES/051314_pic_main.jpeg" width="96" height="72" border="1">Plaça Catalunya, 23<br>Girona<br>17002<br><b>Spanien</b><br><div style="margin-top:3px;"><img style="vertical-align:middle;margin-right:5px;" src="http://www.hihostels.com/imgfront/pegsmall.png" /><a style="color:#295181;font-size: 12px;" href="http://www.hihostels.com/openSVwindow(41.981658,2.823057)">Street View</a></div></div> ]]></description>

This may also work. It splits fields on "<" and ">", so the link text lands in field 8:

awk -F"<|>" '/^<name>/ {next} /^<description/ {$0="<name>" $8 "</name>\n" $0} 1' file
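Both one-liners above rely on the hostel name sitting at a fixed field position, and both drop every line starting with <name> even when no description follows. A slightly more defensive variant is sketched below. It assumes, as in the sample, that the <name> and <description> elements each start their own line and that the first <a ...>...</a> link in the description holds the real name. The function name rename_placemarks is made up for illustration and does not come from the answers above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answers): rewrite each <name>
# line using the text of the first link found on the following
# <description> line. Reads the given files, or stdin when none are given.
rename_placemarks() {
  awk '
    /^<name>/ {
      if (have) print held          # flush an earlier name that was still held
      held = $0; have = 1; next     # hold this name back until the next line
    }
    /^<description/ && have {
      # locate the first <a ...>text</a> on the description line
      if (match($0, /<a [^>]*>[^<]*<\/a>/)) {
        anchor = substr($0, RSTART, RLENGTH)
        sub(/^<a [^>]*>/, "", anchor)   # strip the opening tag
        sub(/<\/a>$/, "", anchor)       # strip the closing tag
        held = "<name>" anchor "</name>"
      }
      print held; have = 0
    }
    have { print held; have = 0 }   # name was not followed by a description
    { print }
    END { if (have) print held }
  ' "$@"
}
```

It could be called as rename_placemarks ES.en.kml > ES.renamed.kml (the output file name is only an example). When no link is found, or the line after the name is not a description, the original name line is printed unchanged.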
