Unterminated address regex - misapplying escape characters in bash sed script?
I'm just learning sed, and I feel like I'm getting close to doing what I want but am missing something obvious.
The objective is to take a bunch of <tr>...</tr> rows from an HTML table and append them to the single table in another page. So I want to take the initial file, strip everything above the first <tr> and everything from </table> on down, then insert the result just above the </table> in the other file. So like below, except that <tr> and </tr> are on their own lines, if it matters.
Input File:
<html><body>
<p>Whatever...</p>
<table>
<tr><td>4</td></tr>
<tr><td>5</td></tr>
<tr><td>6</td></tr>
</table>
</body></html>

Target File:
<html><body>
<p>Other whatever...</p>
<table>
<thead>
<tr><th>#</th></tr>
</thead>
<tbody>
<tr><td>1</td></tr>
<tr><td>2</td></tr>
<tr><td>3</td></tr>
</tbody>
</table>
</body></html>
Becomes:
Input File: doesn't matter.
Target File:
<html><body>
<p>Other whatever...</p>
<table>
<thead>
<tr><th>#</th></tr>
</thead>
<tbody>
<tr><td>1</td></tr>
<tr><td>2</td></tr>
<tr><td>3</td></tr>
<tr><td>4</td></tr>
<tr><td>5</td></tr>
<tr><td>6</td></tr>
</tbody>
</table>
</body></html>
Here's the code I'm trying to use:
#!/bin/bash
#$1 is the first parameter and $2 is the second parameter being passed when calling the script. The variable filename will be used to refer to this.
input=$1
inserttarget=$2
sed -e '/\<\/thead\>,$input' $input
sed -e '/\<\/table\>,$input' $input
sed -n -i -e '\<\/tbody\>/r' $inserttarget -e 1x -e '2,${x;p}' -e '${x;p}' $input
I'm pretty sure it's simple and that I'm just messing the expression up. Can anyone set me straight?
Here I cut the problem in two:
1. Cut the rows from the input file
2. Paste those rows into the output file
sed -n '\:<table>:,\:</table>:p' ${input} | sed -n '\:<tr>:p'
This line extracts all lines containing <tr> from the block ranging from the first line matching <table> to the first line matching </table>. The input file itself is not modified; the extracted lines are printed to standard output.
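To make the extraction step concrete, here is a minimal sketch that runs the same pipeline on a throwaway copy of the question's input file. The temporary file and the `rows` variable are my own illustrative names, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical sample input, copied from the question; the temp file is illustrative.
input=$(mktemp)
cat > "$input" <<'EOF'
<html><body>
<p>Whatever...</p>
<table>
<tr><td>4</td></tr>
<tr><td>5</td></tr>
<tr><td>6</td></tr>
</table>
</body></html>
EOF

# First sed: print only the block from <table> through </table>.
# Second sed: from that block, print only the lines containing <tr>.
rows=$(sed -n '\:<table>:,\:</table>:p' "$input" | sed -n '\:<tr>:p')
printf '%s\n' "$rows"
rm -f "$input"
```

Note that the second filter keeps only lines that contain <tr>. If <tr> and </tr> sit on their own lines, as the question mentions, a range such as `\:<tr>:,\:</tr>:p` might be a better fit for that layout, though that is an assumption on my part and not something the answer tested.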
sed -i '\:</tbody>: {
r /dev/stdin
a </tbody>
d}' ${inserttarget}
This multi-line command adds the lines read from stdin after the line matching </tbody>. It then moves the </tbody> by appending a new one after the inserted lines and deleting the old one.
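The read/append/delete sequence can be tried in isolation with a small sketch like the one below. It assumes GNU sed (for `-i` and the one-line `a text` form), and the sample target file and the two piped rows are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample target file; assumes GNU sed.
target=$(mktemp)
cat > "$target" <<'EOF'
<table>
<tbody>
<tr><td>1</td></tr>
</tbody>
</table>
EOF

# On the line matching </tbody>:
#   r /dev/stdin  queues the rows arriving on standard input
#   a </tbody>    queues a fresh closing tag after those rows
#   d             deletes the original </tbody> line
printf '%s\n' '<tr><td>2</td></tr>' '<tr><td>3</td></tr>' |
sed -i '\:</tbody>: {
r /dev/stdin
a </tbody>
d}' "$target"

result=$(cat "$target")
printf '%s\n' "$result"
rm -f "$target"
```

The queued text is written out in the order it was queued, which is why the new rows end up before the re-added </tbody>.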
Another trick used here is to replace the default regex delimiter / with :, so that we can use / in the matching pattern without escaping it. In an address, the custom delimiter is introduced with a leading backslash, as in \:</tbody>:.
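As a quick comparison, the sketch below matches the same line once with the default delimiter and once with a colon. The sample text and variable names are only illustrative.

```shell
#!/bin/bash
# Illustrative three-line sample; both commands should select the same line.
sample=$(printf '%s\n' '<table>' '<tr><td>1</td></tr>' '</table>')

# Default delimiter: the slash inside the pattern has to be escaped.
with_slash=$(printf '%s\n' "$sample" | sed -n '/<\/table>/p')

# Custom delimiter: the leading backslash makes ":" the delimiter,
# so the slash in </table> can be written literally.
with_colon=$(printf '%s\n' "$sample" | sed -n '\:</table>:p')

printf '%s\n' "$with_slash" "$with_colon"
```

Any character that does not appear in the pattern can serve as the delimiter; the colon is simply convenient for HTML tags.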
Final solution:
sed -i '\:</tbody>: {
r /dev/stdin
a </tbody>
d}' ${inserttarget} < <(sed -n '\:<table>:,\:</table>:p' ${input} | sed -n '\:<tr>:p')
Et voilà!
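As a side note on the error in the title: sed reports an unterminated address regex when an address opened with / is never closed by an unescaped /. In the question's commands every later slash is escaped, so the pattern never ends. The sketch below reproduces that with a simplified expression and shows one possible corrected form; the corrected command is my own guess at a minimal fix, not something taken from the question or the answer. The < and > characters need no escaping in sed (in GNU sed, \< and \> are word-boundary anchors rather than literal brackets), and $input inside single quotes is not expanded by the shell.

```shell
#!/bin/bash
# Illustrative sample text; variable names are hypothetical.
sample=$(printf '%s\n' '<thead>' '</thead>' '<tbody>')

# Address opened with "/" but never closed: every later "/" is escaped,
# so sed rejects the script before reading any input.
if printf '%s\n' "$sample" | sed -n '/\<\/thead\>,$p' 2>/dev/null; then
    broken_status=ok
else
    broken_status=failed
fi

# Closing the address with an unescaped "/" and leaving < and > unescaped.
fixed=$(printf '%s\n' "$sample" | sed -n '/<\/thead>/p')

printf 'broken: %s\nfixed: %s\n' "$broken_status" "$fixed"
```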