
sgrep: how to reset the region counter inside an enclosing tag

Using sgrep (structured grep), how can I reset the region counter so that sgrep starts counting from 1 again inside each <tr> element?

Consider the following sample HTML table fragment. Its structure is irregular: several tags can share one line, and the number of td tags inside each tr tag varies:

<tr><td>2015</td><td>Jane</td>
    <td>Smith</td></tr>
<tr><td>2011</td>
    <td>Sarah</td>
</tr>

My sample sgrep command line is:

sgrep -o'--%n:%r--\n' '"<td>" .. "</td>"' in.txt

I get this output:

--1:<td>2015</td>--
--2:<td>Jane</td>--
--3:<td>Smith</td>--
--4:<td>2011</td>--
--5:<td>Sarah</td>--

Instead I would like to get this output:

--1:<td>2015</td>--
--2:<td>Jane</td>--
--3:<td>Smith</td>--
--1:<td>2011</td>--
--2:<td>Sarah</td>--

with sgrep's region counter %n resetting to 1 each time it enters a tr tag.

There is no way to reset the region counter %n in sgrep's output format patterns, so you need another tool, such as awk as suggested by ritesht93, to solve this task. In general, the output format pattern given with the -o switch can only decorate (or replace) the result regions in a fairly simple way. The value of a search expression is a set of regions, with no information about their local context such as surrounding elements. The output format pattern is simply applied to each region in the result, in their default order, and the result of each application is appended to the output.

Regards, Pekka Kilpeläinen, co-designer of the original sgrep

You can also do it with a simple awk one-liner, provided each td tag sits on its own line and every row has exactly three cells:

$ cat file1
<tr>
    <td>2015</td>
    <td>Jane</td>
    <td>Smith</td>
</tr>
<tr>
    <td>2011</td>
    <td>Sarah</td>
    <td>Holmes</td>
</tr>
$ awk -v cnter=0 '/td/ {cnter=cnter%3+1; print cnter":"$1}' file1
1:<td>2015</td>
2:<td>Jane</td>
3:<td>Smith</td>
1:<td>2011</td>
2:<td>Sarah</td>
3:<td>Holmes</td>
$
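
The one-liner above depends on a fixed count of three cells per row. For the irregular input in the question (several tags on one line, a varying number of cells), a minimal sketch could scan each line for tags and restart the counter whenever it sees <tr>. The function name number_cells is hypothetical, and the sketch assumes that no <td>...</td> pair spans a line break, that cells contain no nested tags, and that the tags carry no attributes:

```shell
# Hypothetical helper: number <td> cells, restarting at 1 after each <tr>.
# Assumes each <td>...</td> fits on one line and contains no nested tags.
number_cells() {
  awk '{
    line = $0
    # Repeatedly find the next <tr> or complete <td>...</td> on this line.
    while (match(line, /<tr>|<td>[^<]*<\/td>/)) {
      tok = substr(line, RSTART, RLENGTH)
      if (tok == "<tr>")
        n = 0                              # new row: reset the counter
      else
        printf "--%d:%s--\n", ++n, tok     # same layout as -o"--%n:%r--\n"
      line = substr(line, RSTART + RLENGTH)  # continue after this match
    }
  }' "$@"
}

# Sample input from the question, fed on standard input.
number_cells <<'EOF'
<tr><td>2015</td><td>Jane</td>
    <td>Smith</td></tr>
<tr><td>2011</td>
    <td>Sarah</td>
</tr>
EOF
```

On the sample input this should print the five lines from the desired output, with the numbering restarting at --1: for 2011. Only match(), RSTART and RLENGTH are used, so any POSIX awk should run it.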
