How to add or remove a number or range of numbers in a file and reorganize the ranges

For example, given this file:

$ cat test.in
cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]

The requirement is to remove cn01 and cn05.

Desired output:

$ cat test.in
cn[02-04,06-10]
cn[02,07-09]
cn[02]

Here's how to expand your lists and ranges of values into individual values:

$ cat tst.awk
function expand(exprStr,valsArr,        i,terms,term,range,val,numVals) {
    gsub(/cn|[][]/,"",exprStr)
    delete valsArr
    # exprStr = 01,02,07-09
    split(exprStr,terms,/,/)
    for (i=1; i in terms; i++) {
        # terms[1]=01, [2]=02, [3]=07-09
        term = terms[i]
        split(term,range,/-/)
        # a lone value such as 02 is treated as the range 02-02
        range[2] = (2 in range ? range[2] : range[1])
        # e.g. for the term 07-09: range[1]=07, [2]=09
        for (val=range[1]; val<=range[2]; val++) {
            valsArr[++numVals] = sprintf("%02d",val)
        }
    }
}
{
    print "--------", $0
    expand($0,arr)
    for (i=1; i<=length(arr); i++) {
        print i, "cn"arr[i]
    }
}


$ awk -f tst.awk test.in
-------- cn[01-10]
1 cn01
2 cn02
3 cn03
4 cn04
5 cn05
6 cn06
7 cn07
8 cn08
9 cn09
10 cn10
-------- cn01
1 cn01
-------- cn[01,02,07-09]
1 cn01
2 cn02
3 cn07
4 cn08
5 cn09
-------- cn[01-02]
1 cn01
2 cn02

Now delete the values you don't want from the array, then do essentially the reverse of the expansion: merge consecutive values back into ranges and print them in your input format.

Here is an example in Python 3:

import re
from itertools import groupby

inp = """cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]"""

rem = {1, 5}

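def parse_lst(lst_str):
    # Expand a list such as "01,02,07-09" into the individual integers.
    for group in lst_str.split(','):
        if '-' in group:
            first, last = group.split('-')
            yield from range(int(first), int(last)+1)
        else:
            yield int(group)

def format_range(range_):
    # Collapse sorted numbers back into ranges: within a run of consecutive
    # values, index minus value is constant, so groupby splits at each gap.
    ranges = []
    for k, g in groupby(enumerate(range_), lambda x: x[0]-x[1]):
        group = [n for i, n in g]
        ranges.append((group[0], group[-1]))

    # Nothing left on this line (e.g. "cn01" after removing 1): print nothing.
    if not ranges:
        return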

    print("cn[" + ','.join(
        '{:02d}'.format(first) if first == last else
        '{:02d}-{:02d}'.format(first, last) for
        first, last in ranges
    ) + ']')

for line in inp.splitlines():
    lst_match = re.search(r'\[(.*)\]', line)
    if lst_match:
        range_ = parse_lst(lst_match.group(1))
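    else:
        # No brackets: a single entry such as "cn01"; skip the "cn" prefix.
        range_ = (int(line[2:]),)

    # Drop the unwanted numbers; sort because sets are unordered.
    filtered = sorted(set(range_) - rem)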
    format_range(filtered)

This prints:

cn[02-04,06-10]
cn[02,07-09]
cn[02]
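
The question also asks about adding numbers, which the examples above do not show. The following is only a minimal sketch of one way to cover both cases with the same expand-then-collapse idea. The helper names `expand`, `collapse` and `update_line` are hypothetical (they are not from either answer), and the sketch assumes every line starts with the literal prefix `cn` and uses two-digit, zero-padded numbers as in test.in.

```python
import re
from itertools import groupby


def expand(line):
    """Return the set of numbers named by a line such as 'cn[01,02,07-09]' or 'cn01'."""
    line = line.strip()
    match = re.fullmatch(r'cn\[(.*)\]', line)
    # Assumption: a line without brackets is a single entry like 'cn01'.
    body = match.group(1) if match else line[2:]
    numbers = set()
    for term in body.split(','):
        # '07-09' -> ('07', '-', '09'); a lone '02' -> ('02', '', '')
        first, _, last = term.partition('-')
        numbers.update(range(int(first), int(last or first) + 1))
    return numbers


def collapse(numbers):
    """Format numbers as 'cn[...]', merging consecutive runs; None if empty."""
    terms = []
    # Within a run of consecutive values, (value - index) stays constant.
    for _, run in groupby(enumerate(sorted(numbers)), lambda p: p[1] - p[0]):
        run = [n for _, n in run]
        first, last = run[0], run[-1]
        terms.append(f'{first:02d}' if first == last
                     else f'{first:02d}-{last:02d}')
    return 'cn[' + ','.join(terms) + ']' if terms else None


def update_line(line, add=(), remove=()):
    """Apply additions first, then removals, and re-collapse the line."""
    return collapse((expand(line) | set(add)) - set(remove))
```

For instance, `update_line('cn[01,02,07-09]', add={3, 6})` gives `cn[01-03,06-09]`, and `update_line('cn01', remove={1, 5})` gives `None`, which a caller can skip so that emptied lines disappear as in the desired output. Removals are applied after additions here; that ordering is a design choice of this sketch, not something the question specifies.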
