How to add or remove a number or range of numbers in a file and reorganize the ranges
For example, given this file:
$ cat test.in
cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]
The requirement is to remove cn01 and cn05.
Desired output:
$ cat test.in
cn[02-04,06-10]
cn[02,07-09]
cn[02]
Here's how to expand your lists and ranges of values into individual values:
$ cat tst.awk
function expand(exprStr,valsArr,    i,terms,term,range,val,numVals) {
    gsub(/cn|[][]/,"",exprStr)
    delete valsArr
    # exprStr = 01,02,07-09
    split(exprStr,terms,/,/)
    for (i=1; i in terms; i++) {
        # terms[1]=01, [2]=02, [3]=07-09
        term = terms[i]
        split(term,range,/-/)
        range[2] = (2 in range ? range[2] : range[1])
        for (val=range[1]; val<=range[2]; val++) {
            # range[1]=07, [2]=09
            valsArr[++numVals] = sprintf("%02d",val)
        }
    }
}
{
    print "--------", $0
    expand($0,arr)
    for (i=1; i<=length(arr); i++) {
        print i, "cn"arr[i]
    }
}
$ awk -f tst.awk test.in
-------- cn[01-10]
1 cn01
2 cn02
3 cn03
4 cn04
5 cn05
6 cn06
7 cn07
8 cn08
9 cn09
10 cn10
-------- cn01
1 cn01
-------- cn[01,02,07-09]
1 cn01
2 cn02
3 cn07
4 cn08
5 cn09
-------- cn[01-02]
1 cn01
2 cn02
Now just delete the values you don't want from the array and essentially do the reverse to recombine into your input format.
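The "delete, then do the reverse" step could look roughly like the sketch below. It is written in Python rather than awk, and the function name `collapse` plus the `prefix` and `width` parameters are made up for illustration. It assumes the values have already been expanded to plain integers, and that a single leftover value should still be written as `cn[02]`, as in the desired output.

```python
def collapse(values, prefix="cn", width=2):
    """Collapse integers into e.g. 'cn[02-04,06-10]'; return None if nothing is left."""
    nums = sorted(set(values))
    if not nums:
        return None
    runs = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n != prev + 1:          # gap found: close the current run
            runs.append((start, prev))
            start = n
        prev = n
    runs.append((start, prev))     # close the final run
    parts = [
        f"{a:0{width}d}" if a == b else f"{a:0{width}d}-{b:0{width}d}"
        for a, b in runs
    ]
    return f"{prefix}[{','.join(parts)}]"


expanded = range(1, 11)            # the values expand() produces for cn[01-10]
unwanted = {1, 5}
kept = [n for n in expanded if n not in unwanted]
print(collapse(kept))              # cn[02-04,06-10]
```

Returning None for an empty result lets the caller drop the line entirely, which is what the desired output does for the `cn01` line.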
Example in Python 3
import re
from itertools import groupby

inp = """cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]"""
rem = {1, 5}

def parse_lst(lst_str):
    for group in lst_str.split(','):
        if '-' in group:
            first, last = group.split('-')
            yield from range(int(first), int(last)+1)
        else:
            yield int(group)

def format_range(range_):
    ranges = []
    for k, g in groupby(enumerate(range_), lambda x: x[0]-x[1]):
        group = [n for i, n in g]
        ranges.append((group[0], group[-1]))
    if not ranges:
        return
    print("cn[" + ','.join(
        '{:02d}'.format(first) if first == last else
        '{:02d}-{:02d}'.format(first, last) for
        first, last in ranges
    ) + ']')

for line in inp.splitlines():
    lst_match = re.search(r'\[(.*)\]', line)
    if lst_match:
        range_ = parse_lst(lst_match.group(1))
    else:
        range_ = (int(line[2:]),)
    filtered = sorted(set(range_) - rem)
    format_range(filtered)
This prints:
cn[02-04,06-10]
cn[02,07-09]
cn[02]
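The question also asks about adding numbers, which the example above does not cover. One possible extension is sketched here; `expand`, `update_line` and `LINE_RE` are hypothetical names, not part of the original answer. It assumes every line is either `prefix[list]` or `prefix` followed by digits, that two-digit zero padding is wanted, and that a removal wins when the same number is both added and removed.

```python
import re
from itertools import groupby

# Matches 'cn[01,03-05]' or a bare 'cn01' (assumed input shapes).
LINE_RE = re.compile(r"^(?P<prefix>[A-Za-z]+)(?:\[(?P<items>[^\]]+)\]|(?P<single>\d+))$")


def expand(line):
    """Return (prefix, set of ints) for one input line."""
    m = LINE_RE.match(line.strip())
    if not m:
        raise ValueError(f"unrecognized line: {line!r}")
    if m.group("single") is not None:
        return m.group("prefix"), {int(m.group("single"))}
    nums = set()
    for term in m.group("items").split(","):
        first, _, last = term.partition("-")
        nums.update(range(int(first), int(last or first) + 1))
    return m.group("prefix"), nums


def update_line(line, add=(), remove=()):
    """Apply additions, then removals; return the new line or None if it is empty."""
    prefix, nums = expand(line)
    nums = sorted((nums | set(add)) - set(remove))
    if not nums:
        return None
    parts = []
    # Consecutive values share the same (value - index) difference.
    for _, grp in groupby(enumerate(nums), key=lambda p: p[1] - p[0]):
        run = [n for _, n in grp]
        lo, hi = run[0], run[-1]
        parts.append(f"{lo:02d}" if lo == hi else f"{lo:02d}-{hi:02d}")
    return f"{prefix}[{','.join(parts)}]"


for line in ["cn[01-10]", "cn01", "cn[01,02,07-09]", "cn[01-02]"]:
    new = update_line(line, remove={1, 5})
    if new is not None:
        print(new)

print(update_line("cn[02-04,06-10]", add={5}))   # cn[02-10]
```

To rewrite test.in itself, the same loop could read the file's lines, collect the non-None results and write them back. That part is left out here.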