I have a (tab-delimited) file where the first "word" on each line is the line number. However, some line numbers are missing. I want to insert new lines (with corresponding line number) so that throughout the file, the number printed on the line matches the actual line number. (This is for later consumption into readarray with cut/awk to get the line after the line number.)
I've written this logic in python and tested it works, however I need to run this in an environment that doesn't have python. The actual file is about 10M rows. Is there a way to represent this logic using sed, awk, or even just plain shell / bash?
linenumre = re.compile(r"^\d+")
i = 0
for line in sys.stdin:
i = i + 1
linenum = int(linenumre.findall(line)[0])
while (i < linenum):
print(i)
i = i + 1
print(line, end='')
test file looks like:
1 foo 1
2 bar 1
4 qux 1
6 quux 1
9 2
10 fun 2
expected output like:
1 foo 1
2 bar 1
3
4 qux 1
5
6 quux 1
7
8
9 2
10 fun 2
Like this, with awk
:
awk '{while(++ln!=$1){print ln}}1' input.txt
Explanation, as a multiline script:
{
# Loop as long as the variable ln (line number)
# is not equal to the first column and insert blank
# lines.
# Note: awk will auto-initialize an integer variable
# with 0 upon its first usage
while(++ln!=$1) {
print ln
}
}
1 # this always expands to true, making awk print the input lines
I've written this logic in python and tested it works, however I need to run this in an environment that doesn't have python.
In case you want to have running python code where python is not installed you might freeze your code. The Hitchhiker's Guide to Python has overview of tools which are able to do it. I suggest first trying pyinstaller as it support various operation system and seems easy to use.
This might work for you (GNU join, seq and join):
join -a1 -t' ' <(seq $(sed -n '$s/ .*//p' file)) file 2>/dev/null
Join a file created by the command seq
using the last line number in file
with file
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.