I have CSV files that look like this:
786,1702
787,1722
-,1724
788,1769
789,1766
I would like a bash command that searches the first column for a - and, if one is found, shifts the values in the second column down. The - recurs several times in the first column, so the shift has to start from the top to preserve the order of the second column.
On the - row itself, the second column would be left blank.
Desired output:
786,1702
787,1722
-,
788,1724
789,1769
790,1766
So far I have:
awk -F ',' '$1 ~ /^-$/' filename.csv
which finds the hyphens, but shifting the 2nd column down is tricky...
This assumes that the left column continues with incremental IDs past the last input line, so that the right column can keep shifting until no values are left:
awk 'BEGIN{start=0;FS=","}$1=="-"{stack[stacklen++]=$2;print $1",";next}stacklen-start{stack[stacklen++]=$2;print $1","stack[start];delete stack[start++];next}1;END{for (i=start;i<stacklen;i++){print $1-start+i+1","stack[i]}}' filename.csv
# or
<filename.csv awk -F, -v start=0 '$1=="-"{stack[stacklen++]=$2;print $1",";next}stacklen-start{stack[stacklen++]=$2;print $1","stack[start];delete stack[start++];next}1;END{for (i=start;i<stacklen;i++){print $1-start+i+1","stack[i]}}'
Or, explained:
I am using a shifted stack here (in effect a first-in, first-out queue) to avoid rewriting indexes: start points to the first useful element of the stack, and stacklen to the slot just past the last one. This avoids the costly operation of shifting all array elements whenever we want to remove the first element.
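To show the pointer idea in isolation, here is a minimal sketch. The names queue_demo, queue, head and tail are made up for this illustration and do not appear in the script; head plays the role of start and tail the role of stacklen.

```shell
# Hypothetical demo: a FIFO kept in an awk array with a moving head index.
queue_demo() {
    awk 'BEGIN {
        head = 0; tail = 0                  # head: oldest element, tail: next free slot
        queue[tail++] = "1724"              # enqueue three values at the tail
        queue[tail++] = "1769"
        queue[tail++] = "1766"
        print "dequeued " queue[head]       # read the oldest element ...
        delete queue[head++]                # ... then drop it and advance the pointer
        print "dequeued " queue[head]
        delete queue[head++]
        print "pending " (tail - head)      # number of elements still queued
        print "next is " queue[head]        # remaining elements were never re-indexed
    }'
}

queue_demo
```

Removing an element only deletes one key and increments head, so the cost does not depend on how many values are still waiting.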
# chmod +x shift_when_dash
./shift_when_dash filename.csv
with shift_when_dash being an executable file containing:
#!/usr/bin/awk -f
BEGIN {                              # Everything in this block is executed once, before the file is opened
    start = 0                        # Needed: an uninitialized variable used as an array subscript is "", not 0
    FS = ","                         # Input field separator is a comma
}
$1 == "-" {                          # Special case: the first column is a single dash
    stack[stacklen++] = $2           # Store the second column at the end of our stack
    print $1 ","                     # Print the dash without a second column, as asked by OP
    next                             # Stop processing the current record and go on to the next one
}
stacklen - start {                   # We still have something in our stack
    stack[stacklen++] = $2           # Store the current 2nd column on the stack
    print $1 "," stack[start]        # Print the current ID with the oldest stacked element
    delete stack[start++]            # Free up some memory and increment our pointer
    next
}
1                                    # Print the line as-is, without any modification.
                                     # This applies to lines which were not skipped by the
                                     # 'next' statements above, so in our case all lines
                                     # before the first dash is encountered.
END {
    for (i = start; i < stacklen; i++) {         # For every element remaining in the stack after the last line
        print $1 - start + i + 1 "," stack[i]    # Print a new incremental ID with the stack element
    }
}
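As a quick check, the same rules can be run inline against the sample from the question. This is only a sketch: the wrapper name shift_demo is made up, and it assumes an awk that keeps the last record's $1 available in END (gawk, mawk and busybox awk do) and that the last input line has a numeric ID.

```shell
# Hypothetical wrapper around the same awk rules, reading CSV from stdin.
shift_demo() {
    awk -F, '
    BEGIN { start = 0 }
    $1 == "-" { stack[stacklen++] = $2; print $1 ","; next }
    stacklen - start {
        stack[stacklen++] = $2          # queue the current value
        print $1 "," stack[start]       # emit the oldest queued value instead
        delete stack[start++]
        next
    }
    1                                   # lines before the first dash pass through
    END {
        for (i = start; i < stacklen; i++)
            print $1 - start + i + 1 "," stack[i]
    }'
}

printf '%s\n' 786,1702 787,1722 -,1724 788,1769 789,1766 | shift_demo
```

With the five sample lines this should print the six lines listed under "Desired output" in the question, ending with 790,1766.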
next is an awk statement similar to continue in other languages, with the difference that it skips to the next input line instead of the next loop element. It is useful to emulate a switch-case.
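A minimal sketch of that switch-case pattern, separate from the script above (the function name classify and the labels it prints are made up for this example):

```shell
# Hypothetical demo: each rule ends with next, so at most one rule handles a line.
classify() {
    awk -F, '
    $1 == "-" { print "dash";  next }   # first "case": handled, remaining rules are skipped
    $2 == ""  { print "empty"; next }   # second "case"
              { print "other" }         # "default": reached only if no rule above called next
    '
}

printf '%s\n' '-,1724' '788,' '789,1766' | classify
```

Without the next statements, the first line would also fall through to the last rule and print "other" a second time.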