I have to run a third-party program in the background and capture its output to a file. I'm doing this simply with the_program > output.txt. However, the authors of that program decided to be flashy and show the number of processed lines in real time, using \b characters to erase the previous value. So one of the lines in output.txt ends up as Lines: 1(b)2(b)3(b)4(b)5, where (b) is an unprintable character with ASCII code 08. I want that line to end up as Lines: 5.
I'm aware that I can write the file as-is and post-process it using AWK, but I wonder whether it's possible to process the control characters as the output is written, with some kind of shell option or by piping some commands together, so that the line becomes Lines: 5 without having to run any additional commands after the program is done?
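To make the problem easier to reproduce, here is a minimal sketch. It assumes a stand-in function, fake_program (a hypothetical name, not the real third-party program), that emits the same kind of backspace-rewritten counter, and it dumps the bytes with od so the backspaces become visible:

```sh
# Hypothetical stand-in for the_program: rewrites a counter with backspaces.
fake_program() {
    printf 'Lines: 1\b2\b3\b4\b5\n'
}

# od -c shows each byte; the ASCII 08 characters appear as \b.
fake_program | od -c
```

On a terminal the line looks like Lines: 5, but the redirected output still contains every intermediate digit and every backspace.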
Edit:
Just a clarification: what I wrote here is a simplified version. The actual number of lines processed by the program is in the hundreds of thousands, so that string ends up quite long.
Thanks for your comments! I ended up piping the output of that program to the AWK script I linked in the question. I get a well-formed file in the end.
the_program | ./awk_crush.sh > output.txt
The only downside is that I get the output only once the program itself has finished, even though the output exceeds 5M and should be passed along in smaller chunks. I don't know the exact reason; perhaps the AWK script waits for EOF on stdin. Either way, on a more modern system I would use
stdbuf -oL the_program | ./awk_crush.sh > output.txt
to process the output line by line. I'm stuck on RHEL4 with expired support though, so I can use neither stdbuf nor unbuffer. I'll leave it as-is; it's fine too.
The contents of awk_crush.sh are based on this answer, except that the ^H sequences (which are supposed to be ASCII 08 characters entered via Vim commands) are replaced with the escape sequence \b:
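#!/usr/bin/awk -f
# Remove each "character + backspace" pair until none are left, then print the line.
function crushify(data) {
    while (data ~ /[^\b]\b/) {
        gsub(/[^\b]\b/, "", data)
    }
    print data
}
# Run on every input line.
{ crushify($0) }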
Basically, it replaces each character that precedes a \b, together with the \b itself, with an empty string, and repeats this while such pairs remain in the string, which is just what I needed. It doesn't handle other escape sequences, but if that's necessary, there's a more complete sed solution by Thomas Dickey.
Pipe it to col -b, from util-linux:
the_program | col -b
Or, if the input is a file, not a program:
col -b < input > output
Mentioned in Unix & Linux: Evaluate large file with ^H and ^M characters.
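A quick way to check this against the line from the question is sketched below. It assumes col is installed; fake_program and clean_output are hypothetical names used only for illustration, with fake_program standing in for the real program.

```sh
# Hypothetical stand-in for the_program: rewrites a counter with backspaces.
fake_program() {
    printf 'Lines: 1\b2\b3\b4\b5\n'
}

# col -b drops the backspaces and keeps only the last character
# written to each column position.
clean_output() {
    fake_program | col -b
}
```

Calling clean_output should print Lines: 5, since the digits 1 through 4 were all overwritten in the same column.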