I have a long log file of this form:
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579
I want to find the jobs (script names) that were not submitted, that is, the Processin lines that are not immediately followed by a "Submitted batch job" line.
So far I have tried this:
pcregrep -M "Processin.*\n.*Processin" execScripts2.log | awk 'NR % 2 == 0'
But it does not properly handle the case where several consecutive scripts are not submitted. Surprisingly, it outputs only the SCRIPT1000 and SCRIPT10001 lines. Can you show me a better one-liner?
Ideally the output would be only the lines without 'Submitted' on the next line (or just the script names), that is:
SCRIPT100
SCRIPT10000
SCRIPT10001
Thanks.
This awk command can do the job:
awk -v s='Submitted' '$1 != s{if(p != "") print p; p=$2} $1 == s{p=""}' file
SCRIPT100
SCRIPT10000
SCRIPT10001
Reference: Effective AWK Programming
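For readability, the same idea can be spelled out over several lines. This is a sketch rather than the one-liner verbatim: it additionally assumes that a last Processin line with no Submitted line after it should also be reported (the END block), and it writes the sample log from the question to a temporary file created with mktemp. The variable names log, result and pending are illustrative.

```shell
#!/bin/bash
# Sample log copied from the question, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
Processin SCRIPT10 file..
Submitted batch job 1715572
Processin SCRIPT100 file..
Processin SCRIPT1000 file..
Submitted batch job 1715574
Processin SCRIPT10000 file..
Processin SCRIPT10001 file..
Processin SCRIPT10002 file..
Submitted batch job 1715577
Processin SCRIPT10003 file..
Submitted batch job 1715578
Processin SCRIPT10004 file..
Submitted batch job 1715579
EOF

result=$(awk '
    # A "Submitted" line confirms the pending script, so forget it.
    $1 == "Submitted" { pending = ""; next }
    # Any other line: a script that is still pending was never submitted.
    {
        if (pending != "") print pending
        pending = $2
    }
    # Assumption: a last script with no "Submitted" line also counts.
    END { if (pending != "") print pending }
' "$log")

echo "$result"
rm -f "$log"
```

On the sample log this prints SCRIPT100, SCRIPT10000 and SCRIPT10001, one per line. Keeping only one pending name at a time is what lets it cope with any number of consecutive unsubmitted scripts.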
Without awk, you could write a bash script and run it. If you are less familiar with awk, a bash script may be easier to customize further.
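#!/bin/bash
# Print every "Processin" line that is not followed by a "Submitted" line.
# The log file name is taken from the first argument.
pending=""
while IFS= read -r line
do
    if [[ "$line" == Processin* ]]
    then
        # The previous Processin line is still pending, so it was never submitted.
        if [[ -n "$pending" ]]
        then
            echo "$pending"
        fi
        pending=$line
    elif [[ "$line" == Submitted* ]]
    then
        # The pending script was submitted; nothing to report for it.
        pending=""
    fi
done < "$1"
# A final Processin line with no Submitted line after it is also reported.
if [[ -n "$pending" ]]
then
    echo "$pending"
fi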
Run using ./check.sh filename
The awk answer above works fine, though.
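As one example of the kind of customization a bash version allows, here is a minimal sketch that prints only the script names and reads the log from standard input. The function name unsubmitted is hypothetical, and reporting a trailing script with no Submitted line after it is an assumption, not something the question specifies.

```shell
#!/bin/bash
# unsubmitted: hypothetical helper; reads a log on stdin and prints the
# script name of every "Processin" line not followed by "Submitted".
unsubmitted() {
    local pending="" first name rest
    # read splits each line into its first word, second word, and the rest.
    while read -r first name rest
    do
        if [[ "$first" == Submitted ]]
        then
            pending=""
        elif [[ "$first" == Processin* ]]
        then
            [[ -n "$pending" ]] && echo "$pending"
            pending=$name
        fi
    done
    # Assumption: a trailing script with no "Submitted" line is reported too.
    [[ -n "$pending" ]] && echo "$pending"
    return 0
}

# Small excerpt of the log from the question, fed through a pipe.
result=$(printf '%s\n' \
    'Processin SCRIPT10 file..' \
    'Submitted batch job 1715572' \
    'Processin SCRIPT100 file..' \
    'Processin SCRIPT1000 file..' \
    'Submitted batch job 1715574' | unsubmitted)
echo "$result"
```

For a whole log, the function would be called as unsubmitted < execScripts2.log.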