
Linux grep and sort log files

I looked almost everywhere (there, there, there, there and there) with no luck.

What I have here is a bunch of log files in a directory, where I need to look for a specific ID (myID) and sort the output by date. Here is an example:

in file1.log:

2015-09-26 15:39:50,788 - DEBUG - blabla : {'id' : myID}

in file2.log:

2015-09-26 15:39:51,788 - ERROR - foo : {'id' : myID}

in file3.log:

2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}

Expected output:

2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}
2015-09-26 15:39:50,788 - DEBUG - blabla : {'id' : myID}
2015-09-26 15:39:51,788 - ERROR - foo : {'id' : myID}

What I am doing now (and it works pretty well) is:

grep -hri --color=always "myID" | sort -n

The only problem is that with the -h option of grep, the file names are hidden. I'd like to keep the file names AND keep the sorting. I tried:

grep -ri --color=always "myID" | sort -n -t ":" -k1,1 -k2,2

But it doesn't work. The grep command outputs the name of the file followed by ":", and I'd like to sort the results on everything after that character.

Thanks a lot

Try this:

grep --color=always "myID" file*.log | sort -t : -k2,2 -k3,3n -k4,4n

Output:

file3.log:2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}
file1.log:2015-09-26 15:39:50,788 - DEBUG - blabla : {'id' : myID}
file2.log:2015-09-26 15:39:51,788 - ERROR - foo : {'id' : myID}

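Another solution, a little longer. It should work as long as each file contains exactly one match and the file names, in the order grep lists them, are already in date order, because paste pairs lines purely by position:

grep -l "myID" file* > /tmp/file_names && grep -hri "myID" file* | sort -n > /tmp/grep_result && paste /tmp/file_names /tmp/grep_result | column -s $'\t' -t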

What it does, basically, is first store the file names:

grep -l "myID" file* > /tmp/file_names

Then store the sorted grep results:

grep -hri "myID" file* | sort -n > /tmp/grep_result 

Paste the results column-wise (using a tab separator):

paste /tmp/file_names /tmp/grep_result | column -s $'\t' -t
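To see how the paste step lines things up, here is a minimal sketch. It uses mktemp for scratch files instead of fixed names under /tmp, and the file contents are made-up placeholders rather than real grep output. paste joins line N of the first file with line N of the second, so the pairing is only meaningful when both files list the matches in the same order.

```shell
# Scratch files (mktemp is assumed to be available); contents are placeholders.
names=$(mktemp)
matches=$(mktemp)
printf 'file1.log\nfile2.log\n' > "$names"
printf 'first match\nsecond match\n' > "$matches"

# paste joins the files line by line, separated by a tab by default.
pasted=$(paste "$names" "$matches")
printf '%s\n' "$pasted"

rm -f "$names" "$matches"
```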

The column ordering for sort is 1-based, so -k1 is your file name part. That means that in your attempt you are sorting by file name first, then by the date and hour of your log line. Also, -n selects numeric ordering, which doesn't play nicely with the yyyy-mm-dd hh:mm:ss format: it reads only the leading number of "yyyy-mm-dd hh", i.e. the year.
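A small sketch of both points, using one line from the question's data. cut is used here only to show how the line splits on ":"; it numbers fields from 1 the same way sort does. The two dates in the second half are made-up sample values, and LC_ALL=C is set so the comparison does not depend on the local locale.

```shell
# One line as grep prints it when file names are shown (data from the question).
line="file3.log:2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}"

# cut numbers fields from 1, like sort -t ":" -k does.
field1=$(printf '%s\n' "$line" | cut -d ':' -f 1)   # file name
field2=$(printf '%s\n' "$line" | cut -d ':' -f 2)   # date and hour
field3=$(printf '%s\n' "$line" | cut -d ':' -f 3)   # minutes
printf '1=[%s] 2=[%s] 3=[%s]\n' "$field1" "$field2" "$field3"

# With -n only the leading number (2015) is compared, so both dates tie;
# -s (stable) keeps tied lines in input order, which exposes the tie.
numeric=$(printf '2015-09-26\n2015-01-02\n' | LC_ALL=C sort -n -s)
# Plain byte-wise comparison orders the ISO-style dates correctly.
textual=$(printf '2015-09-26\n2015-01-02\n' | LC_ALL=C sort)
printf 'numeric:\n%s\ntextual:\n%s\n' "$numeric" "$textual"
```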

You can use:

sort -t ":" -k2

Note that I specified column 2 as the start, and left the end blank. The end defaults to the end-of-line.

If you want to sort specific columns, you need to explicitly set the start and end, for example: -k2,2. You can use this to sort out-of-sequence columns, for example -k4,4 -k2,2 will sort by column 4 and use column 2 for tie-breaking.
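As a quick illustration of out-of-sequence keys, the sketch below sorts three made-up colon-separated records (name:group:note:priority, purely hypothetical data) by field 4 and breaks ties on field 2.

```shell
# Hypothetical records: name:group:note:priority
records='a:x:n:2
b:y:n:1
c:w:n:2'

# Primary key: field 4 only; tie-breaker: field 2 only.
by_priority=$(printf '%s\n' "$records" | LC_ALL=C sort -t ":" -k4,4 -k2,2)
printf '%s\n' "$by_priority"
```

Records a and c share priority 2, so field 2 decides between them: w sorts before x, and c comes out ahead of a.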

You could also use -k2,4, which would stop sorting at the colon just before your log details (i.e. it would use "2015-09-26 15:39:48,788 - ERROR - bar" as the key).

Finally, perhaps you want to have your log files in a consistent order if the time is the same:

sort -t ":" -k2,4 -k1,1
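To try this end to end, the sketch below recreates the three sample files from the question in a scratch directory and runs grep piped into the sort command above. The scratch directory, the variable names and the LC_ALL=C setting are assumptions of this example, and --color is left off so the output stays plain text.

```shell
# Scratch directory so the demo does not touch real logs (mktemp -d is assumed available).
demo_dir=$(mktemp -d)

# Recreate the three sample files from the question.
printf '%s\n' "2015-09-26 15:39:50,788 - DEBUG - blabla : {'id' : myID}" > "$demo_dir/file1.log"
printf '%s\n' "2015-09-26 15:39:51,788 - ERROR - foo : {'id' : myID}" > "$demo_dir/file2.log"
printf '%s\n' "2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}" > "$demo_dir/file3.log"

# grep prefixes each match with "name:" when given several files;
# sort orders by fields 2-4 (the timestamp) and breaks ties on field 1 (the file name).
# The subshell keeps the cd from affecting the calling shell.
sorted=$(cd "$demo_dir" && grep "myID" file*.log | LC_ALL=C sort -t ":" -k2,4 -k1,1)
printf '%s\n' "$sorted"

rm -rf "$demo_dir"
```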

Try the Rust-based tool Super Speedy Syslog Searcher

(assuming you have Rust installed)

cargo install super_speedy_syslog_searcher

then

s4 file1.log file2.log file3.log | grep "myID"

"The only problem is that with the -h option of grep, the file names are hidden. I'd like to keep the file names AND keep the sorting."

You could try

$ s4 --color=never -nw file1.log file2.log file3.log | grep "myID"
file3.log:2015-09-26 15:39:48,788 - ERROR - bar : {'id' : myID}
file1.log:2015-09-26 15:39:50,788 - DEBUG - blabla : {'id' : myID}
file2.log:2015-09-26 15:39:51,788 - ERROR - foo : {'id' : myID}
