
grep file with a large array

Hi, I have a few archives of FW logs, and occasionally I'm required to compare them against a list of IP addresses (thousands of them) to get the date and time of every line where an IP address matches. My current script is as follows:

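# read the list of IPs into an array, starting at index 1
mapfile -t -O 1 var < ip.txt
i=1
while true
do
    # stop once the array element is empty
    if [[ -n "${var[i]}" ]]; then
        # decompress and scan the whole log once per IP (the slow part)
        zcat /.../abc.log.gz | grep "${var[i]}"
        ((i++))
    else
        break
    fi
done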

It does work, but it's far too slow. I would think that grepping each line for multiple strings at once would be faster than running zcat over the whole log for every IP. So my question is: is there a way to generate one long grep search string from ip.txt? Or is there a better way to do this?

Sure. For one thing, piping cat into grep is usually slightly inefficient, so I'd recommend using zgrep here instead. You could generate a regex as follows:

IP=`paste -s -d ' ' ip.txt`
zgrep -E "(${IP// /|})" /.../abc.log.gz

The first line loads the IP addresses into IP as a single space-separated line. The second line builds a regex that looks something like (127.0.0.1|8.8.8.8) by replacing the spaces with | characters. It then uses zgrep to search through abc.log.gz once, with -E selecting extended regex syntax.
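To see what the substitution produces, here is a minimal sketch that only builds and prints the pattern. The temporary directory and the two sample addresses are made up for illustration, and the `${IP// /|}` expansion assumes bash.

```shell
# Hypothetical sample list standing in for ip.txt
tmpdir=$(mktemp -d)
printf '%s\n' 127.0.0.1 8.8.8.8 > "$tmpdir/ip.txt"

# Join all lines into one space-separated line
IP=$(paste -s -d ' ' "$tmpdir/ip.txt")

# Replace every space with '|' and wrap the result in parentheses
pattern="(${IP// /|})"

echo "$pattern"
rm -rf "$tmpdir"
```

With thousands of addresses the pattern becomes one very long command-line argument, which is another reason to prefer reading the patterns from a file.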

However, I recommend that you do not do this. Firstly, you should escape strings that are put into a regex. Even if you know that ip.txt really contains only IP addresses (e.g. it is not controlled by a malicious user), you should still escape the periods, because an unescaped period matches any character. But rather than building up a search string and then escaping it, just use grep's -F (fixed strings) and -f (read patterns from a file) options. Then you get this simple and fast one-liner:

zgrep -F -f ip.txt /.../abc.log.gz
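As a rough illustration of the difference, the sketch below runs both approaches against a tiny compressed log. The file names, addresses, and log format are invented for the example; only the first and third log lines really contain a listed IP.

```shell
tmpdir=$(mktemp -d)

# Hypothetical IP list and log, standing in for ip.txt and abc.log.gz
printf '%s\n' 1.2.3.4 192.168.1.20 > "$tmpdir/ip.txt"
printf '%s\n' \
  '2024-01-01 10:00:00 DROP src=1.2.3.4' \
  '2024-01-01 10:00:01 ALLOW src=1.213.45.7' \
  '2024-01-01 10:00:02 DROP src=192.168.1.20' \
  | gzip > "$tmpdir/abc.log.gz"

# Unescaped regex: each '.' matches any character, so "1.213.4" also matches
regex_count=$(zgrep -c -E '(1.2.3.4|192.168.1.20)' "$tmpdir/abc.log.gz")

# Fixed strings read from the file: '.' is a literal period
fixed_matches=$(zgrep -F -f "$tmpdir/ip.txt" "$tmpdir/abc.log.gz")
fixed_count=$(printf '%s\n' "$fixed_matches" | wc -l)

echo "regex matched $regex_count lines, fixed strings matched $fixed_count"
echo "$fixed_matches"
rm -rf "$tmpdir"
```

Note that -F still matches substrings, so 1.2.3.4 would also hit a line containing 11.2.3.40. If that matters for your logs, adding grep's -w option is one possible way to narrow the match to whole words.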
