
Shell script: grep regex in array

I have a script that prints how many times an IP has failed to connect and the date of that IP's last attempt. It looks like this:

#!/bin/bash
searchString=$1
file=$2

countLines()
{
    declare -A ipCount
    declare -A lastDate

    cnt=0

    while read line;
    do
        ((cnt+=1))

        ipaddr=$( echo "$line" | grep -o -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' )

        lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z]\ [0-3][0-9]\ [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]' )

        ((ipCount[$ipaddr]+=1))
    done

    printf "%-18s %-10s %s\n" "IP" "Count" "lastDate"
    echo "-------------------------------------------------"

    for ip in ${!ipCount[*]}
    do
        printf "%-18s %-10s %s\n" "$ip" "${ipCount[$ip]}" "${lastDate[$ip]}"
    done | sort

    echo "--------------------------------------------------"
    echo "Count: $cnt"
}

grep "$searchString" "$file" | countLines

The file I run this on looks like this, but is bigger:

May 16 06:41:38 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:40 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:43 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:46 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:48 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2

And this is what I get:

IP                 Tries      LastDate
-----------------------------------------------
37.141.229.226     205        
137.241.229.226    705        May 16 07:08:24
-----------------------------------------------
Count: 910

As you can see, I only get 'lastDate' for one of the IPs. The same thing happens with the big log file. I guess it's something really simple, but I can't figure out why. Can you help me?

I run the script like this: bash scriptname.sh "Failed password for root" logFile

The issue appears to be in the regex for lastDate. Replace:

lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z]\ [0-3][0-9]\ [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]' )

with:

lastDate[$ipaddr]=$( echo "$line" | grep -o -E '[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]' )

The key part was the match for hour:minute:second. The original had [0-2][0-9]\:[0-2][2-9]\:[0-2][2-9]. This limits both the minutes and the seconds to a tens digit of 0-2 and a ones digit of 2-9, so only values such as 02-09, 12-19, and 22-29 match. A timestamp like 06:41:38 never matches, and lastDate is overwritten with an empty string for that line. The more general replacement is [0-2][0-9]:[0-5][0-9]:[0-5][0-9].
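To see the difference between the two time patterns, here is a small sketch that runs both against one line from the sample log in the question. The variable names old_re, new_re, old_match, and new_match are made up for this illustration, and the original pattern is written without the backslashes.

```shell
#!/bin/bash
# One line copied from the sample log in the question.
line='May 16 06:41:43 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2'

# Original pattern: minutes and seconds are limited to [0-2][2-9].
old_re='[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-2][2-9]:[0-2][2-9]'
# Corrected pattern: minutes and seconds cover 00-59.
new_re='[a-zA-Z][a-zA-Z][a-zA-Z] [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]'

old_match=$(printf '%s\n' "$line" | grep -o -E "$old_re")
new_match=$(printf '%s\n' "$line" | grep -o -E "$new_re")

# The brackets make an empty result visible.
printf 'old: [%s]\n' "$old_match"
printf 'new: [%s]\n' "$new_match"
```

With this input the original pattern returns nothing, because minute 41 has a tens digit outside 0-2, while the corrected pattern returns May 16 06:41:43.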

Also, spaces and colons are not special characters in grep regular expressions, so they do not need to be escaped.
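Putting it together, the following is a rough sketch of the loop with the corrected, unescaped pattern, run against the five sample lines from the question. It is not the asker's exact script: the function name summarize, the shortened IP pattern in ip_re, and the check that keeps the previous date when a line has no timestamp are all assumptions added for illustration.

```shell
#!/bin/bash
# summarize (hypothetical name) reads log lines on stdin and prints
# one row per IP: address, number of lines, last timestamp seen.
summarize() {
    declare -A ipCount lastDate
    local line ipaddr stamp ip
    # Simplified IP pattern (assumption): does not range-check each octet.
    local ip_re='([0-9]{1,3}\.){3}[0-9]{1,3}'
    # Corrected timestamp pattern, with no escaped spaces or colons.
    local date_re='[a-zA-Z]{3} [0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]'

    while IFS= read -r line; do
        ipaddr=$(printf '%s\n' "$line" | grep -o -E "$ip_re" | head -n 1)
        [ -n "$ipaddr" ] || continue
        stamp=$(printf '%s\n' "$line" | grep -o -E "$date_re" | head -n 1)
        # Only overwrite the stored date when a timestamp was found.
        [ -n "$stamp" ] && lastDate[$ipaddr]=$stamp
        ((ipCount[$ipaddr]+=1))
    done

    for ip in "${!ipCount[@]}"; do
        printf '%-18s %-10s %s\n' "$ip" "${ipCount[$ip]}" "${lastDate[$ip]}"
    done | sort
}

# The five sample lines from the question.
sample='May 16 06:41:38 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:40 aprs sshd[25951]: Failed password for root from 137.241.229.226 port 2008 ssh2
May 16 06:41:43 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:46 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2
May 16 06:41:48 aprs sshd[25951]: Failed password for root from 37.141.229.226 port 2008 ssh2'

result=$(printf '%s\n' "$sample" | summarize)
printf '%s\n' "$result"
```

On the sample data, both IPs should now show a last date: 137.241.229.226 with 2 lines ending at May 16 06:41:40, and 37.141.229.226 with 3 lines ending at May 16 06:41:48.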
