
How to extract the month, day of month and IP address from the line using sed or awk?

I have extracted the month/day and the IP addresses of hosts that are continuously brute-forcing my IMAP server:

Nov1 unknown[186.216.99.239]:
Nov1 unknown[62.249.196.214]:
Nov1 unknown[110.145.123.120]:
Nov1 fixed-187-190-251-149.totalplay.net[187.190.251.149]:
Nov1 pd9568164.dip0.t-ipconnect.de[217.86.129.100]:
Nov1 unknown[103.227.88.130]:

I want the output to be like below:

Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130

I achieved this result using a combination of sed, awk and cut with the code below, but I would like to learn whether there are better ways.

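while read -r line
do
    # The first whitespace-separated field is the month/day token, e.g. "Nov1"
    monthday=$(echo "$line" | awk '{ print $1 }')
    # ip=$(echo "$line" | awk -F'[][]' '{ print $2 }')
    # Keep what follows the first "[", then what precedes the first "]"
    ip=$(echo "$line" | cut -d'[' -f2 | cut -d']' -f1)
    echo "${monthday} ${ip}"
done < badIpList.txt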

With awk: set the field separator to any of space, [ or ], then print the first and third fields:

$ awk -F "[][ ]" '{ print $1, $3 }' infile
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130

Notice that the field separator is a regular expression, namely the bracket expression [][ ]. From the gawk manual:

To include one of the characters \ , ] , - , or ^ in a bracket expression, put a \ in front of it.

So the expression would have to be

[\[\] ]

but because regular expressions stored in strings ("dynamic/computed regexps") are scanned twice, we have to escape the backslash:

-F '[\\[\\] ]'

or, to use double quotes as I did, I'd have to escape the backslash once more for the shell:

-F "[\\\[\\\] ]"

which clearly isn't all too readable. Thankfully, there is a loophole:

Additionally, if you place ] right after the opening [, the closing bracket is treated as one of the characters to be matched.

so we get away with

-F "[][ ]"

even within double quotes. There is no real reason to use double quotes here, by the way.
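To make the split visible, here is a minimal sketch that feeds one line from the question through the same separator and prints the field count followed by every field. The variable names line and fields are only for illustration.

```shell
# One sample line taken from the question's input.
line='Nov1 unknown[186.216.99.239]:'

# Build "NF|field1|field2|..." so each field boundary is easy to see.
fields=$(printf '%s\n' "$line" |
    awk -F '[][ ]' '{ out = NF; for (i = 1; i <= NF; i++) out = out "|" $i; print out }')

printf '%s\n' "$fields"
```

The line splits into four fields, with the IP address in $3 and the trailing colon in $4, which is why the command above prints $1 and $3.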

Try this:

sed -E 's/\s.*\[(.*)\]:/ \1/' file

No loops needed.
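Note that \s is a GNU extension. As a hedged sketch, not something from the answer itself, the same idea can be written with the POSIX class [[:space:]] and a capture group that stops at the first ], assuming a sed that accepts -E. The variable names line and result are only for illustration.

```shell
# One sample line taken from the question's input.
line='Nov1 unknown[186.216.99.239]:'

# From the first whitespace up to "[", capture everything that is not "]",
# then replace the whole match with a space followed by the captured text.
result=$(printf '%s\n' "$line" | sed -E 's/[[:space:]].*\[([^]]*)\]:.*/ \1/')

printf '%s\n' "$result"
```

Using [^]]* instead of .* inside the group keeps the capture from running past the closing bracket if a line ever contains more text after it.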

awk solution:

awk -F'[[:space:]\\[\\]]' '{print $1,$3}' file
  • -F'[[:space:]\\[\\]]' - complex field separator: either whitespace ([:space:]), [ or ]. The line Nov1 unknown[186.216.99.239]: , for example, is therefore split into the fields 1) Nov1, 2) unknown, 3) 186.216.99.239 and 4) :

The output:

Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130

Simple with this sed:

$ sed -r 's|^([^ ]*)[^[]*\[([^]]*)\].*|\1 \2|' badIpList.txt
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130

Logic: Print the first word and the contents of the square bracket.
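The -r option is the GNU spelling for extended regular expressions. As a sketch of the same logic in basic regular expressions, which should not need that option, the groups become \( \) and the closing bracket is written as a plain ]. This variant and the variable name result are my own illustration, not part of the answer above.

```shell
# One sample line taken from the question's input.
result=$(printf '%s\n' 'Nov1 pd9568164.dip0.t-ipconnect.de[217.86.129.100]:' |
    sed 's|^\([^ ]*\)[^[]*\[\([^]]*\)].*|\1 \2|')

# \1 is the first word, \2 is the text between the square brackets.
printf '%s\n' "$result"
```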

