How to extract the month, day of month and IP address from the line using sed or awk?
I have extracted the day/month and the IP addresses that keep brute-forcing my IMAP server:
Nov1 unknown[186.216.99.239]:
Nov1 unknown[62.249.196.214]:
Nov1 unknown[110.145.123.120]:
Nov1 fixed-187-190-251-149.totalplay.net[187.190.251.149]:
Nov1 pd9568164.dip0.t-ipconnect.de[217.86.129.100]:
Nov1 unknown[103.227.88.130]:
I want the output to look like this:
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130
I achieved this result with a combination of sed, awk and cut, using the code below, but I would like to know whether there are better ways:
while read -r line
do
monthday=$(echo $line | awk '{ print $1 }')
# ip=$(echo $line | awk -F'[\\\[\\\]]' { print $2 } )
ip=$(echo $line| cut -d[ -f2| cut -d] -f1 )
echo "${monthday} ${ip}"
done < badIpList.txt
With awk: set the field separator to any of space, [ or ], then print the first and third fields:
$ awk -F "[][ ]" '{ print $1, $3 }' infile
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130
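To see why the IP address ends up in the third field, it can help to print every field awk produces for a single line. This is only a small sketch, assuming a POSIX-style awk; the function name show_fields is made up for this illustration.

```shell
# Print each field awk sees when space, [ and ] all act as separators.
show_fields() {
    printf '%s\n' "$1" |
        awk -F '[][ ]' '{ for (i = 1; i <= NF; i++) printf "%d=<%s>\n", i, $i }'
}

show_fields 'Nov1 unknown[186.216.99.239]:'
```

For the sample line this lists four fields: Nov1, unknown, 186.216.99.239 and the trailing colon, which is why $1 and $3 are the ones to print.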
Notice that the field separator is a regular expression, namely the bracket expression [][ ]. From the gawk manual:
To include one of the characters '\', ']', '-', or '^' in a bracket expression, put a '\' in front of it.
So the expression would have to be
[\[\] ]
but because regular expressions stored in strings ("dynamic/computed regexps") are scanned twice, we have to escape the backslash:
-F '[\\[\\] ]'
or, to use double quotes as I did, I'd have to escape both the backslash and the backslash escaping it:
-F "[\\\[\\\] ]"
which clearly isn't very readable. Thankfully, there is a loophole:
Additionally, if you place ']' right after the opening '[', the closing bracket is treated as one of the characters to be matched.
so we get away with
-F "[][ ]"
even within double quotes. There is no real reason to use double quotes here, by the way.
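The shell's share of the quoting can be made visible on its own. The sketch below only prints what the shell hands to awk as the -F argument for each spelling; awk's own escape processing of that string happens afterwards and is not shown. The variable names are placeholders chosen for this example.

```shell
# What awk receives as the -F argument once the shell has removed the quotes.
single=$(printf '%s' '[\\[\\] ]')      # single quotes: passed through untouched
double=$(printf '%s' "[\\\[\\\] ]")    # double quotes: \\ collapses to \, \[ stays \[
simple=$(printf '%s' "[][ ]")          # no backslashes, so nothing to collapse

printf 'single: %s\ndouble: %s\nsimple: %s\n' "$single" "$double" "$simple"
```

The first two lines of output are identical ([\\[\\] ]), so both spellings give awk the same string, while the bracket-first form arrives exactly as typed in either kind of quotes.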
Try this:
sed -E 's/\s.*\[(.*)\]:/ \1/' file
No loops needed.
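The \s shorthand is not part of POSIX regular expressions, so some sed implementations may not accept it. A possible variant, assuming a sed that understands -E (GNU and BSD sed both do), swaps in the [[:space:]] class; the function name extract_ip is hypothetical.

```shell
# Same substitution, with [[:space:]] in place of the \s shorthand.
extract_ip() {
    sed -E 's/[[:space:]].*\[(.*)\]:/ \1/'
}

printf '%s\n' 'Nov1 unknown[186.216.99.239]:' \
              'Nov1 pd9568164.dip0.t-ipconnect.de[217.86.129.100]:' | extract_ip
```

Both .* parts are greedy, which is harmless here because each line contains exactly one pair of square brackets.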
awk solution:
awk -F'[[:space:]\\[\\]]' '{print $1,$3}' file
-F'[[:space:]\\[\\]]' - complex field separator: either whitespace ([:space:]), [ or ].
Thereby a line such as Nov1 unknown[186.216.99.239]: will be divided into fields: 1) Nov1, 2) unknown, 3) 186.216.99.239 and 4) :
The output:
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130
Simple with this sed:
$ sed -r 's|^([^ ]*)[^[]*\[([^]]*)\].*|\1 \2|' badIpList.txt
Nov1 186.216.99.239
Nov1 62.249.196.214
Nov1 110.145.123.120
Nov1 187.190.251.149
Nov1 217.86.129.100
Nov1 103.227.88.130
Logic: print the first word and the contents of the square brackets.
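The pattern can be read piece by piece. The sketch below annotates each part and wraps the command in a function whose name, first_word_and_bracket, is invented for this example; it uses -E, which GNU sed treats the same as -r and which BSD sed also accepts.

```shell
# ^([^ ]*)     group 1: everything up to the first space (Nov1)
# [^[]*        skip ahead to the opening bracket
# \[([^]]*)\]  group 2: whatever sits between [ and ]
# .*           swallow the rest of the line
first_word_and_bracket() {
    sed -E 's|^([^ ]*)[^[]*\[([^]]*)\].*|\1 \2|'
}

printf '%s\n' 'Nov1 fixed-187-190-251-149.totalplay.net[187.190.251.149]:' |
    first_word_and_bracket
```

For this input the command prints Nov1 187.190.251.149. Because [^[]* and [^]]* cannot run past a bracket, the match does not depend on greedy backtracking.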