[英]How do I remove all the text after a certain character on a line, and do the same on each line?
请道歉我的头衔,这有点令人困惑。
我有一个看起来像这样的日志文件:
201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
它扩展了几乎20万行。
我需要获取所有这些IP,以便可以将其阻止在防火墙上。
为此,我认为我可以删除每行- -
之后的所有内容,然后删除所有重复的行。
如何使用linux工具(awk,sed,grep等)来做到这一点?
这是使用awk
的另一种方式:
awk '!a[$1]++ { print $1 }' file
您可以使用cut -d' ' -f1 logfile
。 然后,您可能希望通过sort和uniq将其通过管道传输,因为那里似乎有一些重复项。
我想我会用类似的东西:
sed "s/ .*$//" <logfile.txt | sort -u
另一种可能性是:
gawk " { address[$1]=1 } END { for (a in address) print a;}" < input
用这个逗号
$ awk '{print $1}' < test | uniq -d
除了awk
sed
cut
,您还可以使用grep
grep -o '^[^ ]*' file | sort -u
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.