How do I remove all the text after a certain character on a line, and do the same on each line?

Apologies for the title; it is a bit confusing.

I have a log file that looks like this:

201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"

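It runs to almost 200,000 lines.

I need to extract all of these IPs so that I can block them at the firewall.

To do that, I figured I could delete everything after the - - on each line and then remove all the duplicate lines.

How can I do this with Linux tools (awk, sed, grep, etc.)?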

Here is another way, using awk:

awk '!a[$1]++ { print $1 }' file
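To see why that one-liner prints each address only once, here is a small sketch. The function name first_seen_ips is made up for illustration, the array is renamed to seen for readability, and the sample lines are shortened stand-ins for the real log.

```shell
# Hypothetical helper name; same awk idiom as above, array renamed to `seen`.
first_seen_ips() {
    # seen[$1]++ yields 0 the first time an address shows up, so
    # !seen[$1]++ is true only on that first occurrence.
    awk '!seen[$1]++ { print $1 }' "$@"
}

# Shortened sample input; the repeated addresses are not adjacent.
printf '%s\n' \
    '201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384' \
    '117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568' \
    '201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384' |
    first_seen_ips
# prints:
# 201.94.198.242
# 117.242.220.51
```

No sorting is needed here, and the addresses come out in the order they first appear in the log.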

You can use cut -d' ' -f1 logfile. You will then probably want to pipe that through sort and uniq, since there seem to be some duplicates in there.
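Spelled out, that pipeline might look like the sketch below. The function name sorted_unique_ips is made up for illustration, and the sample lines are shortened stand-ins for the real log. The sort has to come before uniq because uniq only collapses adjacent identical lines; sort -u would do both steps in one.

```shell
# Hypothetical helper name, for illustration only.
sorted_unique_ips() {
    # Field 1 (space-delimited) is the client address.
    # uniq only merges adjacent duplicates, so sort first.
    cut -d' ' -f1 "$@" | sort | uniq
}

# Shortened sample input with a non-adjacent repeat.
printf '%s\n' \
    '201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384' \
    '117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568' \
    '201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384' |
    sorted_unique_ips
# prints:
# 117.242.220.51
# 201.94.198.242
```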

I think I would use something like:

sed "s/ .*$//" <logfile.txt | sort -u
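For comparison, a variant that follows the question's wording literally and cuts each line at the " - -" marker could look like this. It is only a sketch: the function name strip_after_marker is made up, and it assumes the two fields after the address are always "-" as in the sample. A line where either field is filled in would pass through unchanged, which is why cutting at the first space is the more robust choice.

```shell
# Hypothetical helper name, for illustration only.
strip_after_marker() {
    # Delete from the first " - - " to the end of the line, then
    # sort and drop duplicate addresses in one step.
    sed 's/ - - .*$//' "$@" | sort -u
}

# Shortened sample input.
printf '%s\n' \
    '177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.1" 502 568' \
    '177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.0" 404 240' |
    strip_after_marker
# prints:
# 177.35.108.173
```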

Another possibility is:

gawk '{ address[$1] = 1 } END { for (a in address) print a }' < input
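One thing to keep in mind with this approach: awk does not guarantee any particular order for "for (a in address)", so the addresses may come out in a different order from run to run or between awk implementations. If a stable list matters, a sketch like the following pipes the result through sort. The function name unique_ips_sorted is made up, and plain awk is used here instead of gawk on the assumption that no gawk-only features are needed.

```shell
# Hypothetical helper name, for illustration only.
unique_ips_sorted() {
    # Collect each address as an array key (keys are unique by nature),
    # print the keys at the end, then sort for a stable order.
    awk '{ address[$1] = 1 } END { for (a in address) print a }' "$@" | sort
}

# Shortened sample input.
printf '%s\n' \
    '201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.1" 502 568' \
    '200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.1" 502 568' \
    '201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.0" 404 240' |
    unique_ips_sorted
# prints:
# 200.150.84.114
# 201.34.32.45
```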

Use this command:

$ awk '{print $1}' < test | sort -u

Besides awk, sed, and cut, you can also use grep:

grep -o '^[^ ]*' file | sort -u

