
How do I remove all the text after a certain character on a line, and do the same on each line?

Please excuse my title; it is kind of confusing.

I have a log file that looks like this:

201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
177.35.108.173 - - [28/Dec/2013:01:59:24 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
186.236.21.100 - - [28/Dec/2013:01:59:38 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.34.32.45 - - [28/Dec/2013:01:59:44 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.150.84.114 - - [28/Dec/2013:01:59:47 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.47.62.216 - - [28/Dec/2013:01:59:57 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
179.192.251.45 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.40.147.43 - - [28/Dec/2013:02:00:23 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
115.132.84.106 - - [28/Dec/2013:02:00:30 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
187.15.138.179 - - [28/Dec/2013:02:01:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
177.158.211.34 - - [28/Dec/2013:02:01:04 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
201.26.91.150 - - [28/Dec/2013:02:01:25 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.70.11.207 - - [28/Dec/2013:02:01:36 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
200.18.43.2 - - [28/Dec/2013:02:01:40 -0200] "GET /.peide/ HTTP/1.0" 404 384 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
189.188.213.172 - - [28/Dec/2013:02:01:43 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (Windows; MSIE 6.0; Windows NT 5.2)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.1" 502 568 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"
203.101.73.51 - - [28/Dec/2013:02:02:00 -0200] "GET /.peide/ HTTP/1.0" 404 240 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98; Win 9x 4.90)"

It runs to roughly 200 thousand lines.

I need to get all those IPs so I can block them on my firewall.

To do that, I think I could delete everything after - - on each line, and then remove all the duplicate lines.

How can I do that using Linux tools (awk, sed, grep, etc.)?

Here is another way using awk:

awk '!a[$1]++ { print $1 }' file

You could use something like cut -d' ' -f1 logfile to get everything up to the first space. You may want to then pipe that through sort and uniq because you seem to have some duplicates there.
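
As a rough sketch, the cut, sort, and uniq pipeline described above might look like this. The temporary file and its three shortened log lines are made up for illustration, and the variable names logfile and unique_ips are hypothetical; point cut at your real log instead.

```shell
# Build a tiny sample log (hypothetical contents, shortened from the question's format).
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
201.94.198.242 - - [28/Dec/2013:01:59:11 -0200] "GET /.peide/ HTTP/1.0" 404 384
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.1" 502 568
117.242.220.51 - - [28/Dec/2013:01:59:19 -0200] "GET /.peide/ HTTP/1.0" 404 240
EOF

# Field 1 (everything before the first space) is the client IP.
# sort puts identical IPs next to each other so that uniq can collapse them.
unique_ips=$(cut -d' ' -f1 "$logfile" | sort | uniq)
printf '%s\n' "$unique_ips"

# Clean up the temporary sample file.
rm -f "$logfile"
```

The sort step matters because uniq only collapses adjacent duplicates; sort -u would do both steps in a single command.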

I think I'd use something like:

sed 's/ .*$//' <logfile.txt | sort -u

Another possibility would be something like:

gawk '{ address[$1] = 1 } END { for (a in address) print a }' < input

Use this command:

$ awk '{print $1}' < test | sort -u

Besides awk, sed, and cut, you can also use grep:

grep -o '^[^ ]*' file  | sort -u
