How can I extract the GIF files requested by a GET request with HTTP status 200 from a log?

I have the following log file and need to extract the GIF files that were requested with a GET request and returned status 200.

unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0
burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 304 0
burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /shuttle/countdown/video/livevideo.gif HTTP/1.0" 200 0
d104.aa.net - - [01/Jul/1995:00:00:13 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/KSC-logosmall.gif HTTP/1.0" 200 1204
d104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310
d104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786

For the example above, the output must be:

livevideo.gif
count.gif
NASA-logosmall.gif
KSC-logosmall.gif

As you can see, the output contains no duplicates. For example, row 6 has a count.gif record requested with GET and status 200, and so does row 9, but the output lists count.gif only once.

Try reading the file one line at a time to extract the file name from each line. A regular expression would be useful here.

Store the gif file names in a Set to automatically eliminate the duplicates.
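
The two suggestions above could be combined roughly as follows. This is a minimal Python sketch, not a definitive solution: the function name extract_gif_names and the exact regular expression are assumptions made for illustration. The pattern tolerates a missing space after GET, and a list is kept next to the set so the names come out in order of first appearance, as in the expected output.

```python
import re

# Assumed pattern: a quoted GET request whose path ends in ".gif",
# followed by status 200. "\s*" after GET tolerates a missing space.
GIF_OK = re.compile(r'"GET\s*\S*?([^/\s"]+\.gif)\s+HTTP/[\d.]+"\s+200\b')


def extract_gif_names(lines):
    """Return unique GIF names from GET requests with status 200, in first-seen order."""
    seen = set()   # used only to detect duplicates
    names = []     # preserves the order of first appearance
    for line in lines:
        match = GIF_OK.search(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            names.append(match.group(1))
    return names
```

Any iterable of lines works, so an open file object can be passed directly and is read one line at a time. If the output order does not matter, the set alone is enough.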
