
How can I extract host names and the number of requests per host from a .txt file in Java 7?

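How can I use a regular expression and a HashMap to find each host name and the number of requests from that host in the following input text file?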

Input file:

     unicomp6.unicompt.net - - [01/JUL/1995:00:00:06 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985     
     burger.letters.com - - [01/JUL/1995:00:00:12 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 0
     d104.aa.net - - [01/JUL/1995:00:00:13 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985 
     unicomp6.unicompt.net - - [01/JUL/1995:00:00:14 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 40310
     d104.aa.net - - [01/JUL/1995:00:00:15 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 40310 
     d104.aa.net - - [01/JUL/1995:00:00:15 - 0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786
     unicomp6.unicompt.net - - [01/JUL/1995:00:00:14 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 786 
     unicomp6.unicompt.net - - [01/JUL/1995:00:00:14 - 0400] "GET /shuttle/countdown/ HTTP/1.0" 200 1204 

Desired output:

   unicomp6.unicompt.net 4
   burger.letters.com 1
   d104.aa.net 3

Why not use a regular expression?

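import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HostNames {
    public static void main(String[] args) {
        // Matches three dot-separated runs of word characters, e.g. "d104.aa.net".
        // (Pattern.DOTALL was dropped: the pattern has no unescaped '.', so the flag did nothing.)
        Pattern pattern = Pattern.compile("\\w+\\.\\w+\\.\\w+");
        String input = "unicomp6.unicompt.net - - [01/JUL/1995:00:00:06 - 0400]"+
                        "burger.letters.com - - [01/JUL/1995:00:00:12 - 0400] .... etc";

        Matcher m = pattern.matcher(input);
        // Print every host name found; counting is left to the caller.
        while (m.find()) {
            String s = m.group();
            System.out.println(s);
        }
    }
}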
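
The snippet above prints each match but does not count them, and the question asks for a regular expression plus a HashMap in Java 7. A minimal sketch of that combination follows. It assumes the host name is the first whitespace-delimited token on each line. The class name HostRequestCounter, the method countHosts, and the default path input.txt are made up for illustration. A LinkedHashMap is used in place of a plain HashMap only so that hosts print in the order they first appear.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HostRequestCounter {

    // Assumption: the host is the first run of non-whitespace characters on the line.
    private static final Pattern HOST = Pattern.compile("^\\s*(\\S+)");

    // Counts how many lines start with each host (Java 7 style, no lambdas).
    static Map<String, Integer> countHosts(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String line : lines) {
            Matcher m = HOST.matcher(line);
            if (!m.find()) {
                continue; // blank line: nothing to count
            }
            String host = m.group(1);
            Integer previous = counts.get(host);
            counts.put(host, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real one as the first argument.
        String path = args.length > 0 ? args[0] : "input.txt";
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : countHosts(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Files.readAllLines loads the whole file into memory, which is fine for a small log. For a large one, reading line by line with a BufferedReader and feeding the same loop would be the safer choice.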

You can try this (for Java 8 and later):

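import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class HostCount {
    public static void main(String[] params) throws IOException {
        // Files.lines reads lazily; try-with-resources closes the file.
        try (Stream<String> lines = Files.lines(Paths.get("src/main/resources/input.txt"))) {
            Map<String, Integer> occurrences = new HashMap<>();
            lines.map(String::trim)                  // a leading space would make split(" ")[0] empty
                 .filter(line -> !line.isEmpty())    // skip blank lines
                 .map(line -> line.split("\\s+"))
                 .forEach(splitted -> occurrences.merge(splitted[0], 1, Integer::sum));

            // Prints the map as {host=count, ...}; HashMap iteration order is unspecified.
            System.out.print(occurrences);
        }
    }
}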

Make sure the path to your .txt file is correct.
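
As a variation on the Java 8 answer, the counting can be done by the collector itself instead of mutating a map inside forEach. The sketch below is one possible way to do that, not the answerer's code. The class name HostCountStreams and the method countHosts are hypothetical, the host is again assumed to be the first whitespace-delimited token, and the file path is the same one used in the answer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class HostCountStreams {

    // Groups lines by their first token and counts each group.
    // LinkedHashMap keeps hosts in first-seen order.
    static Map<String, Long> countHosts(Stream<String> lines) {
        return lines
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> line.split("\\s+")[0])
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the answer above; adjust it to where your file lives.
        try (Stream<String> lines = Files.lines(Paths.get("src/main/resources/input.txt"))) {
            countHosts(lines).forEach((host, n) -> System.out.println(host + " " + n));
        }
    }
}
```

Note that Collectors.counting() yields Long values, so the map type differs from the Map<String, Integer> in the answer.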

