
Java StringTokenizer for special characters

I don't want to tokenize between special characters such as " ", { } and [ ]. How can I do this?

String: "192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] 'GET /cgi-bin/try/ HTTP/1.0' 200 3395"

I want this output:

192.168.2.20 
28/Jul/2006:10:27:10 -0300
GET /cgi-bin/try/ HTTP/1.0
200 3395

My code:

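import java.util.StringTokenizer;

public class Main {
    public static void main(String[] args) {
        String rawData = "192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] 'GET /cgi-bin/try/ HTTP/1.0' 200 3395";
        int i = 0;
        String[] s1 = new String[100];
        String delim = " ";
        // returnDelims = true, so each space is also returned as a token
        StringTokenizer tok = new StringTokenizer(rawData, delim, true);

        boolean expectDelim = false;
        while (tok.hasMoreTokens()) {
            String token = tok.nextToken();
            if (delim.equals(token)) {
                if (expectDelim) {
                    // Separator that follows a field: skip it
                    expectDelim = false;
                    continue;
                } else {
                    // Two delimiters in a row: record an empty field
                    token = null;
                }
            }
            s1[i] = token;
            System.out.println(s1[i]);
            i += 1;
            expectDelim = true;
        }
    }
}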

Output:

192.168.2.20
-
-
[28/Jul/2006:10:27:10
-0300]
'GET
/cgi-bin/try/
HTTP/1.0'
200
3395

I can do this for this particular log line, but I want to use my code for all Apache logs. How can I do this?

You can use a regex like this:

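import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Groups: 1 = IP address, 2 = timestamp inside [...], 3 = request inside '...', 4 = the rest
        Pattern p = Pattern.compile("(\\d+\\.\\d+\\.\\d+\\.\\d+)\\s.*\\s.*\\s\\[(.*)\\]\\s\\'(.*)\\'\\s(.*)");
        Matcher m = p.matcher("192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] 'GET /cgi-bin/try/ HTTP/1.0' 200 3395");

        // group() throws IllegalStateException unless the match succeeded
        if (m.matches()) {
            System.out.println(m.group(1));
            System.out.println(m.group(2));
            System.out.println(m.group(3));
            System.out.println(m.group(4));
        }
    }
}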
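
If the goal is a general tokenizer rather than one fixed line layout, a possible variation on the regex idea is to match one token at a time with `find()`: a bracketed or quoted span counts as a single token, and anything else is split on whitespace. This is only a sketch. The class name `LogTokenizer` and the method `tokenize` are made up for illustration, and the double-quote alternative is an assumption added because real Apache logs usually quote the request with `"` instead of `'`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class LogTokenizer {
    // Alternatives are tried left to right at each position:
    //   [ ... ] -> group 1, ' ... ' -> group 2, " ... " -> group 3,
    //   any other run of non-whitespace characters -> group 4
    private static final Pattern TOKEN = Pattern.compile(
            "\\[([^\\]]*)\\]|'([^']*)'|\"([^\"]*)\"|(\\S+)");

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // Exactly one group is non-null per match; keep its content
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    tokens.add(m.group(g));
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String rawData = "192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] 'GET /cgi-bin/try/ HTTP/1.0' 200 3395";
        for (String token : tokenize(rawData)) {
            System.out.println(token);
        }
    }
}
```

For the sample line this yields seven tokens, with the timestamp and the request each kept whole. The two `-` fields and the separate `200` and `3395` values are still individual tokens, so joining or dropping them is left to the caller.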

Check out the following code. Put every special character that you do not want to appear in the tokens into the "delim" string of the snippet; each character in that string is treated as a separate delimiter.

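import java.util.Scanner;
import java.util.StringTokenizer;

// "scan" was not defined in the original snippet; it is assumed to read from standard input
Scanner scan = new Scanner(System.in);
String s = scan.nextLine();
// Every character listed here acts as a delimiter and is dropped from the output
String delim = "!,?._'@ ";
StringTokenizer st = new StringTokenizer(s, delim);
System.out.println(st.countTokens());
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}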
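
Applied to the log line from the question, the same idea might look like the sketch below: the delimiter string holds only `[`, `]` and `'`, so spaces inside brackets or quotes no longer split anything. The class name `BracketSplit` and the method `split` are hypothetical, and trimming and skipping blank pieces is an added assumption about what the output should look like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of the original answer.
class BracketSplit {
    static List<String> split(String line) {
        // Each character of the delimiter string is a separator on its own,
        // so this splits at '[', ']' and '\'' but not at spaces.
        StringTokenizer st = new StringTokenizer(line, "[]'");
        List<String> parts = new ArrayList<>();
        while (st.hasMoreTokens()) {
            String part = st.nextToken().trim();
            if (!part.isEmpty()) { // drop the blank piece between "]" and "'"
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String rawData = "192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] 'GET /cgi-bin/try/ HTTP/1.0' 200 3395";
        for (String part : split(rawData)) {
            System.out.println(part);
        }
    }
}
```

This gives four pieces for the sample line. The first one is still `192.168.2.20 - -`, so the two `-` fields would need to be removed separately, and a `'`, `[` or `]` inside a request path would split that field in two.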
