简体   繁体   中英

Splitting a string based on Pattern Java

Hi I have log files of the following pattern-

2014-03-06 03:21:45,432 ERROR [mfs:pool-3-thread-19] dispatcher.StatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
2014-03-06 03:22:06,454 ERROR [mfs:pool-3-thread-19] dispatcher.ClientStatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
2014-03-06 03:22:27,462 ERROR [pool-1-thread-1] cluster.ClusterServiceImpl  - unexpected error when trying to update LastCheckinTime
java.sql.SQLException: Network error IOException: Connection timed out: connect
...

I am trying to split the string into substrings such that-

parsedString[0]=2014-03-06 03:21:45
parsedString[1]=,432 ERROR [mfs:pool-3-thread-19] dispatcher.StatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
parsedString[2]=2014-03-06 03:22:06
....

I tried using string.split(datepattern) but it only gives me the content in the string array and not the dates. I also tried using Pattern matcher but it only gives me a list of matching dates and not the content.

How do I get both values into the same string array. Any help would be much appreciated. Thanks

Edit- String pattern="([0-9]{4}-[0-1][0-9]-[0-3][0-9]\\s(?:[0-1][0-9]|[2][0-3]):[0-5][0-9]:[0-5][0-9],)"; String parsedLogMessage[]=GetLogString().split(pattern); this.MessageContent=Arrays.asList(parsedLogMessage);

This only gives the string split by regex and not the regex string itself

If you must use regex you could try it like this

Pattern p = Pattern.compile("(^[^,]*)(.*$)");
Matcher m = p.matcher(inputstring);
m.matches();
String part1 = m.group(1);
String part2 = m.group(2);

Then part1 should be everything up to the first comma, part2 the rest of the inputstring.

Using substring would be easier though...

This will split the string each time a comma or a \\n newline is found:

String[] parsedString = logString.split("(,|\n)");

It should produce your desired output, but there are few potential problem I foresee here:

First I have a feeling you're trying to load the whole log file into a string first. This is a good waste of memory if you will be processing them by line (what happens if the log file is 10GB?). A better approach would be to use a BufferedReader and do them per line.

Secondly keep in mind a log output can have commas in itself, so above code will be buggy. Since the prefix part seem to be fixed-length, you might want to chop them using substring instead.

Suppose your string parameters in between two special charaters like : #parameter# or parameter or even two differnt signs at a time like *paramter#. We can have list of all these parameters between those signs by this code :

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang.StringUtils;

public class Splitter {

    public static void main(String[] args) {

        String pattern1 = "#";
        String pattern2 = "#";
        String text = "(#n1_1#/#n2_2#)*2/#n1_1#*34/#n4_4#";

        Pattern p = Pattern.compile(Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2));
        Matcher m = p.matcher(text);
        while (m.find()) {
            ArrayList parameters = new ArrayList<>();
            parameters.add(m.group(1));
            System.out.println(parameters);
            ArrayList result = new ArrayList<>();
            result.add(parameters);
            // System.out.println(result.size());
        }

    }
}

Here list result will have parameters n1_1,n2_2,n4_4.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM