
Splitting a string based on a pattern in Java

Hi, I have log files with the following pattern:

2014-03-06 03:21:45,432 ERROR [mfs:pool-3-thread-19] dispatcher.StatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
2014-03-06 03:22:06,454 ERROR [mfs:pool-3-thread-19] dispatcher.ClientStatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
2014-03-06 03:22:27,462 ERROR [pool-1-thread-1] cluster.ClusterServiceImpl  - unexpected error when trying to update LastCheckinTime
java.sql.SQLException: Network error IOException: Connection timed out: connect
...

I am trying to split the string into substrings such that:

parsedString[0]=2014-03-06 03:21:45
parsedString[1]=,432 ERROR [mfs:pool-3-thread-19] dispatcher.StatusNotification  - Error processing notification. Operation aborted.
java.sql.SQLException: Network error IOException: Connection timed out: connect
parsedString[2]=2014-03-06 03:22:06
....

I tried using string.split(datepattern), but it only gives me the content in the string array and not the dates. I also tried using a Pattern matcher, but it only gives me a list of matching dates and not the content.

How do I get both values into the same string array? Any help would be much appreciated. Thanks.

Edit:

String pattern="([0-9]{4}-[0-1][0-9]-[0-3][0-9]\\s(?:[0-1][0-9]|[2][0-3]):[0-5][0-9]:[0-5][0-9],)";
String parsedLogMessage[]=GetLogString().split(pattern);
this.MessageContent=Arrays.asList(parsedLogMessage);

This only gives the strings between the regex matches, not the text matched by the regex itself.

If you must use regex, you could try it like this:

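// Group 1: everything before the first comma; group 2: the rest.
// DOTALL lets '.' match line breaks, so multi-line input still matches.
Pattern p = Pattern.compile("^([^,]*)(.*)$", Pattern.DOTALL);
Matcher m = p.matcher(inputstring);
// group() throws IllegalStateException unless the match succeeded, so check first.
if (m.matches()) {
    String part1 = m.group(1);
    String part2 = m.group(2);
}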

Then part1 should be everything up to the first comma, and part2 the rest of the input string.

Using substring would be easier though...
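A minimal sketch of that substring alternative, assuming the input is a single log entry and that its first comma is the one between the seconds and the milliseconds. The class and method names (FirstCommaSplit, splitAtFirstComma) are made up for illustration:

```java
public class FirstCommaSplit {

    // Returns {text before the first comma, text from the first comma onward}.
    // If there is no comma, the whole input is returned as the first part.
    public static String[] splitAtFirstComma(String input) {
        int comma = input.indexOf(',');
        if (comma < 0) {
            return new String[] { input, "" };
        }
        return new String[] { input.substring(0, comma), input.substring(comma) };
    }

    public static void main(String[] args) {
        String entry = "2014-03-06 03:22:27,462 ERROR [pool-1-thread-1] cluster.ClusterServiceImpl"
                + "  - unexpected error when trying to update LastCheckinTime";
        String[] parts = splitAtFirstComma(entry);
        System.out.println(parts[0]); // the date and time
        System.out.println(parts[1]); // the comma, milliseconds and message
    }
}
```

Unlike the regex version, this needs no match check, but it handles only one entry at a time.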

This will split the string each time a comma or a \n newline is found:

String[] parsedString = logString.split("(,|\n)");

It should produce your desired output, but there are a few potential problems I foresee here:

First, I have a feeling you're trying to load the whole log file into a string before processing it. This is a big waste of memory if you will be processing the entries line by line (what happens if the log file is 10GB?). A better approach would be to use a BufferedReader and process the file one line at a time.
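A rough sketch of the line-by-line approach, assuming every entry starts with a timestamp in the format shown in the question and that lines without one (such as the exception text) belong to the entry before them. The class name LogLineReader, the method parse and the pattern are illustrative choices, not something from the original post:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineReader {

    // Group 1: timestamp without milliseconds; group 2: the rest of the line, starting at the comma.
    private static final Pattern ENTRY_START =
            Pattern.compile("(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})(,.*)");

    // Reads the log one line at a time and returns [date, content, date, content, ...].
    // Lines that appear before the first timestamp are skipped.
    public static List<String> parse(Reader source) {
        List<String> parts = new ArrayList<>();
        StringBuilder content = null;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = ENTRY_START.matcher(line);
                if (m.matches()) {
                    // A new entry begins: store the content of the previous one first.
                    if (content != null) {
                        parts.add(content.toString());
                    }
                    parts.add(m.group(1));
                    content = new StringBuilder(m.group(2));
                } else if (content != null) {
                    // Continuation line, e.g. the exception message.
                    content.append('\n').append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (content != null) {
            parts.add(content.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        String log = "2014-03-06 03:21:45,432 ERROR [mfs:pool-3-thread-19] dispatcher.StatusNotification"
                + "  - Error processing notification. Operation aborted.\n"
                + "java.sql.SQLException: Network error IOException: Connection timed out: connect\n"
                + "2014-03-06 03:22:06,454 ERROR [mfs:pool-3-thread-19] dispatcher.ClientStatusNotification"
                + "  - Error processing notification. Operation aborted.\n"
                + "java.sql.SQLException: Network error IOException: Connection timed out: connect";
        for (String part : parse(new StringReader(log))) {
            System.out.println("[" + part + "]");
        }
    }
}
```

For a real file, a reader from Files.newBufferedReader could be passed in place of the StringReader. The sketch still collects every part in a list, so for a very large log it would make more sense to handle each entry as soon as it is complete instead of storing them all.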

Secondly, keep in mind that a log message can itself contain commas, so the code above will be buggy. Since the prefix part seems to be fixed-length, you might want to chop it off using substring instead.
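As a sketch of the fixed-length idea, assuming each line begins with a timestamp of the form "2014-03-06 03:21:45", which is 19 characters long. The names FixedPrefixSplit and chop, and the sample message, are hypothetical:

```java
public class FixedPrefixSplit {

    // Length of a "yyyy-MM-dd HH:mm:ss" timestamp such as "2014-03-06 03:21:45".
    private static final int TIMESTAMP_LENGTH = 19;

    // Returns {timestamp, rest of the line}. Commas in the message do not matter
    // because the cut position depends only on the prefix length.
    public static String[] chop(String line) {
        if (line.length() < TIMESTAMP_LENGTH) {
            throw new IllegalArgumentException("Line is shorter than a timestamp: " + line);
        }
        return new String[] {
            line.substring(0, TIMESTAMP_LENGTH),
            line.substring(TIMESTAMP_LENGTH)
        };
    }

    public static void main(String[] args) {
        // Made-up message containing an extra comma.
        String[] parts = chop("2014-03-06 03:21:45,432 ERROR first part, second part");
        System.out.println(parts[0]); // 2014-03-06 03:21:45
        System.out.println(parts[1]); // ,432 ERROR first part, second part
    }
}
```

This only makes sense for lines that are known to start with a timestamp; continuation lines such as the exception text would need to be detected separately.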

Suppose your string parameters sit between two special characters, like #parameter# or parameter, or even between two different signs, like *parameter#. We can get a list of all the parameters between those signs with this code:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Splitter {

    public static void main(String[] args) {

        String pattern1 = "#"; // opening delimiter
        String pattern2 = "#"; // closing delimiter
        String text = "(#n1_1#/#n2_2#)*2/#n1_1#*34/#n4_4#";

        // Quote the delimiters so they are matched literally;
        // (.*?) lazily captures the text between them.
        Pattern p = Pattern.compile(Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2));
        Matcher m = p.matcher(text);

        // Create the list once, outside the loop, so it collects every match.
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group(1));
        }
        System.out.println(result);
    }
}

Here the list result will contain the parameters n1_1, n2_2, n1_1 and n4_4 (n1_1 is listed twice because it occurs twice in the text).


 