Splitting a String to match date format in a List in Java

I have a list of strings that I am going to write to a CSV file. The list elements are strings like this:

List<String> list1 = new ArrayList<String>();
list1.add("one, Aug 21, 2018 11:08:51 PDT, last");
list1.add("two, newlast, Aug 22, 2018 11:08:52 PDT");

But the problem is that when I write to the CSV file, "Aug 21" and "2018 11:08:51" get separated into different columns.

I need it as a single value, like "Aug 21, 2018 11:08:51 PDT".

Also, the index might change: there is no guarantee that "Aug 21" will always come at the same position in the string.

I tried the code below to fix this, and it works. But is there a better way to do it (instead of splitting into an array and iterating)?

list1.forEach(s -> {
    String[] s1 = s.split(",");
    // Stop one element short of the end so that s1[i + 1] is always in bounds.
    for (int i = 0; i < s1.length - 1; i++) {
        if (isValidMonthDate(s1[i]) && isValidYearTime(s1[i + 1])) {
            // I will concatenate this string and write it to the CSV
            System.out.println("\"" + s1[i].trim() + "," + s1[i + 1] + "\"");
        }
    }
});

public static boolean isValidMonthDate(String inDate) {
    SimpleDateFormat dateFormat = new SimpleDateFormat("MMM dd");
    dateFormat.setLenient(false);
    try {
        dateFormat.parse(inDate.trim());
    } catch (ParseException pe) {
        return false;
    }
    return true;
}

public static boolean isValidYearTime(String inDate) {
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy HH:mm:ss zzz");
    dateFormat.setLenient(false);
    try {
        dateFormat.parse(inDate.trim());
    } catch (ParseException pe) {
        return false;
    }
    return true;
}

With this I get the output:

"Aug 21, 2018 11:08:51 PDT"
"Aug 22, 2018 11:08:52 PDT"

Is there a better way to achieve this without splitting into an array and iterating over it?

You could use the normal date parser to attempt parsing at each index using a parse position, and see where it succeeds.

As I try to avoid the old date API nowadays, here's a simple demo with the new one:

public static void main(String[] args) {
    List<String> inputs = Arrays.asList(
        "Aug 21, 2018 11:08:51 PDT",
        "one, Aug 21, 2018 11:08:51 PDT, last",
        "two, newlast, Aug 22, 2018 11:08:52 PDT"
        );
    String formatPattern = "MMM dd, yyyy HH:mm:ss zzz";
    DateTimeFormatter pattern = DateTimeFormatter.ofPattern(formatPattern, Locale.US);

    for(String input : inputs) {
        System.out.println("Processing " + input);

        int[] matchStartEnd = null;
        TemporalAccessor temp = null;

        // check all possible offsets i in the input string
        for(int i = 0, n = input.length() - formatPattern.length(); i <= n; i++) {
            try {
                ParsePosition pt = new ParsePosition(i);
                temp = pattern.parse(input, pt); 
                matchStartEnd = new int[] { i, pt.getIndex() };
                break;
            }
            catch(DateTimeParseException e) {
                // ignore this
            }
        }
        if(matchStartEnd != null) {
            System.out.println("  Found match at indexes " + matchStartEnd[0] + " to " + matchStartEnd[1]);
            System.out.println("  temporal accessor is " + temp);
        }
        else {
            System.out.println("  No match");
        }
    }
}

When writing the output, put the date in quotes. That's how CSV escapes fields that contain commas.
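A minimal sketch of that quoting step, assuming the start and end offsets of the date inside the line are already known (for example from a parse-position search). The class name `CsvQuote` and the helpers `quote` and `toCsvRow` are hypothetical names chosen for illustration, and dropping empty fields is a simplification that may not suit every CSV layout.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvQuote {

    // Wrap a field in double quotes, doubling any embedded quote
    // (the usual CSV escaping convention).
    public static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Build a CSV row from a line whose date occupies [start, end).
    // Text before and after the date is split on commas; the date stays one field.
    public static String toCsvRow(String line, int start, int end) {
        List<String> fields = new ArrayList<>();
        addPlainFields(fields, line.substring(0, start));
        fields.add(quote(line.substring(start, end)));
        addPlainFields(fields, line.substring(end));
        return String.join(",", fields);
    }

    // Split on commas, trim each piece, and skip empty pieces (a simplification).
    private static void addPlainFields(List<String> out, String part) {
        for (String piece : part.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
    }

    public static void main(String[] args) {
        String line = "one, Aug 21, 2018 11:08:51 PDT, last";
        String date = "Aug 21, 2018 11:08:51 PDT";
        int start = line.indexOf(date);
        // prints: one,"Aug 21, 2018 11:08:51 PDT",last
        System.out.println(toCsvRow(line, start, start + date.length()));
    }
}
```

Only the date is quoted here because it is the only field that contains a comma; quoting every field would also be valid CSV.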

To parse your input, use a regex. This one reads each date or word and consumes the comma separator:

(\w{3} \d{1,2}, \d{4})|(\w+),?

You can elaborate with more parentheses to pre-parse your date. If the first group matches, it's the date. I will leave it to the OP to order the final CSV.

Here is the regex in JavaScript as a proof of concept. I know the question is about Java, but the regex is the same.

// read word or date followed by comma
const rx = /(\w{3} \d{1,2}, \d{4})|(\w+),?/g
const input = ['one, Aug 2, 1999, two', 'three, four, Aug 3, 2000', 'Aug 3, 2010, five, six']
let csv2 = ''
input.forEach(it => {
  let parts = []
  let m2 = rx.exec(it)
  while (m2) {
    parts.push(m2[1] || m2[2])
    m2 = rx.exec(it)
  }
  csv2 += parts.map(it => '"' + it + '"').join(',') + '\n'
})
console.log(csv2)
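For reference, the same idea could look roughly like this in Java. This is a sketch, not the answer's own code: the class name `RegexFields` and the method `toCsvRow` are hypothetical, and the first group has been extended with an optional time and zone part (an assumption, so that values such as "Aug 21, 2018 11:08:51 PDT" from the question stay in one field).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFields {

    // Group 1: a date such as "Aug 2, 1999", optionally followed by " HH:mm:ss ZONE"
    //          (the optional part is an assumed extension of the original regex).
    // Group 2: any other word. A trailing comma after a word is consumed.
    private static final Pattern FIELD = Pattern.compile(
            "(\\w{3} \\d{1,2}, \\d{4}(?: \\d{2}:\\d{2}:\\d{2} \\w+)?)|(\\w+),?");

    // Quote every field and join the fields with commas.
    public static String toCsvRow(String line) {
        List<String> parts = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            String field = m.group(1) != null ? m.group(1) : m.group(2);
            parts.add("\"" + field + "\"");
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // prints: "one","Aug 2, 1999","two"
        System.out.println(toCsvRow("one, Aug 2, 1999, two"));
        // prints: "two","newlast","Aug 22, 2018 11:08:52 PDT"
        System.out.println(toCsvRow("two, newlast, Aug 22, 2018 11:08:52 PDT"));
    }
}
```

Like the JavaScript version, this treats any three word characters followed by a day and year as a date, so it does not validate the month name.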

I suggest you use a regex to extract the date:

^(.*?)(\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2} PDT)(.*?)$

Then use Stream::map to extract the date and try to parse it. Don't forget to filter out the null values, since those are the elements that failed to parse.

SimpleDateFormat sdf = new SimpleDateFormat("MMM dd, yyyy HH:mm:ss Z", Locale.ENGLISH);
list1.stream()
     .map(s -> {
         try {
             return sdf.parse(s.replaceAll("^(.*?)(\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} PDT)(.*?)$", "$2"));
         } catch (ParseException e) {
             // no parsable date in this element
             return null;
         }
     })
     .filter(Objects::nonNull)
     .forEach(System.out::println);

I suggest you wrap the try-catch and the regex extraction in a separate method.

static SimpleDateFormat sdf = new SimpleDateFormat("MMM dd, yyyy HH:mm:ss Z", Locale.ENGLISH);

static Date validate(String date) {
    String s = date.replaceAll("^(.*?)(\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} PDT)(.*?)$", "$2");
    try {
        return sdf.parse(s);
    } catch (ParseException e) { }
    return null;
}

... which significantly simplifies the Stream:

list1.stream()
     .map(Main::validate)
     .filter(Objects::nonNull)
     .forEach(System.out::println);
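If you prefer java.time and want to avoid returning null, the same approach could be sketched as follows. The class name `DateExtractor` and its methods are hypothetical, and the regex assumes the zone is a 3 to 4 letter uppercase abbreviation rather than only the literal "PDT". One reason to consider this variant is that `SimpleDateFormat` instances are not thread-safe, whereas `DateTimeFormatter` is immutable and safe to share.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateExtractor {

    // Assumption: the zone is a 3-4 letter abbreviation such as PDT.
    private static final Pattern DATE = Pattern.compile(
            "\\w{3} \\d{1,2}, \\d{4} \\d{2}:\\d{2}:\\d{2} [A-Z]{3,4}");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MMM d, yyyy HH:mm:ss zzz", Locale.US);

    // Return the first date-like substring of the line, if there is one.
    public static Optional<String> extractDate(String line) {
        Matcher m = DATE.matcher(line);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Extract and parse the date; empty when nothing matches or parsing fails.
    public static Optional<ZonedDateTime> parse(String line) {
        try {
            return extractDate(line).map(text -> ZonedDateTime.parse(text, FORMAT));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList(
                "one, Aug 21, 2018 11:08:51 PDT, last",
                "two, newlast, Aug 22, 2018 11:08:52 PDT");
        list1.stream()
             .map(DateExtractor::parse)
             .filter(Optional::isPresent)
             .map(Optional::get)
             .forEach(System.out::println);
    }
}
```

Using `Optional` instead of null keeps the "no date found" case explicit in the method signature.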
