Remove elements from Date Format String using a Regular Expression

I want to remove elements from a supplied Date Format String, for example converting the format "dd/MM/yyyy" to "MM/yyyy" by removing any non-M/y element.

What I'm trying to do is create a localised month/year format based on the existing day/month/year format provided for the Locale.

I've done this using regular expressions, but the solution seems longer than I'd expect.

An example is below:

public static void main(final String[] args) {
 System.out.println(filterDateFormat("dd/MM/yyyy HH:mm:ss", 'M', 'y'));
 System.out.println(filterDateFormat("MM/yyyy/dd", 'M', 'y'));
 System.out.println(filterDateFormat("yyyy-MMM-dd", 'M', 'y'));
}

/**
 * Retains only the {@code charsToRetain} pattern letters from {@code format},
 * removing everything else, including any redundant separators.
 */
private static String filterDateFormat(final String format, final char...charsToRetain) {
 // Match e.g. "ddd-"
 final Pattern pattern = Pattern.compile("[" + new String(charsToRetain) + "]+\\p{Punct}?");
 final Matcher matcher = pattern.matcher(format);

 final StringBuilder builder = new StringBuilder();

 while (matcher.find()) {
  // Append each match
  builder.append(matcher.group());
 }

 // If the last match is e.g. "MMM-", remove the trailing punctuation symbol
 return builder.toString().replaceFirst("\\p{Punct}$", "");
}
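
For the localisation goal described above, one possible way to obtain the locale's own day/month/year pattern is `DateTimeFormatterBuilder.getLocalizedDateTimePattern` from `java.time`. The sketch below is only an assumption about how the pieces could fit together and is not part of the original question: the class name `LocalisedMonthYear` and the helper `monthYearPattern` are hypothetical, and `filterDateFormat` is copied from the code above. The pattern returned depends on the JDK's locale data, so no output is shown, and patterns containing quoted literals are not handled.

```java
import java.time.chrono.IsoChronology;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.FormatStyle;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LocalisedMonthYear {

    // Same logic as filterDateFormat in the question.
    static String filterDateFormat(final String format, final char... charsToRetain) {
        final Pattern pattern =
                Pattern.compile("[" + new String(charsToRetain) + "]+\\p{Punct}?");
        final Matcher matcher = pattern.matcher(format);
        final StringBuilder builder = new StringBuilder();
        while (matcher.find()) {
            builder.append(matcher.group());
        }
        return builder.toString().replaceFirst("\\p{Punct}$", "");
    }

    // Hypothetical helper: take the locale's short date pattern and keep
    // only the month and year parts.
    static String monthYearPattern(final Locale locale) {
        final String datePattern = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
                FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale);
        return filterDateFormat(datePattern, 'M', 'y');
    }

    public static void main(final String[] args) {
        System.out.println(monthYearPattern(Locale.UK));
    }
}
```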

Let's try a solution for the following date format strings:

String[] formatStrings = { "dd/MM/yyyy HH:mm:ss", 
                           "MM/yyyy/dd", 
                           "yyyy-MMM-dd", 
                           "MM/yy - yy/dd", 
                           "yyabbadabbadooMM" };

The following checks each string for a full match, then prints the first capturing group of the match.

Pattern p = Pattern.compile(REGEX);
for(String formatStr : formatStrings) {
    Matcher m = p.matcher(formatStr);
    if(m.matches()) {
        System.out.println(m.group(1));
    }
    else {
        System.out.println("Didn't match!");
    }
}

Now, there are two separate regular expressions I've tried. First:

final String REGEX = "(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)";

With program output:

MM/yyyy
MM/yyyy
yyyy-MMM
Didn't match!
Didn't match!

Second:

final String REGEX = "(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*)";

With program output:

MM/yyyy
MM/yyyy
yyyy-MMM
MM/yy - yy
Didn't match!
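
To compare the two expressions side by side, the snippets above can be combined into one runnable class. This is a minimal sketch: the class name `RegexComparison` and the helper `extract` are made up for illustration, while the regexes and test strings are the ones shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexComparison {

    static final String FIRST = "(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)";
    static final String SECOND = "(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*)";

    // Returns group 1 if the whole string matches the regex, otherwise null.
    static String extract(final String regex, final String format) {
        final Matcher m = Pattern.compile(regex).matcher(format);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(final String[] args) {
        final String[] formatStrings = {
            "dd/MM/yyyy HH:mm:ss",
            "MM/yyyy/dd",
            "yyyy-MMM-dd",
            "MM/yy - yy/dd",
            "yyabbadabbadooMM"
        };
        for (final String f : formatStrings) {
            System.out.println(f + " -> first: " + extract(FIRST, f)
                    + ", second: " + extract(SECOND, f));
        }
    }
}
```

Only "MM/yy - yy/dd" is treated differently: the first regex rejects it, while the second captures "MM/yy - yy".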

Now, let's see what the first regex actually matches:

(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*) First regex =
(?:[^My]*)                              Any amount of non-Ms and non-ys (non-capturing)
          ([My]+                        followed by one or more Ms and ys
                [^\\w]*                 optionally separated by non-word characters
                                        (implying they are also not Ms or ys)
                       [My]+)           followed by one or more Ms and ys
                             (?:[^My]*) finished by any number of non-Ms and non-ys
                                        (non-capturing)

This means that at least two M/y characters are required to match the regex. Be careful, though: something like MM-dd or yy-DD will match as well, because a run such as "MM" can be split into two M-or-y regions of one character each. You can avoid trouble here by keeping a sanity check on your date format string, such as:

if(formatStr.contains("y") && formatStr.contains("M") && m.matches())
{
    String yMString = m.group(1);
    ... // other logic
}
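
The effect of that sanity check can be seen in a small self-contained sketch. The class name `GuardedShortener` and the method `shorten` are hypothetical, and returning an `Optional` is just one way to signal failure.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GuardedShortener {

    static final Pattern FIRST =
            Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");

    // Returns the month/year part only when the format contains both an 'M'
    // and a 'y' and the regex matches the whole string.
    static Optional<String> shorten(final String formatStr) {
        final Matcher m = FIRST.matcher(formatStr);
        if (formatStr.contains("y") && formatStr.contains("M") && m.matches()) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(final String[] args) {
        // Without the guard, "MM-dd" matches and group 1 is just "MM".
        System.out.println(FIRST.matcher("MM-dd").matches());
        // With the guard, it is rejected because it has no 'y'.
        System.out.println(shorten("MM-dd").isPresent());
        System.out.println(shorten("dd/MM/yyyy").orElse("no match"));
    }
}
```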

As for the second regex, here's what it means:

(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*) Second regex =
(?:[^My]*)                                   Any amount of non-Ms and non-ys 
                                             (non-capturing)
          (                      )           followed by
           (?:[My]+       )+[My]+            at least two text segments consisting of
                                             one or more Ms or ys, where each segment is
                   [^\\w]*                   optionally separated by non-word characters
                                  (?:[^My]*) finished by any number of non-Ms and non-ys
                                             (non-capturing)

This regex will match a slightly broader series of strings, but it still requires that any separators between Ms and ys be non-word characters ( [^a-zA-Z_0-9] ). Additionally, keep in mind that this regex will still match "yy", "MM", or similar strings like "yyy", "yyyy"..., so it would be useful to have a sanity check as described for the previous regular expression.

Additionally, here's a quick example of how one might use the above to manipulate a single date format string:

LocalDateTime date = LocalDateTime.now();
String dateFormatString = "dd/MM/yyyy H:m:s";
System.out.println("Old Format: \"" + dateFormatString + "\" = " + 
    date.format(DateTimeFormatter.ofPattern(dateFormatString)));
Pattern p = Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");
Matcher m = p.matcher(dateFormatString);
if(dateFormatString.contains("y") && dateFormatString.contains("M") && m.matches())
{
    dateFormatString = m.group(1);
    System.out.println("New Format: \"" + dateFormatString + "\" = " + 
        date.format(DateTimeFormatter.ofPattern(dateFormatString)));
}
else
{
    throw new IllegalArgumentException("Couldn't shorten date format string!");
}

Output:

Old Format: "dd/MM/yyyy H:m:s" = 14/08/2019 16:55:45
New Format: "MM/yyyy" = 08/2019

I'll try to answer based on my understanding of the question: how do I remove, from a list/table/array of Strings, the elements that do not exactly follow the pattern 'dd/MM'?

So I'm looking for a function that looks like

public List<String> removeUnWantedDateFormat(List<String> input)

From what I know of DateFormat, we can expect only 4 possibilities that you would want (hoping I don't miss any): "MM/yyyy", "MMM/yyyy", "MM/yy", "MMM/yy". Since we know what we are looking for, we can write a simple function.

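public List<String> removeUnWantedDateFormat(List<String> input) {
  String s1 = "MM/yyyy";
  String s2 = "MMM/yyyy";
  String s3 = "MM/yy";
  String s4 = "MMM/yy";

  // Remove every format that equals none of the accepted patterns.
  // removeIf avoids the ConcurrentModificationException that calling
  // input.remove() inside a for-each loop would cause.
  input.removeIf(format -> !s1.equals(format) && !s2.equals(format)
      && !s3.equals(format) && !s4.equals(format));
  return input;
}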

It's better to avoid regex if you can, since compiling and matching patterns generally costs more than plain string comparison. A great improvement would be to use an enum of the date formats you accept; that way you have better control over them and can even replace them.
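
A minimal sketch of what such an enum might look like is shown here. The names `MonthYearFormat`, `pattern` and `fromPattern` are hypothetical, and the four constants are simply the four patterns listed above.

```java
import java.util.Arrays;
import java.util.Optional;

enum MonthYearFormat {
    MM_YYYY("MM/yyyy"),
    MMM_YYYY("MMM/yyyy"),
    MM_YY("MM/yy"),
    MMM_YY("MMM/yy");

    private final String pattern;

    MonthYearFormat(final String pattern) {
        this.pattern = pattern;
    }

    String pattern() {
        return pattern;
    }

    // Looks up the constant whose pattern equals the candidate exactly.
    static Optional<MonthYearFormat> fromPattern(final String candidate) {
        return Arrays.stream(values())
                .filter(f -> f.pattern.equals(candidate))
                .findFirst();
    }

    public static void main(final String[] args) {
        System.out.println(fromPattern("MM/yy").isPresent());
        System.out.println(fromPattern("dd/MM").isPresent());
    }
}
```

Keeping the accepted patterns in one type means a new format can be added, or an existing one swapped, without touching the filtering code.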

Hope this will help, cheers

Edit: after seeing the comment, I think it would be better to use contains instead of equals, which should work like a charm, and, instead of removing the element, to replace the input with the expected string.

So it would look more like:

public List<String> removeUnWantedDateFormat(List<String> input) {
  // Longer patterns come first so that "MMM/yyyy" wins over "MM/yy"
  List<String> comparaisons = new ArrayList<>();
  comparaisons.add("MMM/yyyy");
  comparaisons.add("MMM/yy");
  comparaisons.add("MM/yyyy");
  comparaisons.add("MM/yy");

  for (int i = 0; i < input.size(); i++) {
    for (String comparaison : comparaisons) {
      if (input.get(i).contains(comparaison)) {
        // Assigning to a loop variable would not change the list, so use set()
        input.set(i, comparaison);
        break;
      }
    }
  }
  return input;
}
