How do I parse a date from a URL format?

Question

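My database contains URLs stored as text fields, and each URL contains a representation of the report's date, which is missing from the report itself.

So I need to parse the date out of the URL field into a String representation such as: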

2010-10-12
2007-01-03
2008-02-07

What is the best way to extract the date?

Some of the URLs are in the following format:

http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html

http://e.com/data/invoices/2010/09/invoices-report-thursday-september-2-2010.html

http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-15-2010.html

http://e.com/data/invoices/2010/09/invoices-report-monday-september-13th-2010.html

http://e.com/data/invoices/2010/08/invoices-report-monday-august-30th-2010.html

http://e.com/data/invoices/2009/05/invoices-report-friday-may-8th-2009.html

http://e.com/data/invoices/2010/10/invoices-report-wednesday-october-6th-2010.html

http://e.com/data/invoices/2010/09/invoices-report-tuesday-september-21-2010.html

Note the inconsistent use of th after the day of the month, as in these two cases:

http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-15-2010.html

http://e.com/data/invoices/2010/09/invoices-report-monday-september-13th-2010.html

Others are in this format (three hyphens before the date starts, no year at the end, and an optional invoices- before report):

http://e.com/data/invoices/2010/09/invoices-report---wednesday-september-1.html

http://e.com/data/invoices/2010/09/invoices-report---thursday-september-2.html

http://e.com/data/invoices/2010/09/invoices-report---wednesday-september-15.html

http://e.com/data/invoices/2010/09/invoices-report---monday-september-13.html

http://e.com/data/invoices/2010/08/report---monday-august-30.html

http://e.com/data/invoices/2009/05/report---friday-may-8.html

http://e.com/data/invoices/2010/10/report---wednesday-october-6.html

http://e.com/data/invoices/2010/09/report---tuesday-september-21.html

Answer 1

You want a regular expression like this:

"^http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})"

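This takes advantage of the fact that everything up to and including the /year/month/ part of the URL is always the same, and that no digit appears after it until the day of the month. Once you have those three values, you don't care about anything else in the URL.

The first capture group is the year, the second is the month, and the third is the day. The day may have no leading zero; either convert the string to an integer and format it as needed, or just check the string length and, if it is not two, prepend the string "0".

For example: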
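The zero-padding step described above can be sketched as follows. This is a minimal illustration, not part of the original answer: the class name UrlDateExtractor and the method extractDate are hypothetical, and it assumes the year and month in the URL path are the ones wanted in the result.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class UrlDateExtractor {
    // Same pattern as in the answer, with the dot in the host name escaped.
    private static final Pattern DATE_IN_URL = Pattern.compile(
        "^http://e\\.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})");

    /** Returns the date as yyyy-MM-dd, or empty if the URL does not match. */
    static Optional<String> extractDate(String url) {
        Matcher m = DATE_IN_URL.matcher(url);
        if (!m.find()) {
            return Optional.empty();
        }
        int day = Integer.parseInt(m.group(3));
        // %02d left-pads a single-digit day with a zero.
        return Optional.of(
            String.format(Locale.ROOT, "%s-%s-%02d", m.group(1), m.group(2), day));
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html",
            "http://e.com/data/invoices/2010/08/report---monday-august-30.html",
        };
        for (String url : urls) {
            System.out.println(extractDate(url).orElse("no date found"));
        }
    }
}
```

Run against the two sample URLs, one from each format in the question, this should print 2010-09-01 and 2010-08-30. Returning an Optional is a design choice here so that a URL in an unexpected format is reported rather than silently turned into a wrong date.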

import java.util.regex.*;

class URLDate {
  public static void main(String[] args) {
    String text = "http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html";
    // Capture groups: 1 = year, 2 = month, 3 = day of the month.
    String regex = "http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(text);
    if (m.find()) {
      int count = m.groupCount();
      System.out.println("matched with groups:");
      // Group 0 is the whole match; groups 1..count are the captures.
      for (int i = 0; i <= count; ++i) {
        String group = m.group(i);
        System.out.format("\t%d: %s\n", i, group);
      }
    } else {
      System.out.println("failed to match!");
    }
  }
}

This produces the output:

matched with groups:
    0: http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1
    1: 2010
    2: 09
    3: 1

(Note that to use Matcher.matches() instead of Matcher.find(), you must append .*$ to the pattern so that the pattern matches the entire input string.)
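The difference between the two calls can be checked with a short sketch. The class name MatchesVersusFind and the helper fullMatch are assumptions made for illustration; the sample URL is one of those from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption for illustration.
class MatchesVersusFind {
    // The answer's pattern, which stops right after the day of the month.
    static final String PREFIX =
        "http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})";

    /** True only if the regex matches the entire input string. */
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String url = "http://e.com/data/invoices/2009/05/report---friday-may-8.html";

        // matches() fails because ".html" is left over after the day.
        System.out.println(fullMatch(PREFIX, url));         // false

        // Appending .*$ lets the pattern consume the rest of the URL.
        System.out.println(fullMatch(PREFIX + ".*$", url)); // true

        Matcher m = Pattern.compile(PREFIX + ".*$").matcher(url);
        if (m.matches()) {
            System.out.println(m.group(3));                 // 8
        }
    }
}
```

The capture groups are the same in both variants; only group 0 changes, since with the extended pattern it covers the whole URL.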


Solution 1
6 Accepted 2010-10-19 17:21:35
