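How to parse a date from a URL format?
My database contains URLs stored as text fields, and each URL contains a representation of the report's date, which the report itself lacks.
So I need to parse the date out of the URL field into a String representation such as: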
2010-10-12
2007-01-03
2008-02-07
What is the best way to extract the dates?
Some URLs are in the following format:
http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html
http://e.com/data/invoices/2010/09/invoices-report-thursday-september-2-2010.html
http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-15-2010.html
http://e.com/data/invoices/2010/09/invoices-report-monday-september-13th-2010.html
http://e.com/data/invoices/2010/08/invoices-report-monday-august-30th-2010.html
http://e.com/data/invoices/2009/05/invoices-report-friday-may-8th-2009.html
http://e.com/data/invoices/2010/10/invoices-report-wednesday-october-6th-2010.html
http://e.com/data/invoices/2010/09/invoices-report-tuesday-september-21-2010.html
Note that an ordinal suffix such as th after the day of the month is used inconsistently, as in these two cases:
http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-15-2010.html
http://e.com/data/invoices/2010/09/invoices-report-monday-september-13th-2010.html
Other URLs are in this format (three hyphens before the date begins, no year at the end, and an optional invoices- before report):
http://e.com/data/invoices/2010/09/invoices-report---wednesday-september-1.html
http://e.com/data/invoices/2010/09/invoices-report---thursday-september-2.html
http://e.com/data/invoices/2010/09/invoices-report---wednesday-september-15.html
http://e.com/data/invoices/2010/09/invoices-report---monday-september-13.html
http://e.com/data/invoices/2010/08/report---monday-august-30.html
http://e.com/data/invoices/2009/05/report---friday-may-8.html
http://e.com/data/invoices/2010/10/report---wednesday-october-6.html
http://e.com/data/invoices/2010/09/report---tuesday-september-21.html
You want a regex like this:
"^http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})"
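This takes advantage of the fact that everything up to the /year/month/ part of the URL is always the same, and that no digits appear after it until the day of the month. Once you have those three values, you no longer care about anything else in the URL.
The first capture group is the year, the second is the month, and the third is the day. The day may lack a leading zero; either convert the string to an integer and format it as needed, or just check the string length and, if it is not two, prepend the string "0".
For example: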
import java.util.regex.*;

class URLDate {
    public static void main(String[] args) {
        String text = "http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html";
        // Groups: 1 = year, 2 = month, 3 = day of the month (one or two digits)
        String regex = "http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(text);
        if (m.find()) {
            int count = m.groupCount();
            System.out.format("matched with groups:\n");
            // Group 0 is the whole matched text; groups 1..count are the captures
            for (int i = 0; i <= count; ++i) {
                String group = m.group(i);
                System.out.format("\t%d: %s\n", i, group);
            }
        } else {
            System.out.println("failed to match!");
        }
    }
}
Which gives the output:
matched with groups:
    0: http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1
    1: 2010
    2: 09
    3: 1
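To get from these groups to the yyyy-MM-dd strings the question asks for, something along the following lines might work. This is only a sketch: the class name UrlDateFormatter and the method toIsoDate are made up for illustration, and returning null for a non-matching URL is an assumption rather than anything the question specifies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlDateFormatter {
    // Same pattern as above: year, month, then the first run of digits after the path.
    private static final Pattern URL_DATE =
            Pattern.compile("^http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})");

    /** Returns the date as yyyy-MM-dd, or null if the URL does not match (hypothetical helper). */
    static String toIsoDate(String url) {
        Matcher m = URL_DATE.matcher(url);
        if (!m.find()) {
            return null;
        }
        int day = Integer.parseInt(m.group(3));
        // %02d pads a single-digit day with a leading zero.
        return String.format("%s-%s-%02d", m.group(1), m.group(2), day);
    }

    public static void main(String[] args) {
        // First URL format: ordinal suffix and trailing year
        System.out.println(toIsoDate(
                "http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html"));
        // Second URL format: three hyphens, no year, no invoices- prefix
        System.out.println(toIsoDate(
                "http://e.com/data/invoices/2010/08/report---monday-august-30.html"));
    }
}
```

Both URL formats from the question go through the same pattern, because the year and month come from the path and the day is simply the first one or two digits after it.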
(Note that to use Matcher.matches() instead of Matcher.find(), you must append .*$ to the pattern so that the pattern matches the entire input string.)
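The difference between the two calls can be checked with a small comparison like the one below. It is a sketch only; the class name MatchesVsFind and its helper methods are hypothetical names introduced for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // The pattern from the answer, which covers only the start of the URL.
    static final String PREFIX = "http://e.com/data/invoices/(\\d{4})/(\\d{2})/\\D+(\\d{1,2})";

    // find() succeeds if the pattern matches anywhere in the input.
    static boolean findsPrefix(String url) {
        return Pattern.compile(PREFIX).matcher(url).find();
    }

    // matches() requires the pattern to cover the whole input.
    static boolean matchesPrefixOnly(String url) {
        return Pattern.compile(PREFIX).matcher(url).matches();
    }

    // Appending .*$ lets the pattern consume the rest of the URL.
    static String dayViaMatches(String url) {
        Matcher m = Pattern.compile(PREFIX + ".*$").matcher(url);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        String url = "http://e.com/data/invoices/2010/09/invoices-report-wednesday-september-1st-2010.html";
        System.out.println(findsPrefix(url));       // true
        System.out.println(matchesPrefixOnly(url)); // false
        System.out.println(dayViaMatches(url));     // 1
    }
}
```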