
How can I get the date from a String using regex in Java?

Here's my code:

String t1="postby <span title=\"2011-4-5 17:22\">yesterday&nbsp;17:22</span>";
String t2="postby 2010-11-12 10:02";

I want to get 2011-4-5 17:22 from t1 and 2010-11-12 10:02 from t2, using a single regular expression.

(Input t1 or t2, output the date.)

How can I do this? (Please give me some example code, thanks.)

\d{4}-\d{1,2}-\d{1,2} \d{2}:\d{2}

A few notes:

  • you will have to escape the backslashes in a Java string literal: String pattern = "\\d{4}-\\d{1,2}....."
  • \d means "digit" (0-9)
  • {x} means "x times"
  • {x,y} means "at least x, but not more than y times"

Reference: java.util.regex.Pattern
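To make the notes above concrete, here is a minimal sketch of how the pattern could be applied to the two strings from the question. The class name DateExtractor and the method extractDate are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class DateExtractor {
    // Each backslash is doubled because this is a Java string literal.
    private static final Pattern DATE_TIME =
            Pattern.compile("\\d{4}-\\d{1,2}-\\d{1,2} \\d{2}:\\d{2}");

    /** Returns the first date-time substring in the input, or null if none is found. */
    static String extractDate(String input) {
        Matcher m = DATE_TIME.matcher(input);
        // find() scans for the pattern anywhere in the input,
        // unlike matches(), which requires the whole string to match.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String t1 = "postby <span title=\"2011-4-5 17:22\">yesterday&nbsp;17:22</span>";
        String t2 = "postby 2010-11-12 10:02";
        System.out.println(extractDate(t1)); // 2011-4-5 17:22
        System.out.println(extractDate(t2)); // 2010-11-12 10:02
    }
}
```

Using find() rather than matches() is what lets the same expression work for both t1 and t2, since the date is embedded in surrounding text in each case.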

How many false matches will you allow? Bozho already suggested the pattern

\d{4}-\d{1,2}-\d{1,2} \d{2}:\d{2}

But that matches the following questionable cases: 0000-1-1 00:00 (there is no year zero), 2011-0-1 00:00 (there is no month zero), 2011-13-1 00:00 (there is no month 13), 2011-1-32 00:00 (there is no day 32 in any month), 2011-12-31 24:00 (there is no hour 24 on an ordinary clock) and 2011-12-31 23:61 (there is no minute 61; a leap second only extends the seconds field, and by at most one).
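One possible way to weed out most of these false matches, which the answer itself does not propose, is to let the regex find candidates and then hand each candidate to java.time for a strict check. The sketch below assumes Java 8 or later; the class name StrictDateCheck and the method isRealDateTime are hypothetical. Note that it would still accept the year 0000, because java.time's proleptic year ("uuuu") allows a year zero.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical helper class, shown only as an illustration.
class StrictDateCheck {
    // "M" and "d" accept one or two digits; STRICT rejects out-of-range fields
    // and impossible dates instead of adjusting them.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-M-d HH:mm")
                    .withResolverStyle(ResolverStyle.STRICT);

    /** Returns true if the candidate parses as a real calendar date and time. */
    static boolean isRealDateTime(String candidate) {
        try {
            LocalDateTime.parse(candidate, FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRealDateTime("2011-4-5 17:22"));  // true
        System.out.println(isRealDateTime("2011-13-1 00:00")); // false
        System.out.println(isRealDateTime("2011-1-32 00:00")); // false
        System.out.println(isRealDateTime("2011-12-31 23:61")); // false
    }
}
```

This keeps the regular expression simple and leaves calendar rules to a library that already knows them.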

You want to parse date-times that are almost, but not quite, in ISO-8601 format. If you can, please use that international standard format.

In one of my programs (a shell script using grep), I've used the following regular expression:

^20[0-9][0-9]-[01][0-9]-[0-3][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]UTC$

I had an extra T and UTC to deal with, was interested only in dates in this century, and parsed with seconds precision. I see I was not so restrictive on hour and minute values, probably because traditional C/C++ conversions can handle them.

I guess you therefore could use something like the following:

\d{4}-[01]\d-[0-3]\d [0-2]\d:[0-6]\d
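One caveat when trying this pattern on the question's inputs: as written, it requires a two-digit month and day, so it finds 2010-11-12 10:02 in t2 but not 2011-4-5 17:22 in t1. The sketch below compares it with a relaxed variant in which the leading digit of the month and day is optional. That variant, the class name TightenedPattern and the method firstMatch are assumptions for illustration, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison of the answer's pattern with a relaxed variant.
class TightenedPattern {
    // The pattern from the answer: two-digit month and day are mandatory.
    static final Pattern AS_WRITTEN =
            Pattern.compile("\\d{4}-[01]\\d-[0-3]\\d [0-2]\\d:[0-6]\\d");

    // Assumed variant: the leading digit of month and day is optional.
    static final Pattern RELAXED =
            Pattern.compile("\\d{4}-[01]?\\d-[0-3]?\\d [0-2]\\d:[0-6]\\d");

    /** Returns the first match of the pattern in the input, or null if there is none. */
    static String firstMatch(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String t1 = "postby <span title=\"2011-4-5 17:22\">yesterday&nbsp;17:22</span>";
        String t2 = "postby 2010-11-12 10:02";
        System.out.println(firstMatch(AS_WRITTEN, t1)); // null
        System.out.println(firstMatch(AS_WRITTEN, t2)); // 2010-11-12 10:02
        System.out.println(firstMatch(RELAXED, t1));    // 2011-4-5 17:22
        System.out.println(firstMatch(RELAXED, t2));    // 2010-11-12 10:02
    }
}
```

Both versions still let through values such as month 13 or hour 29, since each character class constrains only a single digit.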
