
Regex to match an ISO 8601 datetime string

Does anyone have a good regex pattern for matching ISO datetimes?

e.g. 2010-06-15T00:00:00

For the strict, full datetime, including milliseconds, per the W3C's take on the spec:

//-- Complete precision:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z)/

//-- No milliseconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)/

//-- No Seconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)/

//-- Putting it all together:
/(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))/

Additional variations allowed by the actual ISO 8601:2004(E) doc:

/********************************************
**    No time-zone variants:
*/
//-- Complete precision:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+/

//-- No milliseconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d/

//-- No Seconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d/

//-- Putting it all together:
/(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+)|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d)|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d)/

WARNING: This all gets messy fast, and it still allows certain nonsense such as a 14th month. Additionally, ISO 8601:2004(E) allows several other variants.
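
One way to keep this from getting messy is to build the pattern from named pieces and tighten the field ranges at the same time. The following is a minimal sketch, not a full ISO 8601 validator: the names (MONTH, DAY, HOUR, isoDateTime and so on) are made up for illustration, and it covers only the three W3C-style variants, with a mandatory time zone.

```javascript
// Field patterns with tighter ranges than [01]\d, [0-3]\d and [0-2]\d.
// All names here are illustrative, not part of any library.
var YEAR   = "\\d{4}";
var MONTH  = "(?:0[1-9]|1[0-2])";        // 01-12
var DAY    = "(?:0[1-9]|[12]\\d|3[01])"; // 01-31 (month lengths are not checked)
var HOUR   = "(?:[01]\\d|2[0-3])";       // 00-23
var MINSEC = "[0-5]\\d";                 // 00-59
var OFFSET = "(?:Z|[+-]" + HOUR + ":" + MINSEC + ")";

var isoDateTime = new RegExp(
    "^" + YEAR + "-" + MONTH + "-" + DAY +
    "T" + HOUR + ":" + MINSEC +
    "(?::" + MINSEC + "(?:\\.\\d+)?)?" + // optional seconds, optional fraction
    OFFSET + "$"
);

console.log(isoDateTime.test("2010-06-15T00:00:00.000Z")); // true
console.log(isoDateTime.test("2010-06-15T00:00+02:00"));   // true
console.log(isoDateTime.test("2010-14-15T00:00:00Z"));     // false (14th month)
console.log(isoDateTime.test("2010-06-15T29:00:00Z"));     // false (hour 29)
```

This still accepts impossible days such as February 31st; a regex alone gets long quickly once month lengths and leap years come into play.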

Note that, under the W3C profile, "2010-06-15T00:00:00" isn't legal, because it doesn't have the time-zone designation.

To match just an ISO date, like 2017-09-22, you can use this regexp:

^\d{4}-([0]\d|1[0-2])-([0-2]\d|3[01])$

It will match any four-digit year, any month given as two digits in the range 00-12, and any day given as two digits in the range 00-31.

I reworked the top answer into something a bit more concise. Instead of writing out each of the three optional patterns, the elements are nested as optional groups.

/[+-]?\d{4}(-[01]\d(-[0-3]\d(T[0-2]\d:[0-5]\d:?([0-5]\d(\.\d+)?)?[+-][0-2]\d:[0-5]\dZ?)?)?)?/

I'm curious if there are downsides to this approach?

You can find tests for my suggested answer here: http://regexr.com/3e0lh
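
Two downsides show up when the pattern is run (written here as a regex literal with single backslashes). Because it is unanchored and everything after the year is optional, any four digits produce a match. And because the numeric offset is mandatory before the optional Z, a timestamp ending in a bare Z only matches up to the date. A small sketch, with an illustrative variable name:

```javascript
var nested = /[+-]?\d{4}(-[01]\d(-[0-3]\d(T[0-2]\d:[0-5]\d:?([0-5]\d(\.\d+)?)?[+-][0-2]\d:[0-5]\dZ?)?)?)?/;

// Unanchored, with everything after the year optional: any four digits match.
console.log(nested.test("hello 2010 world"));             // true

// A bare "Z" makes the time group fail, so only the date part is matched.
console.log(nested.exec("2010-06-15T00:00:00Z")[0]);      // "2010-06-15"

// With a numeric offset the whole string is matched.
console.log(nested.exec("2010-06-15T00:00:00+02:00")[0]); // "2010-06-15T00:00:00+02:00"
```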

Here is a regular expression to check ISO 8601 date format including leap years and short-long months. To run this, you'll need to "ignore white-space". A compacted version without white-space is on regexlib: http://regexlib.com/REDetails.aspx?regexp_id=3344

There's more to ISO 8601 - this regex only cares for dates, but you can easily extend it to support time validation which is not that tricky.

Update: This now works with JavaScript (without lookbehinds)

  ^(?:
      (?=
            [02468][048]00
            |[13579][26]00
            |[0-9][0-9]0[48]
            |[0-9][0-9][2468][048]
            |[0-9][0-9][13579][26]              
      )

      \d{4}

      (?:

        (-|)

        (?:

            (?:
                00[1-9]
                |0[1-9][0-9]
                |[1-2][0-9][0-9]
                |3[0-5][0-9]
                |36[0-6]
            )
            |
                (?:01|03|05|07|08|10|12)
                (?:
                  \1
                  (?:0[1-9]|[12][0-9]|3[01])
                )?            
            |
                (?:04|06|09|11)
                (?:
                  \1
                  (?:0[1-9]|[12][0-9]|30)
                )?            
            |
                02
                (?:
                  \1
                  (?:0[1-9]|[12][0-9])
                )?

            |
                W(?:0[1-9]|[1-4][0-9]|5[0-3])
                (?:
                  \1
                  [1-7]
                )?

        )            
      )?
  )$
  |
  ^(?:
      (?!
            [02468][048]00
            |[13579][26]00
            |[0-9][0-9]0[48]
            |[0-9][0-9][2468][048]
            |[0-9][0-9][13579][26]              
      )

      \d{4}

      (?:

        (-|)

        (?:

            (?:
                00[1-9]
                |0[1-9][0-9]
                |[1-2][0-9][0-9]
                |3[0-5][0-9]
                |36[0-5]
            )
            |
                (?:01|03|05|07|08|10|12)
                (?:
                  \2
                  (?:0[1-9]|[12][0-9]|3[01])
                )?

            |
                (?:04|06|09|11)
                (?:
                  \2
                  (?:0[1-9]|[12][0-9]|30)
                )?
            |
                (?:02)
                (?:
                  \2
                  (?:0[1-9]|1[0-9]|2[0-8])
                )?
            |
                W(?:0[1-9]|[1-4][0-9]|5[0-3])
                (?:
                  \2
                  [1-7]
                )?
       ) 
    )?
)$

To cater for time, add something like this to the mixture (from: http://underground.infovark.com/2008/07/22/iso-date-validation-regex/ ):

([T\s](([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)?(\15([0-5]\d))?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?
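
JavaScript's RegExp has no "ignore white-space" option, so a spaced-out pattern like the date regex above has to be compacted before use. A minimal sketch of one way to do that: verboseRegExp is a hypothetical helper, and the pattern below is a much smaller stand-in (month lengths only, February always allowed 29 days), not the full regex.

```javascript
// Hypothetical helper: emulate "ignore white-space" by stripping all whitespace
// from the source. It does not handle comments or escaped literal spaces.
function verboseRegExp(source, flags) {
    return new RegExp(source.replace(/\s+/g, ""), flags);
}

// String.raw keeps backslashes such as \d intact inside the template literal.
var yearMonthDay = verboseRegExp(String.raw`
    ^ \d{4} -
    (?:
        (?:01|03|05|07|08|10|12) - (?:0[1-9]|[12][0-9]|3[01])
      | (?:04|06|09|11)          - (?:0[1-9]|[12][0-9]|30)
      | 02                       - (?:0[1-9]|[12][0-9])
    )
    $
`);

console.log(yearMonthDay.test("2020-02-29")); // true
console.log(yearMonthDay.test("2020-04-31")); // false
```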

The ISO 8601 specification allows a wide variety of date formats. There's a mediocre explanation as to how to do it here. There is a fairly minor discrepancy between JavaScript's date input formatting and the ISO formatting for simple dates that do not specify time zones, and it can easily be mitigated using a string substitution. Fully supporting the ISO 8601 specification is non-trivial.

Here is a reference example which I do not guarantee to be complete, although it parses the non-duration dates from the aforementioned Wikipedia page.

Below is an example, and you can also see its output on ideone. Unfortunately, it does not work to specification, as it does not properly implement weeks. The definition of week number 01 in ISO 8601 is non-trivial and requires some browsing of the calendar to determine where week one begins, and what exactly it means in terms of the number of days in the specified year. This can probably be corrected fairly easily (I'm just tired of playing with it).

function parseISODate (input) {
    var iso = /^(\d{4})(?:-?W(\d+)(?:-?(\d+)D?)?|(?:-(\d+))?-(\d+))(?:[T ](\d+):(\d+)(?::(\d+)(?:\.(\d+))?)?)?(?:Z(-?\d*))?$/;

    var parts = input.match(iso);

    if (parts == null) {
        throw new Error("Invalid Date");
    }

    var year = Number(parts[1]);

    if (typeof parts[2] != "undefined") {
        /* Convert weeks to days, months 0 */
        var weeks = Number(parts[2]) - 1;
        var days  = 0;

        /* The weekday is optional; Number(undefined) would be NaN */
        if (typeof parts[3] != "undefined") {
            days = Number(parts[3]);
        }

        days += weeks * 7;

        var months = 0;
    }
    else {
        if (typeof parts[4] != "undefined") {
            var months = Number(parts[4]) - 1;
        }
        else {
            /* it's an ordinal date... */
            var months = 0;
        }

        var days   = Number(parts[5]);
    }

    if (typeof parts[6] != "undefined" &&
        typeof parts[7] != "undefined")
    {
        var hours        = Number(parts[6]);
        var minutes      = Number(parts[7]);

        if (typeof parts[8] != "undefined") {
            var seconds      = Number(parts[8]);

            if (typeof parts[9] != "undefined") {
                /* Treat the digits as a decimal fraction of a second */
                var fractional   = Number("0." + parts[9]);
                var milliseconds = Math.round(fractional * 1000);
            }
            else {
                var milliseconds = 0;
            }
        }
        else {
            var seconds      = 0;
            var milliseconds = 0;
        }
    }
    else {
        var hours        = 0;
        var minutes      = 0;
        var seconds      = 0;
        var fractional   = 0;
        var milliseconds = 0;
    }

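    if (typeof parts[10] != "undefined") {
        /* Timezone adjustment: shift from the given offset (in hours; a bare "Z"
           means 0) to the local zone, because the Date constructor below
           interprets its arguments as local time */
        var localzone = -(new Date().getTimezoneOffset());
        var timezone  = Number(parts[10]) * 60;

        minutes = Number(minutes) + (localzone - timezone);
    }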

    return new Date(year, months, days, hours, minutes, seconds, milliseconds);
}

print(parseISODate("2010-06-29T15:33:00Z-7"))
print(parseISODate("2010-06-29 06:14Z"))
print(parseISODate("2010-06-29T06:14Z"))
print(parseISODate("2010-06-29T06:14:30.2034Z"))
print(parseISODate("2010-W26-2"))
print(parseISODate("2010-180"))
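
The week rule that the function above leaves out can be sketched separately. ISO 8601 defines week 01 as the week containing January 4th (equivalently, the week with the year's first Thursday), and weeks start on Monday. The helper below is a hypothetical, standalone sketch under that definition; it works in UTC and does no range checking on its arguments.

```javascript
// Hypothetical helper: convert an ISO week date (year, week, weekday 1-7,
// Monday = 1) to a Date at midnight UTC.
function isoWeekToDate(year, week, weekday) {
    var DAY_MS = 24 * 60 * 60 * 1000;

    // Week 01 is the week that contains January 4th.
    var jan4 = new Date(Date.UTC(year, 0, 4));

    // getUTCDay() returns 0 for Sunday; ISO numbers Sunday as 7.
    var jan4Weekday = jan4.getUTCDay() || 7;

    // Monday of week 01 (may fall in December of the previous year).
    var week1Monday = Date.UTC(year, 0, 4 - (jan4Weekday - 1));

    return new Date(week1Monday + ((week - 1) * 7 + (weekday - 1)) * DAY_MS);
}

console.log(isoWeekToDate(2010, 26, 2).toISOString().slice(0, 10)); // "2010-06-29"
console.log(isoWeekToDate(2009, 53, 7).toISOString().slice(0, 10)); // "2010-01-03"
```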

I wrote this regex, which validates dates as they come out of JavaScript's .toISOString() method.

^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$

Covered:

  • Proper symbols ('-', 'T', ':', '.', 'Z') in proper places.
  • Consistency with months of 29, 30 or 31 days.
  • Hours from 00 to 23.
  • Minutes and seconds from 00 to 59.
  • Milliseconds from 000 to 999.

Not covered:

  • Leap years.

Example date: 2019-11-15T13:34:22.178Z

Example to run directly in Chrome console: /^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$/.test("2019-11-15T13:34:22.178Z");
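
The leap-year gap can be closed without growing the regex by round-tripping the string through Date after the shape check. This is only a sketch: isRealIsoTimestamp and isoShape are made-up names, and it assumes the input is meant to be exactly what .toISOString() produces.

```javascript
// Shape check: the .toISOString() pattern from this answer, as a regex literal.
var isoShape = /^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$/;

// Hypothetical helper: shape check plus a round trip through Date.
function isRealIsoTimestamp(s) {
    if (!isoShape.test(s)) {
        return false;
    }
    var d = new Date(s);
    // Depending on the engine, February 29th of a non-leap year may fail to
    // parse or roll over into March; either way this comparison fails.
    return !isNaN(d.getTime()) && d.toISOString() === s;
}

console.log(isRealIsoTimestamp("2020-02-29T00:00:00.000Z")); // true
console.log(isRealIsoTimestamp("2019-02-29T00:00:00.000Z")); // false
```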


yyyy-MM-dd

Most of the answers here carry too much explanation. Here's a short variation of @Sergey's answer that addresses some weird scenarios (like 2020-00-00). This RegExp only cares about the yyyy-MM-dd date:

// yyyy-MM-dd
^\d{4}-([0][1-9]|1[0-2])-([0-2][1-9]|[1-3]0|3[01])$

Note that this one doesn't check the number of days per month, so it still accepts dates like 2020-11-31 (even though November has only 30 days).

My use case was converting a String into a Date (from an API param), and I only needed to know that the input string didn't contain strange stuff; I do the next validation against an actual Date object.
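
That second validation step could look roughly like the following. It is a sketch under assumptions, not the code from my project: isValidCalendarDate is a hypothetical name, and a capture group for the year has been added to the pattern so that the parts can be compared with what Date makes of them.

```javascript
// The yyyy-MM-dd pattern from this answer, with a capture group for the year.
var dateShape = /^(\d{4})-(0[1-9]|1[0-2])-([0-2][1-9]|[1-3]0|3[01])$/;

// Hypothetical helper: reject dates that Date would silently roll over,
// such as 2020-11-31 becoming 2020-12-01.
function isValidCalendarDate(s) {
    var m = dateShape.exec(s);
    if (m === null) {
        return false;
    }
    var year = Number(m[1]), month = Number(m[2]), day = Number(m[3]);

    // setUTCFullYear avoids the special handling of years 0-99 in Date.UTC.
    var d = new Date(0);
    d.setUTCFullYear(year, month - 1, day);

    return d.getUTCFullYear() === year &&
           d.getUTCMonth() === month - 1 &&
           d.getUTCDate() === day;
}

console.log(isValidCalendarDate("2020-11-30")); // true
console.log(isValidCalendarDate("2020-11-31")); // false
```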

Validation of 02/29 from year 1900 to 2999

 (((2000|2400|2800|((19|2[0-9])(0[48]|[2468][048]|[13579][26])))-02-29)|(((19|2[0-9])[0-9]{2})-02-(0[1-9]|1[0-9]|2[0-8]))|(((19|2[0-9])[0-9]{2})-(0[13578]|10|12)-(0[1-9]|[12][0-9]|3[01]))|(((19|2[0-9])[0-9]{2})-(0[469]|11)-(0[1-9]|[12][0-9]|30)))T([01][0-9]|[2][0-3]):[0-5][0-9]:[0-5][0-9]\.[0-9]{3}Z
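
The year alternation in the first branch encodes the Gregorian leap-year rule for this range. As a quick sanity check, it can be compared with the arithmetic rule for every year from 1900 to 2999. A sketch, in which leapYearShape and isLeapYear are illustrative names and the sub-pattern has been copied out and anchored:

```javascript
// Year alternation taken from the first branch of the pattern above.
var leapYearShape = /^(?:2000|2400|2800|(?:19|2[0-9])(?:0[48]|[2468][048]|[13579][26]))$/;

// Gregorian rule: divisible by 4, except centuries not divisible by 400.
function isLeapYear(year) {
    return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// Collect every year on which the regex and the arithmetic rule disagree.
var mismatches = [];
for (var year = 1900; year <= 2999; year++) {
    if (leapYearShape.test(String(year)) !== isLeapYear(year)) {
        mismatches.push(year);
    }
}

console.log(mismatches.length); // 0
```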

Brock's answers are good, but they should start with ^ and end with $ so as not to allow prefix/suffix characters if all you are trying to match is the date string alone.
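
One caveat when adding the anchors to the "putting it all together" patterns: with a top-level alternation, ^A|B$ anchors only the first and last alternatives, so the whole pattern has to be wrapped in a group first. A minimal sketch, where anchored is a hypothetical helper and loose is a shortened two-alternative pattern:

```javascript
// Hypothetical helper: wrap a pattern so that it must match the whole string.
function anchored(re) {
    return new RegExp("^(?:" + re.source + ")$", re.flags);
}

// Two of the no-time-zone variants joined by a top-level alternation.
var loose  = /\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d|\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d/;
var strict = anchored(loose);

// Naive anchoring: ^ binds to the first alternative only, $ to the last only.
var naive  = new RegExp("^" + loose.source + "$");

console.log(loose.test("junk 2010-06-15T00:00 junk")); // true
console.log(naive.test("junk 2010-06-15T00:00"));      // true (prefix slips through)
console.log(strict.test("junk 2010-06-15T00:00"));     // false
console.log(strict.test("2010-06-15T00:00:00"));       // true
```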

When using QRegExp with IsoDateWithMs, the millisecond patterns here did not work. Instead, the following saved the day.

\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d{1,3}

(I know this is a JS entry, but it pops up first and would be helpful for C++ devs.)

Here is my take on this:

^\d{4}-(?:0[1-9]|1[0-2])-(?:[0-2][1-9]|[1-3]0|3[01])T(?:[0-1][0-9]|2[0-3])(?::[0-6]\d)(?::[0-6]\d)?(?:\.\d{3})?(?:[+-][0-2]\d:[0-5]\d|Z)?$

Examples for a match:

2016-12-31T23:59:60+12:30
2021-05-10T09:05:12.000Z
3015-01-01T23:00+02:00
1001-01-31T23:59:59Z
2023-12-20T20:20

The minutes and seconds part could be refined more, but this is good enough for me.
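
For reference, one possible refinement is sketched below: minutes limited to 00-59, seconds to 00-59 plus 60 for a leap second (as in the first example), and the fraction allowed only after seconds. The variable name is illustrative, and this is just one way to tighten it.

```javascript
// Tighter variant: minutes 00-59, seconds 00-60, day 00 still rejected.
var refined = /^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])T(?:[01]\d|2[0-3]):[0-5]\d(?::(?:[0-5]\d|60)(?:\.\d{3})?)?(?:[+-][0-2]\d:[0-5]\d|Z)?$/;

console.log(refined.test("2016-12-31T23:59:60+12:30")); // true
console.log(refined.test("2023-12-20T20:20"));          // true
console.log(refined.test("2023-12-20T20:69"));          // false
console.log(refined.test("2023-12-20T20:20:61"));       // false
```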


Not sure if it's relevant to the underlying problem you are trying to solve, but you can pass an ISO date string as a constructor arg to Date() and get an object out of it. The constructor is actually very flexible in terms of coercing a string into a Date.
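
A minimal sketch of that approach follows; parseIsoOrNull is a made-up helper name. Note that the flexibility cuts both ways: strings that are not in the ISO format fall back to implementation-specific parsing, and a date-time without an offset is interpreted as local time, whereas a date-only string is interpreted as UTC.

```javascript
// Hypothetical helper: return a Date, or null if the string cannot be parsed.
function parseIsoOrNull(s) {
    var d = new Date(s);
    // An unparseable string yields an "Invalid Date", whose time value is NaN.
    return isNaN(d.getTime()) ? null : d;
}

console.log(parseIsoOrNull("2010-06-15T00:00:00Z").toISOString()); // "2010-06-15T00:00:00.000Z"
console.log(parseIsoOrNull("garbage"));                            // null
```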
