Regex for matching dates

How can I extract the validity start and end dates from this piece of XML with a regular expression?

<Response>
  <Identification v="XXXXX"/>
  <Type v="YYY"/>
  <CreationDateTime v="2013-01-18T10:00:00Z"/>
  <ValidityPeriod v="2013-01-21T05:00Z/2013-01-22T05:00Z"/>
 <The rest of the file I'm not interested in..../>

So far I have [1-9][0-9]{3}-.+?T.+?Z/.+?Z, which finds the value of the attribute; I then split that string into two date strings. Alternatively, I can use [1-9][0-9]{3}-.+?T[^.]+?(Z|[+-].+), which finds three dates, and use only the last two.

But how do I get exactly two matches, one for each of the two dates?

I have to extract some XML files from an archive (which contains many large XML files), and for performance reasons I can't deserialize all the files.

Use JDOM or another XML parsing library instead of regular expressions. It will simplify parsing this text. Alternatively, you know that the element is named "ValidityPeriod", that the attribute is named "v", and that the value is enclosed in double quotes. You can use all of that information to your advantage and parse the text with plain string splitting, which makes it easier to get at the lines and values you're interested in.
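The string-splitting idea can be sketched roughly as follows. This is only an illustration in Perl, not the answerer's own code: the helper name validity_period is hypothetical, and it assumes the element is written exactly as `<ValidityPeriod v="..."/>` with a single `/` separating the two dates.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (start, end) taken from the ValidityPeriod
# element, or an empty list when the element or its value is not found.
sub validity_period {
    my ($xml) = @_;
    my $marker = '<ValidityPeriod v="';          # assumed exact spelling
    my $start  = index($xml, $marker);
    return if $start < 0;
    $start += length $marker;
    my $end = index($xml, '"', $start);          # closing quote of the attribute
    return if $end < 0;
    my $value = substr($xml, $start, $end - $start);
    my @dates = split m{/}, $value, 2;           # "start/end" -> two strings
    return @dates == 2 ? @dates : ();
}

my $xml = <<'XML';
<Response>
  <CreationDateTime v="2013-01-18T10:00:00Z"/>
  <ValidityPeriod v="2013-01-21T05:00Z/2013-01-22T05:00Z"/>
XML

my ($from, $to) = validity_period($xml);
print "$from $to\n" if defined $from;
```

Because it only uses index, substr and split, this avoids building a document tree, which may matter when scanning many large files. It will miss the element if the attribute is written with single quotes or extra whitespace.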

Try:

# $xml holds the XML text; $d is a loose pattern for a single date-time
my $d = qr([1-9][0-9]{3}-.+?T.+?Z);
my ($d1, $d2) = ($xml =~ /ValidityPeriod v="($d)\/($d)"/);
print "$d1 $d2\n" if defined $d1;

The $d regexp can be as strict or as loose as you want. Because the literal text around the two groups already anchors the match, even ".*" would be enough :-)
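For the stricter end of that range, a possible variant is sketched below. The date shape is an assumption (YYYY-MM-DDThh:mm with optional seconds, followed by Z or a numeric offset), based only on the sample values in the question, and the helper name validity_dates is hypothetical.

```perl
use strict;
use warnings;

# Assumed shape: YYYY-MM-DDThh:mm[:ss], then Z or a +hh:mm / -hh:mm offset.
my $date = qr/[1-9][0-9]{3}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}(?::[0-9]{2})?(?:Z|[+-][0-9]{2}:[0-9]{2})/;

# Hypothetical helper: in list context returns (start, end),
# or an empty list when the element does not match.
sub validity_dates {
    my ($xml) = @_;
    return $xml =~ m{<ValidityPeriod\s+v="($date)/($date)"};
}

my $line = '<ValidityPeriod v="2013-01-21T05:00Z/2013-01-22T05:00Z"/>';
my ($from, $to) = validity_dates($line);
print "start=$from end=$to\n" if defined $from;
```

Anchoring on the element name means the CreationDateTime value is never considered, so the match yields exactly two captures and there is no third date to discard.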
