
Date regular expression in PHP

I'm having trouble writing a regex (I'm not proficient with them) and have not been able to figure out what my next step should be. What I'm trying to do is extract some blocks of text into an array using PHP. The text looks like:

Saturday, August 03, 2013  
DUMP                   Pickup:   LITTLE ROCK, AR  
Dest:  CALDWELL, TX   
HOPPER                Pickup:   BEECH GROVE, IN  
Dest:  TERRE HAUTE, IN  
Sunday, August 04, 2013  
HOPPER                Pickup:   JONESBORO, AR  
Dest:  BATTLE CREEK, MI  
LIVE BOTTOM         Pickup:   JONESBORO, AR  
Dest:  TAYLOR, MO

Because of the formatting here I can't show all the spaces; for instance, between DUMP and Pickup there are about 3 tabs' worth of spaces.

So what I want is to put the blocks, including their dates, into an array. Using ^(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,(.*) gives me just the lines with the date on them, and using ((.|\n)*) instead of (.*) selects all of the text. How can I make this regex select from a date all the way to the last entry before a new date appears, assuming n entries per date?

You can use this code:

$s = <<< EOF
Saturday, August 03, 2013
DUMP Pickup: LITTLE ROCK, AR
Dest: CALDWELL, TX
HOPPER Pickup: BEECH GROVE, IN
Dest: TERRE HAUTE, IN
Sunday, August 04, 2013
HOPPER Pickup: JONESBORO, AR
Dest: BATTLE CREEK, MI
LIVE BOTTOM Pickup: JONESBORO, AR
Dest: TAYLOR, MO
EOF;
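// Match a weekday name, then lazily consume text (the s flag lets "." span
// newlines) until the next weekday line or the end of the string.
if (preg_match_all(
  "~(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,(.+?)(?=\n(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,|$)~s", $s, $arr))
   var_dump($arr[0]); // $arr[0] holds the full matches, one per date block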

OUTPUT

array(2) {
  [0]=>
  string(126) "Saturday, August 03, 2013
DUMP Pickup: LITTLE ROCK, AR
Dest: CALDWELL, TX
HOPPER Pickup: BEECH GROVE, IN
Dest: TERRE HAUTE, IN"
  [1]=>
  string(126) "Sunday, August 04, 2013
HOPPER Pickup: JONESBORO, AR
Dest: BATTLE CREEK, MI
LIVE BOTTOM Pickup: JONESBORO, AR
Dest: TAYLOR, MO"
}
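As an alternative sketch (not part of the answer above), the same lookahead idea also works with preg_split(), which splits the text in front of every line that starts with a weekday name. The helper name split_by_date is hypothetical, and the sketch assumes each date sits at the start of its own line; the m flag makes ^ match at every line start.

```php
<?php
// split_by_date() is a hypothetical helper name used only for illustration.
function split_by_date($text)
{
    $days = '(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,';
    // Zero-width split: the lookahead keeps the date line inside the block
    // that follows it; PREG_SPLIT_NO_EMPTY drops the empty leading piece.
    $blocks = preg_split('~^(?=' . $days . ')~m', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($blocks === false) {
        return array(); // the pattern failed to run
    }
    // Trim trailing newlines and spaces from each block.
    return array_map('trim', $blocks);
}

$sample = "Saturday, August 03, 2013  \n"
        . "DUMP                   Pickup:   LITTLE ROCK, AR  \n"
        . "Dest:  CALDWELL, TX   \n"
        . "Sunday, August 04, 2013  \n"
        . "HOPPER                Pickup:   JONESBORO, AR  \n"
        . "Dest:  BATTLE CREEK, MI  ";
print_r(split_by_date($sample));
```

Because the split is zero-width, any number of entries per date ends up in the same block.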

With regex I always play here first: http://regexpal.com/

Then you will need to use preg_match, which fills an array with the matches:

preg_match('/(^\w+day).+(\d{1,2})/', $str, $matches);

print_r($matches);

It should print an array like this:

 Saturday and dates ...
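To make the preg_match idea concrete, here is a minimal sketch that pulls the parts out of a single date line with named groups. The function name parse_date_line and the stricter pattern are my own assumptions, not the pattern from the answer above, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// parse_date_line() is a hypothetical helper; it returns null when the
// line does not look like "Saturday, August 03, 2013".
function parse_date_line(string $line): ?array
{
    $pattern = '/^(?<weekday>\w+day),\s+(?<month>\w+)\s+(?<day>\d{1,2}),\s+(?<year>\d{4})/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return array(
        'weekday' => $m['weekday'],
        'month'   => $m['month'],
        'day'     => (int) $m['day'],   // "03" becomes 3
        'year'    => (int) $m['year'],
    );
}

print_r(parse_date_line('Saturday, August 03, 2013'));
```

The pattern is anchored at the start only, so trailing spaces after the year do not prevent a match.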

Each related chunk is its own array, with the date always at index 0 and the other lines at predictable positions as well. This assumes every date has exactly two entries (five lines per block). A little strstr() or explode() can then get similar results from each line.

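// FILE_IGNORE_NEW_LINES strips the trailing newline from every line
$lines = file($filename, FILE_IGNORE_NEW_LINES);
// group the lines five at a time: one date line plus two Pickup/Dest pairs
$chunks = array_chunk($lines, 5);
print_r($chunks);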

Array
(
    [0] => Array
        (
            [0] => Saturday, August 03, 2013
            [1] => DUMP                   Pickup:   LITTLE ROCK, AR
            [2] => Dest:  CALDWELL, TX
            [3] => HOPPER                Pickup:   BEECH GROVE, IN
            [4] => Dest:  TERRE HAUTE, IN
        )

    [1] => Array
        (
            [0] => Sunday, August 04, 2013
            [1] => HOPPER                Pickup:   JONESBORO, AR
            [2] => Dest:  BATTLE CREEK, MI
            [3] => LIVE BOTTOM         Pickup:   JONESBORO, AR
            [4] => Dest:  TAYLOR, MO
        )

)
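As a rough sketch of the strstr()/explode() step mentioned above, the two helpers below take one line from a chunk apart. Their names are hypothetical, and they assume the line really contains "Pickup:" or a colon respectively.

```php
<?php
// Hypothetical helper: splits "DUMP   Pickup:   LITTLE ROCK, AR" into parts.
function split_pickup_line($line)
{
    // A limit of 2 makes explode() split on the first "Pickup:" only.
    list($type, $place) = explode('Pickup:', $line, 2);
    return array('type' => trim($type), 'pickup' => trim($place));
}

// Hypothetical helper: returns the text after the first colon of a Dest line.
function dest_of($line)
{
    // strstr() returns the string from the first ':' onward (colon included),
    // so substr(..., 1) drops the colon before trimming.
    return trim(substr(strstr($line, ':'), 1));
}

print_r(split_pickup_line('DUMP                   Pickup:   LITTLE ROCK, AR  '));
echo dest_of('Dest:  CALDWELL, TX   '), "\n";
```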

I agree that a parser should be written, and I'm bored, so this is what I came up with:

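function parse_( $str ) {
    $data = array();
    $date = $string = '';
    foreach( explode( "\n", $str ) as $line ) {
        $line = trim( $line );
        if ( $line === '' ) {
            continue; // skip blank lines
        }
        if ( strpos( $line, ':' ) === false ) {
            // no colon: this is a date line, which starts a new block
            $date = $line;
        }
        elseif ( stripos( $line, 'pickup:' ) !== false ) {
            // remember the pickup line until its Dest line arrives
            $string = $line;
        }
        else {
            // Dest line: append the destination to the stored pickup line
            $data[$date][] = $string . ' -> ' . trim( explode( ':', $line, 2 )[1] );
        }
    }
    return $data;
}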

print_r( parse_( $str ) );

Output

Array
(
    [Saturday, August 03, 2013] => Array
        (
            [0] => DUMP Pickup: LITTLE ROCK, AR -> CALDWELL, TX
            [1] => HOPPER Pickup: BEECH GROVE, IN -> TERRE HAUTE, IN
        )

    [Sunday, August 04, 2013] => Array
        (
            [0] => HOPPER Pickup: JONESBORO, AR -> BATTLE CREEK, MI
            [1] => LIVE BOTTOM Pickup: JONESBORO, AR -> TAYLOR, MO
        )

)

It loops over every line, using strpos() to figure out which "type" of line it is.

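If you're using PHP < 5.4 (I believe), you'll have to change the last else branch: indexing the result of a function call directly, as in explode(...)[1], is not available there, so explode the line into a variable first and then index that variable.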
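For illustration, the two-step version might look like the sketch below. The helper name dest_value is hypothetical, and the line is assumed to contain ': '.

```php
<?php
// Hypothetical helper showing the two-step form that avoids explode(...)[1].
function dest_value($line)
{
    $parts = explode(': ', $line); // first step: explode into a variable
    return $parts[1];              // second step: index the variable
}

echo dest_value('Dest: CALDWELL, TX'), "\n";
```

Inside the parser, the last branch would then append $string . ' -> ' . dest_value( $line ) instead of indexing explode() directly.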

http://ideone.com/heb4ty

