
PHP: split string based on array

Below is the data I'm trying to parse:

50‐59 1High300.00 Avg300.00
90‐99 11High222.00 Avg188.73
120‐1293High204.00 Avg169.33

The first section is a weight range, next is a count, followed by Highprice, ending with Avgprice.

As an example, I need to parse the data above into an array which would look like

[0]50-59
[1]1
[2]High300.00
[3]Avg300.00

[0]90-99
[1]11
[2]High222.00
[3]Avg188.73

[0]120‐129
[1]3
[2]High204.00
[3]Avg169.33

I thought about creating an array of what the possible weight ranges can be but I can't figure out how to use the values of the array to split the string.

$arr = array("10-19","20-29","30-39","40-49","50-59","60-69","70-79","80-89","90-99","100-109","110-119","120-129","130-139","140-149","150-159","160-169","170-179","180-189","190-199","200-209","210-219","220-229","230-239","240-249","250-259","260-269","270-279","280-289","290-299","300-309");

Any ideas would be greatly appreciated.

Hope this will work:

    $string='50-59 1High300.00 Avg300.00
    90-99 11High222.00 Avg188.73
    120-129 3High204.00 Avg169.33';

    $requiredData=array();
    $dataArray=explode("\n",$string);
    $counter=0;
    foreach($dataArray as $data)
    {
        if(preg_match('#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#', $data,$matches))    
        {
            $requiredData[$counter][]=$matches[1];
            $requiredData[$counter][]=$matches[2];
            $requiredData[$counter][]=$matches[3];
            $requiredData[$counter][]=$matches[4];
            $counter++;
        }
    }
    print_r($requiredData);
'#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#'

I don't think that will work because of the space you have in the regex between the weight and count. The thing I'm struggling with is a row like this, where there is no space: 120‐1293High204.00 Avg169.33. It needs to be parsed like [0]120‐129 [1]3 [2]High204.00 [3]Avg169.33

You are right. That can be remedied by limiting the number of weight digits to three and making the space optional.

'#^(\d+-\d{1,3}) *…

$arr = array('50-59 1High300.00 Avg300.00', 
             '90-99 11High222.00 Avg188.73', 
             '120-129 3High204.00 Avg169.33');

$result = []; // initialise so print_r() works even if nothing matches
foreach($arr as $str) {
    if (preg_match('/^(\d+-\d{1,3})\s*(\d+)(High\d+\.\d\d) (Avg\d+\.\d\d)/i', $str, $m)) {
        array_shift($m); //remove group 0 (ie. the whole match)
        $result[] = $m;
    }
}
print_r($result);

Output:

Array
(
    [0] => Array
        (
            [0] => 50-59
            [1] => 1
            [2] => High300.00
            [3] => Avg300.00
        )

    [1] => Array
        (
            [0] => 90-99
            [1] => 11
            [2] => High222.00
            [3] => Avg188.73
        )

    [2] => Array
        (
            [0] => 120-129
            [1] => 3
            [2] => High204.00
            [3] => Avg169.33
        )

)

Explanation:

/                   : regex delimiter
    ^               : beginning of string
    (               : start group 1
      \d+-\d{1,3}   : 1 or more digits, a dash, and 1 up to 3 digits, i.e. the weight range
    )               : end group 1
    \s*             : 0 or more whitespace characters
    (\d+)           : group 2, i.e. the count
    (High\d+\.\d\d) : group 3, literal High followed by a price
    <space>         : a literal space between groups 3 and 4
    (Avg\d+\.\d\d)  : group 4, literal Avg followed by a price
/i                  : regex delimiter and case-insensitive modifier

To be more generic, you could replace High and Avg with [a-z]+
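A minimal sketch of that more generic variant, assuming the price labels are always plain runs of letters. The helper name parseRow is made up for this illustration and is not part of the answer above:

```php
<?php
// Hypothetical helper (not from the answer above): parse one row using the
// generic [a-z]+ label in place of the literal High / Avg.
function parseRow(string $row): ?array
{
    $pattern = '/^(\d+-\d{1,3})\s*(\d+)([a-z]+\d+\.\d\d) ([a-z]+\d+\.\d\d)/i';
    if (!preg_match($pattern, $row, $m)) {
        return null; // row does not fit the expected layout
    }
    array_shift($m); // drop group 0 (the whole match)
    return $m;
}

// The row without a space between weight range and count still splits,
// because \d{1,3} caps the range end at three digits.
print_r(parseRow('120-1293High204.00 Avg169.33'));
```

Note that a generic label also accepts rows whose labels are something other than High and Avg, so it trades strictness for flexibility.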

This is a pattern you can trust ( Pattern Demo ):

/^((\d{0,2})0‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/m

The other answers overlooked the digit pattern in the weight range substring. The range start integer always ends in 0, and the range end integer always ends in 9; the range always spans ten integers.

My pattern will capture the digits that precede the 0 in the starting integer and reference them immediately after the dash, then require that captured number to be followed by a 9.
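To see the backreference at work in isolation, here is a small sketch that validates only the weight range. It assumes a plain ASCII hyphen for readability, and the function name isDecadeRange is hypothetical:

```php
<?php
// Hypothetical helper: true when $range looks like "N0-N9" with the same
// leading digits N on both sides (N may be empty, as in "0-9").
function isDecadeRange(string $range): bool
{
    // (\d{0,2}) captures the leading digits; (?:\1) demands them again after
    // the dash. The (?:...) wrapper keeps the literal 9 that follows from
    // being read as part of the backreference number.
    return preg_match('/^(\d{0,2})0-(?:\1)9$/', $range) === 1;
}

var_dump(isDecadeRange('120-129')); // bool(true)
var_dump(isDecadeRange('50-69'));   // bool(false)
```

Because the end of the range is pinned to the captured digits, the pattern knows exactly where the range stops, even when the count follows without a space.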

I want to point out that your sample input was a little bit tricky because the ‐ in it is not the standard - (hyphen-minus) that sits between the 0 and = on my keyboard. This was a sneaky little gotcha for me to solve.
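One way to sidestep that gotcha, offered as a sketch rather than as part of this answer, is to let the pattern accept either hyphen character. It assumes the input is UTF-8 and that the non-standard character is U+2010 (HYPHEN); parseLine is a hypothetical name:

```php
<?php
// Hypothetical helper: parse one line, accepting either "-" (U+002D) or
// U+2010 as the dash inside the weight range.
function parseLine(string $line): ?array
{
    // The u modifier makes PCRE treat pattern and subject as UTF-8, so
    // \x{2010} inside the character class matches one whole character.
    $pattern = '/^((\d{0,2})0[-\x{2010}](?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/u';
    if (!preg_match($pattern, $line, $m)) {
        return null; // line does not fit the expected layout
    }
    return [
        'weight range' => $m[1],
        'count'        => $m[3],
        'Highprice'    => $m[4],
        'Avgprice'     => $m[5],
    ];
}

var_export(parseLine("120\u{2010}1293High204.00 Avg169.33"));
var_export(parseLine('120-1293High204.00 Avg169.33'));
```

Normalising the input first (for example with str_replace) would work too; the character class simply avoids modifying the captured text.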

Method ( Demo ):

$text = '50‐59 1High300.00 Avg300.00
90‐99 11High222.00Avg188.73
120‐1293High204.00 Avg169.33';

preg_match_all(
    '/^((\d{0,2})0‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/m',
    $text,
    $matches,
    PREG_SET_ORDER
);

var_export(
    array_map(
        fn($captured) => [
            'weight range' => $captured[1],
            'count' => $captured[3],
            'Highprice' => $captured[4],
            'Avgprice' => $captured[5]
        ],
        $matches
    )
);

Output:

array (
  0 => 
  array (
    'weight range' => '50‐59',
    'count' => '1',
    'Highprice' => '300.00',
    'Avgprice' => '300.00',
  ),
  1 => 
  array (
    'weight range' => '90‐99',
    'count' => '11',
    'Highprice' => '222.00',
    'Avgprice' => '188.73',
  ),
  2 => 
  array (
    'weight range' => '120‐129',
    'count' => '3',
    'Highprice' => '204.00',
    'Avgprice' => '169.33',
  ),
)

