PHP in_array / strpos does not work - How to find an 'exact' match?

OK..

I'm working blind on a "big" product web app...

We've got a couple of thousand products each with a bunch of data elements coming in from multiple vendors in various formats... so, needless to say, we can't see the data...

Here's the short version of today's problem...

We want to extract the "size" from the 'product name'

$product_name = "Socket Assembly w/ 25 ft Lamp Cord - 14 Gauge ";

and here's part of the Sizes array...

$lookForTheseSizes = array( ...'Gallon','gal','Gal','G','Gram','gram','g','gm','Gauge','gauge'... );

The Sizes array, currently with around 100 values, is built dynamically and may change with new values added without notice.

So this script does not always work... as it is dependent on how the Sizes array values are ordered.

foreach ($lookForTheseSizes as $key => $value){
    if (strpos( $nameChunk,$value) !== false) { 

        echo 'match '.$nameChunk.' => '.$value.'<br/>';                 

        $size = $value; 
        break;
    }
}

For example... when $nameChunk = "Gauge" ... the script returns a "match" on 'g' first....
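
To illustrate the failure described above, here is a minimal sketch. The `$sizes` list is a hypothetical cut-down version of the real array, and `$nameChunk` is assumed to hold a single word taken from the product name. It shows that strpos() accepts any substring, while in_array() in strict mode only accepts a whole-string, case-sensitive match.

```php
<?php
// Hypothetical sample values, chosen only to illustrate the problem.
$nameChunk = 'Gauge';
$sizes = array('G', 'g', 'gm', 'Gauge', 'gauge');

// strpos() reports a hit for any substring, so short units
// such as 'G' or 'g' match inside the longer word 'Gauge'.
$substringHits = array();
foreach ($sizes as $value) {
    if (strpos($nameChunk, $value) !== false) {
        $substringHits[] = $value;
    }
}

// in_array() with the strict flag only succeeds when the whole
// string is present in the array (case-sensitive comparison).
$exactHit = in_array($nameChunk, $sizes, true);

print_r($substringHits);
var_dump($exactHit);
```

With a `break` on the first hit, whichever short unit happens to come first in the array wins, which is why the result depends on the ordering.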

So... my question is this... Is there a way -regex or a standard PHP 5.4 or better function- to do an exact find/match ... WITHOUT first sorting the Sizes array ?

$product_name = "Socket Assembly w/ 25 ft Lamp Cord - 14 Gauge ";

$lookForTheseSizes = array('Gallon', 'gal', 'Gal', 'G', 'Gram', 'gram', 'g',
                           'gm', 'Gauge', 'gauge', 'ft');
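foreach ($lookForTheseSizes as $unit) {
    // A number, optional whitespace, then the unit as a whole word:
    // the trailing \b keeps 'G' from matching inside 'Gauge'.
    // Passing '/' to preg_quote() also escapes the pattern delimiter.
    if (preg_match('/(?P<size>[\d.]+)\s*' . preg_quote($unit, '/') . '\b/U',
        $product_name, $matches)) {
        echo $matches['size'] . " " . $unit . "\n";
    }
}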

Result

14 Gauge
25 ft

Or..

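// Quote each unit for a '/'-delimited pattern, then join them into one alternation.
$units = join('|', array_map(function ($u) { return preg_quote($u, '/'); },
                             $lookForTheseSizes));

if (preg_match_all('/(?P<size>[\d.]+)\s*(?P<unit>' . $units . ')\b/U',
             $product_name, $matches)) {
    var_dump($matches);
}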

Inspect $matches and pick out what you need. The relevant entries of the dump:

  [0]=>
  array(2) {
    [0]=>
    string(5) "25 ft"
    [1]=>
    string(8) "14 Gauge"
  }
  ["size"]=>
  array(2) {
    [0]=>
    string(2) "25"
    [1]=>
    string(2) "14"
  }
  ["unit"]=>
  array(2) {
    [0]=>
    string(2) "ft"
    [1]=>
    string(5) "Gauge"
  }
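
One possible way to walk the result, sketched here as an assumption rather than as part of the original answer: passing PREG_SET_ORDER to preg_match_all() groups the captures per match, so each size stays next to its unit. The `$found` array is a hypothetical name used only for this illustration.

```php
<?php
$product_name = "Socket Assembly w/ 25 ft Lamp Cord - 14 Gauge ";
$lookForTheseSizes = array('Gallon', 'gal', 'Gal', 'G', 'Gram', 'gram', 'g',
                           'gm', 'Gauge', 'gauge', 'ft');

// Quote each unit for use inside a '/'-delimited pattern.
$quoted = array();
foreach ($lookForTheseSizes as $unit) {
    $quoted[] = preg_quote($unit, '/');
}
$pattern = '/(?P<size>[\d.]+)\s*(?P<unit>' . implode('|', $quoted) . ')\b/U';

// PREG_SET_ORDER returns one array per match instead of one array per group.
$found = array();
if (preg_match_all($pattern, $product_name, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $found[] = array('size' => $m['size'], 'unit' => $m['unit']);
    }
}

print_r($found);
```

For the sample product name this collects two size/unit pairs, in the order they appear in the string.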

I would remove the units that differ only by case from the array and add the i (case-insensitive) modifier to the regex, so it becomes /iU instead of /U.
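
A minimal sketch of that suggestion, assuming it is acceptable to lowercase the unit list before deduplicating it (the `$uniqueUnits` name is hypothetical):

```php
<?php
$product_name = "Socket Assembly w/ 25 ft Lamp Cord - 14 Gauge ";
$lookForTheseSizes = array('Gallon', 'gal', 'Gal', 'G', 'Gram', 'gram', 'g',
                           'gm', 'Gauge', 'gauge', 'ft');

// Lowercase every unit, then drop the entries that differed only by case.
$uniqueUnits = array_values(array_unique(array_map('strtolower', $lookForTheseSizes)));

// Quote each remaining unit for use inside a '/'-delimited pattern.
$quoted = array();
foreach ($uniqueUnits as $unit) {
    $quoted[] = preg_quote($unit, '/');
}

// The i modifier makes the unit match case-insensitive.
$pattern = '/(?P<size>[\d.]+)\s*(?P<unit>' . implode('|', $quoted) . ')\b/iU';
preg_match_all($pattern, $product_name, $matches);

print_r($uniqueUnits);
print_r($matches['unit']);
```

The captured unit keeps the casing found in the product name, so normalize it (for example with strtolower()) if it has to be compared against the deduplicated list.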

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.