How to find duplicates in a multidimensional array in PHP?

I have this multidimensional array:

[
    ['name' => 'test1', 'number' => '9999'],
    ['name' => 'test2', 'number' => '9999'],
    ['name' => 'test3', 'number' => '5555'],
    ['name' => 'test4', 'number' => '6666'],
    ['name' => 'test5', 'number' => '6666'],
    ['name' => 'test6', 'number' => '6666'],
]

My plan is to mark the duplicates, except for the first instance of each number:

[
    ['name' => 'test1', 'number' => '9999'],
    ['name' => 'test2', 'number' => '9999', 'duplicate' => 'yes'],
    ['name' => 'test3', 'number' => '5555'],
    ['name' => 'test4', 'number' => '6666'],
    ['name' => 'test5', 'number' => '6666', 'duplicate' => 'yes'],
    ['name' => 'test6', 'number' => '6666', 'duplicate' => 'yes'],
]

Notice that the first occurrence of each number is left alone and only the later occurrences are marked as duplicates.

If it helps, I have a separate array that I run array_count_values on, which gives:

array (size=3)
  9999 => int 2
  5555 => int 1
  6666 => int 3

So I know how many times each number occurs.

I could loop over the initial array, check each number against the array_count_values result, and mark the entry as a duplicate if its count is greater than 1, but I'm not sure how to exclude the first instance of each duplicate.

Any ideas?

My test code:

$array = [
    ['name' => 'test1', 'number' => '9999'],
    ['name' => 'test2', 'number' => '9999'],
    ['name' => 'test3', 'number' => '5555'],
    ['name' => 'test4', 'number' => '6666'],
    ['name' => 'test5', 'number' => '6666'],
    ['name' => 'test6', 'number' => '6666'],
];

$second = [
    '9999' => 2,
    '5555' => 1,
    '6666' => 3,
];

foreach ($array as $k => $data) {
    foreach ($second as $num => $key) {
        if ($key > 1) {
            if ($data['number'] == $num) {
                $array[$k]['duplicate'] = 'yes';
            }
        }
    }
}

var_dump($array);

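One way to do this is to keep your own index in a second array. Store each number as a key in the new array, then increment the value under that key each time the number appears again; any row whose number already has an entry is a duplicate. In the code below the counter starts at 0 on the first sighting, so it ends up holding the number of extra occurrences of each number. This should be a reasonably quick approach, since it only needs one pass over the array.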

<?php
$array = [
  ['name' => 'test1', 'number' => '9999'],
  ['name' => 'test2', 'number' => '9999'],
  ['name' => 'test3', 'number' => '5555'],
  ['name' => 'test4', 'number' => '6666'],
  ['name' => 'test5', 'number' => '6666'],
  ['name' => 'test6', 'number' => '6666'],
];

$counterArray = array();
for($i = 0; $i < count($array); $i++) {
  $val = $array[$i]['number'];
  if(!array_key_exists($val, $counterArray)) {
    $counterArray[$val] = 0;
  } else {
    $counterArray[$val]++;
  }
}

print_r($counterArray);
?>

This is untested code, just to give you an idea of how to approach this.
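
Building on the counter idea above, here is a minimal sketch of how the marking itself could be done: remember which numbers have already been seen, and flag every later row that carries the same number. The function name markDuplicates and the $seen array are illustrative names, not part of the original code; the 'duplicate' => 'yes' flag follows the format requested in the question.

```php
<?php
// Hypothetical helper: flags every row whose 'number' already appeared earlier.
function markDuplicates(array $rows): array
{
    $seen = []; // numbers encountered so far, stored as keys for fast isset() lookups

    foreach ($rows as $k => $row) {
        $num = $row['number'];
        if (isset($seen[$num])) {
            // Second or later occurrence: mark it.
            $rows[$k]['duplicate'] = 'yes';
        } else {
            // First occurrence: remember it and leave the row unmarked.
            $seen[$num] = true;
        }
    }

    return $rows;
}

$array = [
    ['name' => 'test1', 'number' => '9999'],
    ['name' => 'test2', 'number' => '9999'],
    ['name' => 'test3', 'number' => '5555'],
    ['name' => 'test4', 'number' => '6666'],
    ['name' => 'test5', 'number' => '6666'],
    ['name' => 'test6', 'number' => '6666'],
];

$marked = markDuplicates($array);
print_r($marked);
```

On the sample data this leaves test1, test3 and test4 unmarked and flags test2, test5 and test6, without needing the separate array_count_values result.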

Not tested, but...

$total = count($array);
$dupecount = array();
for ($i = 0; $i < ($total - 1); $i++) {
   if (isset($array[$i]['status'])) {
       continue; // this entry was already marked as a duplicate, so skip it.
   }
   $num = $array[$i]['number'];
   for ($j = $i + 1; $j < $total; $j++) {
      // compare only the numbers; the full rows differ by name and would never match
      if ($array[$j]['number'] == $num) {
         $array[$j]['status'] = 'duplicate'; // update state
         $dupecount[$num] = isset($dupecount[$num]) ? $dupecount[$num] + 1 : 1; // add to stats
      }
   }
}

Basically, the outer loop processes all but the last entry of the array. As it goes, it checks whether the current entry was already marked as a duplicate and skips it if so, which prevents double-counting.

The inner loop processes everything AFTER wherever the outer loop is. Any dupes are marked and accounted for.

Try this:

$array = array(
    array('name' => 'test1', 'number' => '9999'),
    array('name' => 'test2', 'number' => '9999'),
    array('name' => 'test3', 'number' => '5555'),
    array('name' => 'test4', 'number' => '6666'),
    array('name' => 'test5', 'number' => '6666'),
    array('name' => 'test6', 'number' => '6666'),
);

$second = array(
    '9999' => 2,
    '5555' => 1,
    '6666' => 3,
);

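// Walk the array backwards (keys preserved) so that the first occurrence of
// each number is the one left unmarked once its counter drops to 1.
foreach (array_reverse($array, true) as $k => $data) {
    if ($second[$data["number"]] > 1) {
        $array[$k]["duplicate"] = "yes";
        $second[$data["number"]]--;
    }
}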
var_dump($array);


 