PHP - array to 'unique'

I have an array that looks like the following:

array(43197) {
 [0]=> array(4) {
     ["id"]=> string(5) "10038"
     ["country"]=> string(7) "Andorra"
     ["city"]=> string(16) "Andorra la Vella"
     ["name"]=> string(25) "Andorra la Vella Heliport" 
    }
 [1]=> array(4) {
     ["id"]=> string(5) "10040"
     ["country"]=> string(20) "United Arab Emirates"
     ["city"]=> string(17) "Abu Dhabi Emirate"
     ["name"]=> string(11) "Ras Sumeira" 
    }
 [2]=> array(4) {
     ["id"]=> string(5) "10041"
     ["country"]=> string(20) "United Arab Emirates"
     ["city"]=> string(13) "Dubai Emirate"
     ["name"]=> string(27) "Burj al Arab Resort Helipad" 
    }
 [3]=> array(4) {
     ["id"]=> string(5) "10042"
     ["country"]=> string(20) "United Arab Emirates"
     ["city"]=> string(13) "Dubai Emirate"
     ["name"]=> string(13) "Dubai Skydive" 
    }
 [4]=> array(4) {
     ["id"]=> string(5) "14243"
     ["country"]=> string(20) "United Arab Emirates"
     ["city"]=> string(13) "Dubai Emirate"
     ["name"]=> string(15) "Dubai Creek SPB" 
    }
 [5]=> array(4) {
     ["id"]=> string(5) "29266"
     ["country"]=> string(20) "United Arab Emirates"
     ["city"]=> string(17) "Abu Dhabi Emirate"
     ["name"]=> string(18) "Yas Island Airport" 
    }
...
}

Now I want to make this array 'unique' (to be able to create some select boxes later). I already have a function that works as expected. Unfortunately, it takes hours to complete on a very large array.

Any ideas how to make this function faster?

//This function returns an array of unique rows from the given array
//Version: 2.0.0.0
function array_to_unique(
    $array,
    $uniqueCol = null) //optional, so that array_to_unique($array) is a valid call
    {
    $returnArray = array();

    $count = count($array);
    echo '<br>array count previous unique is: ' .$count;

    //Do the if(isset($uniqueCol)) just once - this is more code but faster with long arrays    
    if(isset($uniqueCol))
        {
        $helparray = array();
        foreach($array as $row)
            {
            if(!(in_array($row[$uniqueCol],$helparray)))
                {
                $helparray[] = $row[$uniqueCol];
                $returnArray[] = $row;
                }
            }
        }
    else{
        foreach($array as $row)
            {
            if(!(in_array($row,$returnArray)))
                {$returnArray[] = $row;}
            }
        }

    $count = count($returnArray);
    echo '<br>array count after unique is: ' .$count;

    return $returnArray;
    }

This is how I call the function, for example:

array_to_unique($array); //This is okay
array_to_unique($array,'country'); //This is very very slow

Thank you in advance.

in_array has a complexity of O(n), meaning it may have to go through every element of the array it searches. Calling it once per row is what makes your code slow.

You can speed up the check for whether a value has already been seen by using a hash map instead of searching through the array's values. Fortunately, associative arrays in PHP are implemented as hash maps, so we can use the value as a key and look it up with array_key_exists.
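A rough way to see the difference is to time both lookups on the same input. The following is only a sketch with made-up data (5,000 distinct strings, not taken from the question); the variable names are hypothetical and the timings will vary by machine. The gap grows with the number of distinct values, because that is what the helper array ends up holding.

```php
<?php
// Hypothetical test data: 5,000 distinct values (not taken from the question).
$values = [];
for ($i = 0; $i < 5000; $i++) {
    $values[] = 'value-' . $i;
}

// Linear search: in_array scans the list of values seen so far.
$start = microtime(true);
$seenList = [];
foreach ($values as $value) {
    if (!in_array($value, $seenList, true)) {
        $seenList[] = $value;
    }
}
$linearSeconds = microtime(true) - $start;

// Hash lookup: the value itself is used as the array key.
$start = microtime(true);
$seenKeys = [];
foreach ($values as $value) {
    if (!array_key_exists($value, $seenKeys)) {
        $seenKeys[$value] = true;
    }
}
$hashSeconds = microtime(true) - $start;

// Both loops find the same distinct values; only the running time differs.
printf("in_array: %.4f s, array_key_exists: %.4f s\n", $linearSeconds, $hashSeconds);
```

Both loops collect the same set of values, so the printed timings compare only the lookup strategy.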

Instead of:

    $helparray = array();
    foreach($array as $row)
        {
        if(!(in_array($row[$uniqueCol],$helparray)))
            {
            $helparray[] = $row[$uniqueCol];
            $returnArray[] = $row;
            }
        }

use:

    $helparray = array();
    foreach($array as $row)
        {
        if(!(array_key_exists($row[$uniqueCol], $helparray)))
            {
            $helparray[$row[$uniqueCol]] = true;
            $returnArray[] = $row;
            }
        }
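Put together, the whole function could look roughly like the sketch below. The function name and parameters follow the question; two things are assumptions of mine rather than part of the original: the values in the chosen column are strings or integers (they are used as array keys), and rows are compared via serialize() when no column is given, which is stricter than the loose comparison in_array performs. The sample rows are a shortened copy of the question's dump.

```php
<?php
// Sketch of a faster array_to_unique(); name and parameters follow the question.
// Assumption: values in $uniqueCol are strings or integers, since they become array keys.
function array_to_unique(array $array, $uniqueCol = null)
{
    $returnArray = [];
    $seen = [];

    foreach ($array as $row) {
        // Key on one column, or on the serialized row when no column is given
        // (serialize() is an assumption here; it compares rows strictly).
        $key = ($uniqueCol !== null) ? $row[$uniqueCol] : serialize($row);

        if (!array_key_exists($key, $seen)) {
            $seen[$key] = true;
            $returnArray[] = $row; // keeps the first row seen for each key
        }
    }

    return $returnArray;
}

// Sample rows shortened from the question's dump.
$airports = [
    ['id' => '10038', 'country' => 'Andorra'],
    ['id' => '10040', 'country' => 'United Arab Emirates'],
    ['id' => '10041', 'country' => 'United Arab Emirates'],
    ['id' => '29266', 'country' => 'United Arab Emirates'],
];

$unique = array_to_unique($airports, 'country');
print_r(array_column($unique, 'id')); // ids 10038 and 10040
```

Each row is now checked with a single key lookup, so the running time grows roughly linearly with the number of rows.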

Another thing to consider with very large arrays is that you copy the values to a new array, which increases the memory footprint. If it is not important to you that the keys of the returned array are consecutive integers starting at 0, you could remove the duplicate items from the original array instead:

    $helparray = array();
    foreach($array as $key => $row)
        {
        if(!(array_key_exists($row[$uniqueCol], $helparray)))
            {
            $helparray[$row[$uniqueCol]] = true;
            }
        else
            {
            unset($array[$key]);
            }
        }

$deduplicated = [];
foreach ($array as $value) {
    $deduplicated[$value['country']] = $value;
}

Simply use the fact that keys are unique and you're automagically deduplicating your array in a single pass. If you don't like the new keys, use array_values() afterwards.
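One detail worth knowing about this approach: each assignment overwrites the previous row stored under the same key, so the last row per country survives, whereas the in_array and array_key_exists versions keep the first. The sketch below, using a shortened copy of the question's rows, shows both behaviours; the variable names `$lastWins` and `$firstWins` are mine.

```php
<?php
// Sample rows shortened from the question's dump.
$array = [
    ['id' => '10038', 'country' => 'Andorra'],
    ['id' => '10040', 'country' => 'United Arab Emirates'],
    ['id' => '10041', 'country' => 'United Arab Emirates'],
];

// Last row wins: every assignment overwrites the previous row for that country.
$lastWins = [];
foreach ($array as $value) {
    $lastWins[$value['country']] = $value;
}

// First row wins: only assign when the key has not been used yet.
$firstWins = [];
foreach ($array as $value) {
    if (!isset($firstWins[$value['country']])) {
        $firstWins[$value['country']] = $value;
    }
}

// array_values() discards the country keys and reindexes from 0.
$lastWins = array_values($lastWins);
$firstWins = array_values($firstWins);

print_r(array_column($lastWins, 'id'));  // ids 10038 and 10041
print_r(array_column($firstWins, 'id')); // ids 10038 and 10040
```

If the last-wins behaviour is acceptable, the built-in `array_column($array, null, 'country')` builds the same keyed array in one call.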
