简体   繁体   中英

Generate unique ids for all column values in a multi-dimensional array

I have a file which looks like this:

Papiers peints > 3D et Perspective > 3D
Papiers peints > Carte du monde
Papiers peints > Fleurs
Papiers peints > Fleurs > Coquelicots
Tableaux > Cartes du monde
Tableaux > Fleurs
Tableaux > Fleurs > Coquelicots

and which I then transform into a properly formatted csv like this:

"Papiers peints","3D et Perspective","3D"
"Papiers peints","Carte du monde",
"Papiers peints","Fleurs",
"Papiers peints","Fleurs","Coquelicots"
"Tableaux","Cartes du monde",
"Tableaux","Fleurs",
"Tableaux","Fleurs","Coquelicots"

What I need is for each of these fields to have its own unique ID, which has to be an integer. This is how it should look like

"Papiers peints",101,"3D et Perspective",1001,"3D",10001
"Papiers peints",101,"Carte du monde",1002,,
"Papiers peints",101,"Fleurs",1003,,
"Papiers peints",101,"Fleurs",1003,"Coquelicots",10002
"Tableaux",102,"Cartes du monde",1004,,
"Tableaux",102,"Fleurs",1005,,
"Tableaux",102,"Fleurs",1005,"Coquelicots",10003

The names themselves don't matter at all and there will always be duplicates. I can solve this easily by uploading to a database. Then I do:

  • select distinct COL1 , give them their respective IDs
  • select COL2, group by COL1, COL2 , give them their respective IDs
  • select COL3, group by COL1, COL2,COL3 , give them their respective IDs
  • repeat for as many times as needed, which can be quite a lot

How do I do this in PHP without having to use a database? A straight answer would be nice, but even a design idea would help a lot.

There could be up to 10 columns in my file, but here is a simplified input array to work with:

$new=[0=>['a0','a1','a2','a3'],1=>['b0','b1','b2','b3'],2=>['c0','c1','c2','c3'],3=>['d0','d1','d2','d3'],4=>['e0','e1','e2','e3']];

Expected Result:

[
    ['a0','101','a1','1001','a2','10001','a3','100001'],
    ['b0','102','b1','1002','b2','10002','b3','100002'],
    ['c0','103','c1','1003','c2','10003','c3','100003'],
    ['d0','104','d1','1004','d2','10004','d3','100004'],
    ['e0','105','e1','1005','e2','10005','e3','100005']
]

If this were my project, I'd probably look into preparing the necessary values while pulling the data from the file. I'll use the $new array as input (I've modified a few of the values while testing).

Code ( Demo ):

$new=[0=>['a0','a1','a2','a3'],1=>['a0','b1','b2','b3'],2=>['c0','b1','c2','c3'],3=>['d0','d1','c2','d3'],4=>['e0','e1','e2','f3']];

// OP says could be up to 10 columns...
for($x=0; $x<10; ++$x){
    $index_base=pow(10,2+$x)+1;         // or replace pow() with ** if php >=5.6
    // echo $x," : ",$index_base,"\n";  // uncomment if you want to see the values created
    $unique_col_val_keys[$x]=
        array_map(
            function(&$v)use($index_base){  // &$v means modify the input array's values
                return $v+$index_base;    // update the incremented ids (values) using $index_base
            },
            // the following 4 functions generate the input array to map
            array_flip(                   // swap keys and values within the array
                array_values(             // reset keys
                    array_unique(         // remove duplicate column values
                        array_column(
                            $new,$x       // get set of values from each column of multi-dim array
                        )
                    )
                )
            )
        );
}
//var_export($unique_col_val_keys);  // uncomment if you want to see the generated indexes

foreach($new as $row=>$a){
    if(!isset($result[$row])){$result[$row]=[];}
    foreach($a as $col=>$v){
        // echo "$row $col $v : {$unique_col_val_keys[$col][$v]}\n";
        array_push($result[$row],$v,"{$unique_col_val_keys[$col][$v]}");
    }
}
var_export($result);

Output:

array (
  0 => 
  array (
    0 => 'a0',
    1 => '101',
    2 => 'a1',
    3 => '1001',
    4 => 'a2',
    5 => '10001',
    6 => 'a3',
    7 => '100001',
  ),
  1 => 
  array (
    0 => 'a0',
    1 => '101',
    2 => 'b1',
    3 => '1002',
    4 => 'b2',
    5 => '10002',
    6 => 'b3',
    7 => '100002',
  ),
  2 => 
  array (
    0 => 'c0',
    1 => '102',
    2 => 'b1',
    3 => '1002',
    4 => 'c2',
    5 => '10003',
    6 => 'c3',
    7 => '100003',
  ),
  3 => 
  array (
    0 => 'd0',
    1 => '103',
    2 => 'd1',
    3 => '1003',
    4 => 'c2',
    5 => '10003',
    6 => 'd3',
    7 => '100004',
  ),
  4 => 
  array (
    0 => 'e0',
    1 => '104',
    2 => 'e1',
    3 => '1004',
    4 => 'e2',
    5 => '10004',
    6 => 'f3',
    7 => '100005',
  ),
)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM