Open two CSV files, compare and add them to an array

I am trying to take data from two different CSV files and add it to an array. Basically, I open the first file and read its contents into an array as strings. Then comes the tricky part: both files have an ID field in common, so whenever the IDs match, the data from the second file has to be added to the array as well.

I've tried to do this in two different ways. The first was opening one file and, inside that loop, opening the other, comparing the IDs and saving to the array. The other was reading all the info from both files into two separate arrays, then finding the matches and putting them into a third array.

Here's the code:

$handle0 = \fopen("/Data/mountain1.csv", "r");

if ($handle0) {
    $line0 = 0;
    while (($buffer0 = fgets($handle0, 4000)) !== false) {
        if ($line0 > 0){
            $mountainArray = str_getcsv($buffer0, ",");                 
            $obj = array();
            $obj["id"] = $mountainArray[2];
            $obj["name"] = $mountainArray[0];
            $obj["country"] = $mountainArray[1];

            $handle1 = fopen("/Data/mountain1.csv", "r");
            if ($handle1) {
                $line1 = 0;
                while (($buffer1 = fgets($handle1, 4000)) !== false) {
                    if ($line1 > 0) {
                        $latlonArray = str_getcsv($buffer1, ",");
                        $content = array();
                        $content["id"] = $latlonArray[1];
                        if ((int)$content["id"] == (int)$obj["id"]) {
                            $obj["latitude"] = $latlonArray[7];
                            $obj["longitude"] = $latlonArray[8];
                        }
                    $line1++;
                    }
                }
                fclose($handle1);
            }

            $mountain[] = $obj;
        }
        $line0++;
    }
    fclose($handle0);
}

This code just loops and does nothing. The second attempt:

if ($handle0) {
    while (($buffer0 = fgets($handle0, 4000)) !== false) {
        $mountainArray = str_getcsv($buffer0, ",");
        $content0 = array();
        $content0["id"] = $mountainArray[2];
        $content0["name"] = $mountainArray[0];
        $content0["country"] = $mountainArray[1];

        $mountain[] = $content0;
    }
    fclose($handle0);
}

if ($handle1) {
    while (($buffer1 = fgets($handle1, 4000)) !== false) {
        $latlonArray = str_getcsv($handle1, ",");
        $content1 = array();
        $content1["id"] = $latlonArray[1];
        $content1["latitude"] = $latlonArray[7];
        $content1["longitude"] = $latlonArray[8];

        $latlon[] = $content1;
    }
    fclose($handle1);
}

foreach ($mountain as $row0) {
    $obj = array();
    $obj["id"] = $row0["productUid"];
    $obj["name"] = $row0["name"];
    $obj["country"] = $row0["address"];

    foreach ($latlon as $row1) {
        if((int)$row1["id"] == (int)$row0["id"]) {
            $obj["latitude"] = $row1["latitude"];
            $obj["longitude"] = $row1["longitude"];
        }
    }

    $mountains[] = $obj;
}

And this one just returns null...

From your code I assumed that:

  • For the CSV file containing mountain data: id is at position 2, name at 0 and country at 1.
  • For the CSV file with coordinates: id is at position 1, latitude at 7 and longitude at 8.

I decided to give you a more thorough code snippet that works for any number of CSV files: add each file to the $csvFiles array, using the file name as the key and the file type as the value. Each file type needs its own case in the switch.

<?php
$result   = array();
$csvFiles = array(
    'mountains.csv'   => 'Mountain',
    'coordinates.csv' => 'Coordinate'
);

foreach ($csvFiles as $csvFile => $type) {
    if ($handle = fopen($csvFile, 'r')) {
        $lineNumber = 0;

        while ($data = fgetcsv($handle, 128, ',')) {
            if (!$lineNumber) {
                $lineNumber++;
                continue;
            }

            switch ($type) {
                // Store the record in the result array
                case 'Mountain':
                    $record = array(
                        'id'      => $data[2],
                        'name'    => $data[0],
                        'country' => $data[1]
                    );

                    $id          = $record['id'];
                    $result[$id] = $record;
                    break;

                // Add longitude and latitude to the record
                // if already in the result array
                case 'Coordinate':
                    $record = array(
                        'id'        => $data[1],
                        'latitude'  => $data[7],
                        'longitude' => $data[8]
                    );

                    $id = $record['id'];
                    if (!empty($result[$id])) {
                        $result[$id] = array_merge($result[$id], $record);
                    }
                    break;
            }
        }

        fclose($handle);
    }
}

print_r($result);

With the following files:

mountains.csv

# CSV headers
aaa, USA, 1
aab, Canada, 2
aac, USA, 3
bbb, Portugal, 4
ccc, Germany, 5

coordinates.csv

# CSV headers
asd, 1, asd, asd, asd, asd, asd, 10.00, 20.00
asd, 2, asd, asd, asd, asd, asd, 1.00, 2.00
asd, 4, asd, asd, asd, asd, asd, 5.00, 10.00
asd, 3, asd, asd, asd, asd, asd, 2.00, 4.00
asd, 5, asd, asd, asd, asd, asd, 100.00, 200.00

the output will be:

Array
(
    [ 1] => Array
        (
            [id] =>  1
            [name] => aaa
            [country] =>  USA
            [latitude] =>  10.00
            [longitude] =>  20.00
        )

    [ 2] => Array
        (
            [id] =>  2
            [name] => aab
            [country] =>  Canada
            [latitude] =>  1.00
            [longitude] =>  2.00
        )

    [ 3] => Array
        (
            [id] =>  3
            [name] => aac
            [country] =>  USA
            [latitude] =>  2.00
            [longitude] =>  4.00
        )

    [ 4] => Array
        (
            [id] =>  4
            [name] => bbb
            [country] =>  Portugal
            [latitude] =>  5.00
            [longitude] =>  10.00
        )

    [ 5] => Array
        (
            [id] =>  5
            [name] => ccc
            [country] =>  Germany
            [latitude] =>  100.00
            [longitude] =>  200.00
        )
)
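
The leading spaces in the keys and values above (for example `[ 1]`) come from the space after each comma in the sample files, since fgetcsv does not trim fields. A minimal sketch of one way to strip them, assuming a hypothetical helper `readCsvRows()` and made-up header lines and temporary files that are not part of the answer:

```php
<?php
// Hypothetical helper (not from the answer above): reads a CSV file,
// skips the header line and trims whitespace around every field.
function readCsvRows(string $path): array
{
    $rows = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $rows;
    }
    fgetcsv($handle, 0, ',', '"', '\\'); // discard the header line
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($data === [null]) {          // fgetcsv returns [null] for a blank line
            continue;
        }
        $rows[] = array_map('trim', $data);
    }
    fclose($handle);
    return $rows;
}

// Demo data written to temporary files so the sketch runs on its own.
// The header names are invented for illustration.
$mountainsFile = tempnam(sys_get_temp_dir(), 'mnt');
$coordsFile    = tempnam(sys_get_temp_dir(), 'crd');
file_put_contents($mountainsFile, "name,country,id\naaa, USA, 1\naab, Canada, 2\n");
file_put_contents($coordsFile, "a,id,b,c,d,e,f,lat,lon\nasd, 1, asd, asd, asd, asd, asd, 10.00, 20.00\n");

// Same column positions as assumed above: id at 2 / 1, coordinates at 7 and 8.
$result = [];
foreach (readCsvRows($mountainsFile) as $row) {
    $result[$row[2]] = ['id' => $row[2], 'name' => $row[0], 'country' => $row[1]];
}
foreach (readCsvRows($coordsFile) as $row) {
    if (isset($result[$row[1]])) {
        $result[$row[1]]['latitude']  = $row[7];
        $result[$row[1]]['longitude'] = $row[8];
    }
}

unlink($mountainsFile);
unlink($coordsFile);

print_r($result);
```

With trimmed fields the IDs become clean keys, and a mountain without a matching coordinate row simply keeps its three original fields.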

If I understand correctly, you are trying to get the intersection of two CSV files based on their id.

To minimize your chance of running into memory problems, do the following.

First, build an array of the ids in the file you want to compare against. A simple fopen followed by fgetcsv in a loop should work.

$ids = array();
$fp = fopen($file1, "r");
while ($row = fgetcsv($fp)){
    // assuming first field contains the id
    $ids[$row[0]] = "";
} 
fclose($fp);

Then open the file you want to compare and go through the same fopen, fgetcsv loop, but check whether each id exists in the list you built in step 1. If it is in the list, add the row to the results.

$results = array();
$fp = fopen($file2, "r");
while ($row = fgetcsv($fp)){
    if (isset($ids[$row[0]])){
        $results[] = $row;
    }
}
fclose($fp);

This method avoids having to represent all of the data in either of the files as an array.
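
The two steps can be put together roughly as follows. This is only a sketch: the function name `rowsWithSharedId()`, the header-skipping and the temporary demo files are assumptions added for illustration, and the ID is assumed to be in the first column of both files, as in the snippets above.

```php
<?php
// Hypothetical helper, not from the answer above: returns the rows of $file2
// whose ID (column 0) also occurs in $file1. Assumes each file has a header line.
function rowsWithSharedId(string $file1, string $file2): array
{
    // Step 1: collect the IDs of the first file as array keys.
    $ids = [];
    $fp = fopen($file1, 'r');
    if ($fp === false) {
        return [];
    }
    fgetcsv($fp, 0, ',', '"', '\\');     // skip the header line
    while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
        if (isset($row[0])) {            // false for blank lines
            $ids[$row[0]] = true;        // key lookup instead of scanning a list
        }
    }
    fclose($fp);

    // Step 2: keep only the rows of the second file whose ID was seen.
    $results = [];
    $fp = fopen($file2, 'r');
    if ($fp === false) {
        return [];
    }
    fgetcsv($fp, 0, ',', '"', '\\');     // skip the header line
    while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
        if (isset($row[0], $ids[$row[0]])) {
            $results[] = $row;
        }
    }
    fclose($fp);

    return $results;
}

// Demo data in temporary files so the sketch runs on its own.
$file1 = tempnam(sys_get_temp_dir(), 'one');
$file2 = tempnam(sys_get_temp_dir(), 'two');
file_put_contents($file1, "id,val\n0,cat\n1,dog\n");
file_put_contents($file2, "id,val\n2,bird\n1,cat\n");

$matches = rowsWithSharedId($file1, $file2);

unlink($file1);
unlink($file2);

print_r($matches);
```

Only the IDs of the first file and the matching rows of the second are held in memory at any time.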

This may be a little more than needed, but it works for me.

csv1.csv

id,val
0,cat
1,dog

csv2.csv

id,val
2,brid
1,cat

The PHP

<?php
header("content-type: text/plain");
$array = [];
$i = 0;
$csv1 = "csv1.csv";
$csv2 = "csv2.csv";

// Load file 1 into an array
// Skip row 1
if (($handle = fopen($csv1, "r")) !== FALSE){
    while (($data = fgetcsv($handle)) !== FALSE){
        if($i == 0){$i++; continue;}
        $array[] = $data;
        $i++;
    }
    fclose($handle);
}

$i = 0;
// Load file 2 into the array if the values don't exist
// Skip row 1
if (($handle = fopen($csv2, "r")) !== FALSE){
    while (($data = fgetcsv($handle)) !== FALSE){
        if($i == 0){$i++; continue;}
        $inarray = false;
        foreach($array as $itm){
            if(in_array($data[0], $itm)){
                $inarray = true;
                break;
            }
        }
        if(!$inarray){
            $array[] = $data;
        }
        $i++;
    }
    fclose($handle);
}


print_r($array);
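
Note that in_array($data[0], $itm) compares the new id against every field of each stored row, so an id that happens to equal a value in another column would also count as a match. A possible variant, sketched here with a hypothetical helper `appendRowsWithNewIds()` that works on rows already read from the files, tracks the ids seen so far as array keys and compares only the id column:

```php
<?php
// Hypothetical helper: appends the rows of $newRows whose ID (column 0)
// is not already present in $rows.
function appendRowsWithNewIds(array $rows, array $newRows): array
{
    // Index the IDs that are already present.
    $seen = [];
    foreach ($rows as $row) {
        $seen[$row[0]] = true;
    }

    // Append only rows whose ID has not been seen yet.
    foreach ($newRows as $row) {
        if (!isset($seen[$row[0]])) {
            $rows[] = $row;
            $seen[$row[0]] = true;
        }
    }

    return $rows;
}

// Rows as fgetcsv would return them for csv1.csv and csv2.csv (headers skipped).
$first  = [['0', 'cat'], ['1', 'dog']];
$second = [['2', 'brid'], ['1', 'cat']];

$merged = appendRowsWithNewIds($first, $second);
print_r($merged);
```

This also replaces the nested loop with a single key lookup per row.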
