
Parse a CSV with a JSON in it using PHP

Introduction

I have a CSV file in which every field is enclosed in double quotes ("). The last field on every line is a JSON string. I want to write a PHP script that parses the CSV file and then parses the JSON string. This is what I have now.

while (($line = fgetcsv($handle, 1000000, ";", '"')) !== false)
{
    // Another loop to loop over the fields
    // ...
    // parse_json is a private method, so it has to be called through $this.
    $this->parse_json(end($line));
}

private function parse_json($json_string)
{
    if (empty($json_string))
    {
        // Nothing to parse.
        return null;
    }

    $json = json_decode($json_string, true);
    // json_decode returns null for malformed input, but also for the valid
    // JSON literal null, so check json_last_error() to tell the two apart.
    if (is_null($json) && json_last_error() !== JSON_ERROR_NONE)
    {
        $msg = sprintf("The following description is not in proper JSON format: %s", $json_string);
        throw new Exception($msg);
    }

    return $json;
}

With the following line in the CSV file, I get the following array in PHP.

"A";"B";"C";"D";"E";;"{""Name"":""Richard B Mephisto""}"
array ('Name' => 'Richard B Mephisto');

Problem description

The trouble starts when I want to allow a double quote inside one of the values of the JSON string. In JSON, a double quote must be escaped with a backslash, while in CSV it must be escaped with another double quote. What should the CSV file and the parser look like if I want the following array?

array ('Name' => 'Richard "B" Mephisto');
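One way to see what the CSV line would need to contain is to build it from the target array by applying the two escaping layers in order: JSON first, then CSV. The sketch below is only an illustration; `csv_quote` is a hypothetical helper (not a PHP built-in) that assumes RFC 4180 style quoting, where a quote inside a quoted field is doubled and backslashes are left untouched.

```php
<?php
// Hypothetical helper: wrap one value as a quoted CSV field.
function csv_quote(string $value): string
{
    // Inside a quoted CSV field, a literal double quote is written twice.
    return '"' . str_replace('"', '""', $value) . '"';
}

// Layer 1: JSON escaping turns the inner quotes into \".
$json = json_encode(['Name' => 'Richard "B" Mephisto']);
// $json is now: {"Name":"Richard \"B\" Mephisto"}

// Layer 2: CSV escaping doubles every double quote; backslashes stay as they are.
$fields = ['A', 'B', 'C', 'D', 'E'];
$line = implode(';', array_map('csv_quote', $fields)) . ';;' . csv_quote($json);

echo $line, PHP_EOL;
// "A";"B";"C";"D";"E";;"{""Name"":""Richard \""B\"" Mephisto""}"
```

Whether the parser reads this line back correctly then depends on how it treats the backslash, which is the subject of the attempts that follow.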

Failed attempts

1) Use the following line in the CSV file, with the quotes around B doubled twice.

"A";"B";"C";"D";"E";;"{""Name"":""Richard """"B"""" Mephisto""}"

When parsing the JSON, before calling json_decode, replace every "" with \". This works in this case, but I also need to allow for empty strings.

"A";"B";"C";"D";"E";;"{""Name"":""}"

With this approach, the "" of the empty string is replaced as well, which leaves invalid JSON.
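The conflict can be reproduced in a few lines. This is a minimal sketch: `naive_unescape` is a hypothetical name for the replacement step described above, and the two input strings are assumed to be the field contents after CSV decoding.

```php
<?php
// Hypothetical name for the replacement step from attempt 1.
function naive_unescape(string $s): string
{
    // Turn every "" into \" before handing the string to json_decode.
    return str_replace('""', '\"', $s);
}

// Assumed field contents after CSV decoding.
$quoted = '{"Name":"Richard ""B"" Mephisto"}';
$empty  = '{"Name":""}';

// The quoted name decodes as intended.
var_dump(json_decode(naive_unescape($quoted), true));

// The empty string becomes {"Name":\"} which json_decode rejects.
var_dump(naive_unescape($empty));
var_dump(json_decode(naive_unescape($empty), true)); // NULL
```

The replacement cannot tell an escaped quote from an empty JSON string, because both appear as "" once the CSV layer has been removed.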

2) Use backslashes in the CSV file. In principle, the JSON string should look like this:

{"Name": "Richard \"B\" Mephisto"}

So I try this in the CSV file, doubling each double quote for CSV and keeping the backslashes:

"A";"B";"C";"D";"E";;"{""Name"":""Richard \""B\"" Mephisto""}"

The result:

The following description is not in proper JSON format: {"JSON_key":"Richard \\"B\\"" Mephisto""}"

Somehow, the backslash (the default escape character of fgetcsv) and the doubled double quotes do not work together: the field is cut off at the wrong place.

3) Escape the backslash in the CSV.

"A";"B";"C";"D";"E";;"{""JSON_key"":""Richard \\""B\\"" Mephisto""}"

Result:

The following description is not in proper JSON format: {"JSON_key":"Richard \\"B\\" Mephisto"}

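Try passing the double quote as the escape character too (the fourth argument of str_getcsv). The backslash then has no special meaning for the CSV parser and reaches json_decode untouched, while "" is still read as a single double quote: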

$in = '"A";"B";"C";"D";"E";;"{""Name"":""Richard \""B\"" Mephisto""}";"{""Name"":""""}"';
$out = str_getcsv($in, ';', '"', '"'); 
var_dump($out);

Result:

array(8) {
  [0]=>
  string(1) "A"
  [1]=>
  string(1) "B"
  [2]=>
  string(1) "C"
  [3]=>
  string(1) "D"
  [4]=>
  string(1) "E"
  [5]=>
  string(0) ""
  [6]=>
  string(33) "{"Name":"Richard \"B\" Mephisto"}"
  [7]=>
  string(11) "{"Name":""}"
}
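To finish the round trip, the decoded fields can be passed straight to json_decode. The following is a sketch that reuses the input line above; the variable names `$quoted` and `$empty` are only illustrative.

```php
<?php
$in = '"A";"B";"C";"D";"E";;"{""Name"":""Richard \""B\"" Mephisto""}";"{""Name"":""""}"';

// Escape character set to the enclosure, so a backslash is ordinary data.
$fields = str_getcsv($in, ';', '"', '"');

// The last two fields hold JSON; decode each into an associative array.
$quoted = json_decode($fields[6], true);
$empty  = json_decode($fields[7], true);

var_dump($quoted['Name']); // string(20) "Richard "B" Mephisto"
var_dump($empty['Name']);  // string(0) ""
```

The same fourth argument can be passed to fgetcsv in the original loop. As far as I know, PHP 7.4 and later also accept an empty string as the escape character, which switches the escape mechanism off entirely.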
