Opposite of xml_parse_into_struct?

I'm trying to write a series of functions that extract the word/document.xml part of an MS Word DOCX file and effectively mail merge a series of key/value pairs into it, replacing the template fields defined in the document. I have a function that uses xml_parse_into_struct to convert the XML text into arrays. Once the text has been replaced, I'll (presumably) need to use the ZipArchive method addFromString to write the new document.xml back into the DOCX zip container. But addFromString takes a string, and at that point I'm working with an array of data rather than an XML string. Is there a way to convert the array back into XML string format?

Here's what I have so far:

// $filename = name of DOCX file to open
function get_docx_xml($filename) {
    // Extract XML from DOCX file
    $zip = new ZipArchive();
    if ($zip->open($filename, ZIPARCHIVE::CHECKCONS) !== TRUE) { echo 'failed to open template'; exit; }
    $xml = 'word/document.xml';
    $data = $zip->getFromName($xml);
    $zip->close();
    // Create the XML parser and create an array of the results
    $parser = xml_parser_create_ns();
    xml_parse_into_struct($parser, $data, $vals, $index);
    xml_parser_free($parser);
    // Return the relevant XML information
    return array('vals' => $vals, 'index' => $index);
}

That part works fine: I can print_r both arrays and make sense of the results. However, the following function does not work, at least not in all cases. With certain delimiters around the fields to be replaced it works, but not consistently, which I assume is down to Word's character encoding or other formatting.

// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in
function mailmerge($templateFile, $newFile, $row) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $xmldata = get_docx_xml($newFile);
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZIPARCHIVE::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  foreach ($row as $key => $value) {
    $data = str_replace($key, xml_escape($value), $data);
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}
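
One possible reason a plain str_replace misses a field (an assumption, not something confirmed for this particular template) is that Word often stores what looks like one piece of text as several separate <w:r> runs, so the key never appears as one contiguous string in the raw XML. A minimal sketch of a check for that, using a hypothetical helper key_is_split and a made-up {{FirstName}} placeholder:

```php
<?php
// Hypothetical helper: returns true when $key is visible in the document text
// but does not appear as one contiguous string in the raw XML.
function key_is_split(string $xml, string $key): bool {
    // Crude tag stripping, good enough for a diagnostic check
    $visible = preg_replace('/<[^>]*>/', '', $xml);
    return strpos($xml, $key) === false && strpos($visible, $key) !== false;
}

// Made-up fragment in which the placeholder is split across two runs
$fragment = '<w:p><w:r><w:t>Dear {{First</w:t></w:r>'
          . '<w:r><w:t>Name}},</w:t></w:r></w:p>';

var_dump(key_is_split($fragment, '{{FirstName}}'));
```

If this returns true for a key, str_replace on the raw XML cannot match it, whatever the encoding.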

So instead of using str_replace (which fails much of the time), I was planning to loop over the $vals array from the first function, do the replacement there, convert the resulting array back to a string and, in turn, write that string back into the DOCX zip container.
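
As far as I know there is no built-in inverse of xml_parse_into_struct, but the $vals array can be walked in order and re-serialized. The following is only a sketch under some assumptions: struct_to_xml and parse_struct are hypothetical helper names, the parser is created with xml_parser_create (not the _ns variant) and with case folding switched off so that tag names such as w:t survive unchanged, and the XML declaration is not part of $vals, so it would have to be prepended separately.

```php
<?php
// Hypothetical helper: parse with settings that keep tag names intact.
function parse_struct(string $xml): array {
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep "w:t", not "W:T"
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 0);   // keep whitespace text
    xml_parse_into_struct($parser, $xml, $vals, $index);
    return $vals;
}

// Hypothetical helper: rebuild an XML string from a $vals array.
function struct_to_xml(array $vals): string {
    $xml = '';
    foreach ($vals as $node) {
        $type = $node['type'];
        if ($type === 'cdata') { // text sitting between child elements
            $xml .= htmlspecialchars($node['value'], ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
            continue;
        }
        if ($type === 'close') {
            $xml .= '</' . $node['tag'] . '>';
            continue;
        }
        // 'open' or 'complete': emit the start tag with its attributes
        $xml .= '<' . $node['tag'];
        foreach ($node['attributes'] ?? [] as $name => $value) {
            $xml .= ' ' . $name . '="'
                  . htmlspecialchars($value, ENT_XML1 | ENT_COMPAT, 'UTF-8') . '"';
        }
        $text = htmlspecialchars($node['value'] ?? '', ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
        if ($type === 'open') {
            $xml .= '>' . $text;
        } elseif ($text === '') {
            $xml .= '/>';
        } else {
            $xml .= '>' . $text . '</' . $node['tag'] . '>';
        }
    }
    return $xml;
}

$src = '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
     . '<w:p><w:r><w:t>Hello &amp; bye</w:t></w:r><w:br/></w:p></w:document>';
$vals = parse_struct($src);
$rebuilt = struct_to_xml($vals);
```

The rebuilt string could then be passed to addFromString. With the settings in get_docx_xml above (xml_parser_create_ns and default case folding), tag names come back uppercased and, as I understand it, with the namespace URI in place of the w: prefix, so output rebuilt from those arrays would most likely not be a valid document.xml.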

While I didn't find an answer to my question, I've solved the problem with a workaround: I used a series of substr_replace calls to make the necessary updates. Here's my new and improved mail merge function in case anyone else needs something like this:

// Merge data into a Word file (mailmerge or custom)
// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in; $delim_start = starting delimiter; $delim_end = ending delimiter
function mailmerge($templateFile, $newFile, $row, $delim_start, $delim_end) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZIPARCHIVE::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  $currentpos = 0;
  foreach ($row as $key => $value) {
    // Look for a naturally occurring instance of the replacement string (key) and replace as needed
    if (strpos($data, $key) !== false) { // case-sensitive, like the str_replace below
      $currentpos = strpos($data, $key) + strlen($key);
      $data = str_replace($key, xml_escape($value), $data);
    }
    else { // Look for the key's delimiter
      // Find the next starting delimiter at or after $currentpos
      $pos_start = strpos($data, $delim_start, $currentpos);
      if ($pos_start !== false) {
        // Clear the initial delimiter
        $data = substr_replace($data, '', $pos_start, strlen($delim_start));
        // Now find the actual data (by XML key)
        $datapos_start = (strpos($data, '<w:t>', $pos_start)) + 5;
        $datapos_end = strpos($data, '</w:t>', $datapos_start);
        // Replace the data
        $data = substr_replace($data, xml_escape($value), $datapos_start, ($datapos_end - $datapos_start));
        // Clear the closing delimiter (have to recalculate datapos_end due to the replacement)
        $datapos_end = strpos($data, $delim_end, $datapos_start);
        $data = substr_replace($data, '', $datapos_end, strlen($delim_end));
        // Reset the current position variable for the next iteration
        $currentpos = $datapos_end + 6;
      }
    }
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}
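
Note that xml_escape is not a built-in PHP function; the code above relies on it being defined elsewhere. A minimal sketch of what such a helper might look like, assuming the values are UTF-8 strings (this is an illustration, not the definition actually used with the function above):

```php
<?php
// Hypothetical definition of the xml_escape() helper used by mailmerge().
// Escapes &, <, > and both quote characters for safe insertion into XML text.
function xml_escape(string $value): string {
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

$escaped = xml_escape('Smith & Sons <Ltd>');
```

Without escaping, a value containing & or < would leave document.xml malformed, and Word would most likely refuse to open the file.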
