
Opposite of xml_parse_into_struct?

I'm trying to write a series of functions that extract the document.xml part of an MS Word DOCX file and mail merge a series of key/value pairs into it, replacing the template fields defined in the document. I have a function that uses xml_parse_into_struct to convert the XML text into the necessary arrays, but once I'm done replacing text I'll (presumably) need to use the ZipArchive method addFromString to create the new document.xml file and add it to the DOCX zip container. I'm not sure how to do that when I'm working with an array of data rather than an XML string. Is there a way to convert an array back into the XML string format?

Here's what I have so far:

// $filename = name of DOCX file to open
function get_docx_xml($filename) {
    // Extract XML from DOCX file
    $zip = new ZipArchive();
    if ($zip->open($filename, ZipArchive::CHECKCONS) !== TRUE) { echo 'failed to open template'; exit; }
    $xml = 'word/document.xml';
    $data = $zip->getFromName($xml);
    $zip->close();
    // Create the XML parser and create an array of the results
    $parser = xml_parser_create_ns();
    xml_parse_into_struct($parser, $data, $vals, $index);
    xml_parser_free($parser);
    // Return the relevant XML information
    return array('vals' => $vals, 'index' => $index);
}

That part works fine: I can print_r both arrays and make sense of the results. However, the following function does not work, at least not in all cases. With certain delimiters around the fields to be replaced it works, but not consistently, which I assume is an issue with Word's character encoding or other formatting.
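One likely cause, stated here as an assumption since the template itself isn't shown: Word often splits what looks like a single piece of text across several runs (<w:r> elements), so the literal key never appears in the raw XML. A minimal sketch with a made-up fragment and a made-up {NAME} placeholder:

```php
<?php
// Hypothetical fragment of word/document.xml: the placeholder {NAME} has been
// split across two runs, so the key is not contiguous in the raw XML.
$xml = '<w:p><w:r><w:t>Dear {NA</w:t></w:r><w:r><w:t>ME},</w:t></w:r></w:p>';

// A plain search on the raw XML misses the key.
$found_raw = strpos($xml, '{NAME}') !== false;

// Removing the tags shows the key is present in the visible text.
$visible = preg_replace('/<[^>]+>/', '', $xml);
$found_text = strpos($visible, '{NAME}') !== false;

var_dump($found_raw);  // bool(false)
var_dump($found_text); // bool(true)
```

If this is what is happening, any replacement that works on one string or one text node at a time will miss keys that span runs.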

// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in
function mailmerge($templateFile, $newFile, $row) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $xmldata = get_docx_xml($newFile); // parsed arrays; not used by the str_replace approach below
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZipArchive::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  foreach ($row as $key => $value) {
    $data = str_replace($key, xml_escape($value), $data);
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}

So instead of using str_replace (which fails a lot of the time), I was planning to loop over the $vals array that I get from the first function, do the replacement there, and then save the resulting array back to a string and, in turn, back into the DOCX zip container.
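As far as I know PHP has no built-in inverse of xml_parse_into_struct, but the $vals array can be walked in order to rebuild a string. The following is only a sketch under some assumptions: struct_to_xml is a hypothetical helper name, the sample XML and the {NAME} key are made up, and the parser is created with xml_parser_create() with case folding and whitespace skipping turned off so that tag names such as w:t survive unchanged (the xml_parser_create_ns() call above rewrites tag names, which makes a round trip harder). The XML declaration, comments and processing instructions are not in $vals, so they are not reproduced.

```php
<?php
// Sketch: rebuild an XML string from the $vals array of xml_parse_into_struct().
// struct_to_xml() is a hypothetical helper, not a PHP built-in.
function struct_to_xml(array $vals): string {
    $xml = '';
    foreach ($vals as $node) {
        $tag = $node['tag'];
        // Re-serialize attributes, escaping their values.
        $attrs = '';
        foreach ($node['attributes'] ?? [] as $name => $value) {
            $attrs .= ' ' . $name . '="'
                . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '"';
        }
        // The parser decodes entities, so text has to be escaped again.
        $text = htmlspecialchars($node['value'] ?? '', ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
        switch ($node['type']) {
            case 'open':   // start tag, plus any text before the first child
                $xml .= '<' . $tag . $attrs . '>' . $text;
                break;
            case 'complete': // element without child elements
                $xml .= isset($node['value'])
                    ? '<' . $tag . $attrs . '>' . $text . '</' . $tag . '>'
                    : '<' . $tag . $attrs . '/>';
                break;
            case 'cdata':  // text that sits between child elements
                $xml .= $text;
                break;
            case 'close':
                $xml .= '</' . $tag . '>';
                break;
        }
    }
    return $xml;
}

// Made-up sample input; a real document.xml is much larger.
$source = '<w:p xmlns:w="urn:example"><w:r><w:t>Hello {NAME} &amp; co</w:t></w:r><w:br/></w:p>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag case
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 0);   // keep whitespace
xml_parse_into_struct($parser, $source, $vals);

// Do the replacement on the text values only, never on tags or attributes.
foreach ($vals as &$node) {
    if (isset($node['value'])) {
        $node['value'] = str_replace('{NAME}', 'Ann', $node['value']);
    }
}
unset($node);

$rebuilt = struct_to_xml($vals);
echo $rebuilt, "\n";
```

The rebuilt string could then be passed to addFromString. Note that replacing inside individual values still misses a key that Word has split across several runs.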

While I didn't find an answer to my question, I've solved the problem via a workaround. Effectively, I used a series of substr_replace calls to make the necessary updates. Here's my new and improved mail merge function, in case anyone else needs something like this:

// Merge data into a Word file (mailmerge or custom)
// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in; $delim_start = starting delimiter; $delim_end = ending delimiter
function mailmerge($templateFile, $newFile, $row, $delim_start, $delim_end) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZipArchive::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  $currentpos = 0;
  foreach ($row as $key => $value) {
    // Look for a naturally occurring instance of the replacement string (key) and replace as needed
    $keypos = strpos($data, $key); // case-sensitive, matching str_replace below
    if ($keypos !== false) {
      $currentpos = $keypos + strlen($key);
      $data = str_replace($key, xml_escape($value), $data);
    }
    else { // Look for the key's delimiter
      // strpos takes a search offset; stristr's third argument is a boolean flag, not an offset
      $pos_start = strpos($data, $delim_start, $currentpos);
      if ($pos_start !== false) {
        // Clear the initial delimiter
        $data = substr_replace($data, '', $pos_start, strlen($delim_start));
        // Now find the actual data (by XML key)
        $datapos_start = (strpos($data, '<w:t>', $pos_start)) + 5;
        $datapos_end = strpos($data, '</w:t>', $datapos_start);
        // Replace the data
        $data = substr_replace($data, xml_escape($value), $datapos_start, ($datapos_end - $datapos_start));
        // Clear the closing delimiter (have to recalculate datapos_end due to the replacement)
        $datapos_end = strpos($data, $delim_end, $datapos_start);
        $data = substr_replace($data, '', $datapos_end, strlen($delim_end));
        // Reset the current position variable for the next iteration
        $currentpos = $datapos_end + 6;
      }
    }
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}
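
Both versions of mailmerge call xml_escape(), which is not a PHP built-in and whose definition isn't shown here. A minimal stand-in, assuming it is only meant to make a value safe to place inside <w:t> text (the sample value and the commented-out call are made up):

```php
<?php
// Hypothetical stand-in for the xml_escape() helper used by mailmerge();
// the original definition is not shown, so this is an assumption.
if (!function_exists('xml_escape')) {
    function xml_escape($text) {
        // Escape &, <, > and quotes so the value cannot break the XML.
        return htmlspecialchars((string) $text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
}

$escaped = xml_escape('Smith & Sons <Ltd>');
echo $escaped, "\n"; // Smith &amp; Sons &lt;Ltd&gt;

// Example call with made-up file names, field key and delimiters:
// mailmerge('template.docx', 'letter.docx', array('CompanyName' => 'Smith & Sons'), '{', '}');
```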


 