C++ RapidXml - Traversing with first_node() to modify the value of a node in an XML file

I must be going crazy here... Here is my XML file, called Original.xml:

<root>
  <metadata>Trying to change this</metadata>
  <body>
    <salad>Greek Caesar</salad>
  </body>
</root>

I am trying to modify the contents within the metadata tag.

Here is the entire piece of code I have that WORKS:

#include <fstream>
#include <iostream>
#include <string>

#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>

int main()
{
  // Open 'Original.xml' to read from
  rapidxml::file<> xmlFile("Original.xml");
  rapidxml::xml_document<> doc;
  doc.parse<0>(xmlFile.data());

  // Get to <metadata> tag
  //                                       <root>        <metadata>    ???
  rapidxml::xml_node<>* metadataNode = doc.first_node()->first_node()->first_node();
  
  // Always correctly prints: 'Trying to change this'
  std::cout << "Before: " << metadataNode->value() << std::endl;

  // Modify the contents within <metadata>
  const std::string newMetadataValue = "Did the changing";
  metadataNode->value(newMetadataValue.c_str());

  // Always correctly prints: 'Did the changing'
  std::cout << "After: " << metadataNode->value() << std::endl;

  // Save output to 'New.xml'
  std::ofstream newXmlFile("New.xml");
  newXmlFile << doc;
  newXmlFile.close();
  doc.clear();

  return 0;
}

New.xml will now look like this:

<root>
    <metadata>Did the changing</metadata>
    <body>
        <salad>Greek Caesar</salad>
    </body>
</root>

That's the behavior I want.

What I don't understand is why I need a third first_node() call to SAVE the information inside metadata.

If I remove the third first_node() call, which is marked by the ??? above, New.xml will keep the old <metadata> string: "Trying to change this".

Yet, in this scenario, both std::cout calls on metadataNode->value() will still correctly print the intended strings; meaning, the first one will print "Trying to change this" and the second will correctly print "Did the changing".

Why in the world do I need to use n+1 calls to first_node() to SAVE the new value at the desired node, where n is the number of nodes traversed (from the root) to get to the desired node? Why is it that with n first_node() calls, I can successfully modify the value at the desired node in RAM only?

Possible bug? On whose end?

In the XML tree model, text elements are nodes as well. This makes sense when you have mixed content elements: <a>some<b/>text<c/>nodes</a>. So the text inside <metadata> is a data node that is a child of the <metadata> element, and the third first_node() call is what reaches it.

Basically:

#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>

int main() {
    rapidxml::file<> xmlFile("Original.xml");
    rapidxml::xml_document<> doc;
    doc.parse<0>(xmlFile.data());

    auto root      = doc.first_node();
    auto metadata  = root->first_node();
    auto text_node = metadata->first_node();

    text_node->value("Did the changing");

    std::ofstream newXmlFile("New.xml");
    newXmlFile << doc;
}
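
As for why the two-call version prints the right strings yet saves the old one: as far as I can tell from rapidxml_print.hpp, the printer emits an element's data child and only falls back to the element's own value() when the element has no children, while the parser (by default) also copies the text into the element's value(). Below is a toy model of just that behavior, using only the standard library. Node, Kind, make_text_element and serialize are made-up names for illustration, not RapidXml API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a DOM node. This is NOT RapidXml's xml_node; it only
// mimics the two facts that matter for the question.
enum class Kind { Element, Data };

struct Node {
    Kind kind;
    std::string name;   // used by elements
    std::string value;  // text of a data node, or an element's cached copy
    std::vector<std::unique_ptr<Node>> children;
};

// Build the tree for <name>text</name>: the text becomes a data child, and
// the element also keeps a convenience copy of it in its own value.
std::unique_ptr<Node> make_text_element(const std::string& name, const std::string& text) {
    assert(!name.empty());
    auto elem = std::make_unique<Node>();
    elem->kind = Kind::Element;
    elem->name = name;
    elem->value = text;  // what value() on the element would return

    auto data = std::make_unique<Node>();
    data->kind = Kind::Data;
    data->value = text;  // what actually gets written out
    elem->children.push_back(std::move(data));
    return elem;
}

// Serialize: an element with children prints its children and ignores its
// own value; the own value is only used when there are no children.
std::string serialize(const Node& n) {
    if (n.kind == Kind::Data) return n.value;
    if (n.children.empty() && n.value.empty()) return "<" + n.name + "/>";

    std::string out = "<" + n.name + ">";
    if (n.children.empty()) {
        out += n.value;
    } else {
        for (const auto& child : n.children) out += serialize(*child);
    }
    return out + "</" + n.name + ">";
}
```

In this model, assigning to meta->value alone leaves serialize(*meta) at the old text, whereas assigning to meta->children.front()->value changes what is written. That mirrors the extra first_node() call.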

But wait, there's more

Sadly, this approach breaks unless your input has exactly the expected shape.

Take this well-behaved sample:

char sample1[] = R"(<root><metadata>Trying to change this</metadata></root>)";

If the metadata element were empty, metadata->first_node() would return a null pointer and you'd crash:

char sample2[] = R"(<root><metadata></metadata></root>)";
char sample3[] = R"(<root><metadata/></root>)";

Indeed, this triggers UBSan/ASan failures:

/home/sehe/Projects/stackoverflow/test.cpp:17:25: runtime error: member access within null pointer of type 'struct xml_node'
/home/sehe/Projects/stackoverflow/test.cpp:17:25: runtime error: member call on null pointer of type 'struct xml_base'
/usr/include/rapidxml/rapidxml.hpp:762:24: runtime error: member call on null pointer of type 'struct xml_base'
/usr/include/rapidxml/rapidxml.hpp:753:21: runtime error: member access within null pointer of type 'struct xml_base'
AddressSanitizer:DEADLYSIGNAL

If there's a surprise, it will... do surprising things!

char sample4[] = R"(<root><metadata><surprise/></metadata></root>)";

This ends up erroneously generating:

<root>
        <metadata>
                <surprise>changed</surprise>
        </metadata>
</root>

And that's not the end of it:

#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>
#include <iostream>

namespace {
    char sample1[] = R"(<root><metadata>Trying to change this</metadata></root>)";
    char sample2[] = R"(<root><metadata><surprise/></metadata></root>)";
    char sample3[] = R"(<root><metadata>mixed<surprise/>bag</metadata></root>)";
    char sample4[] = R"(<root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root>)";
    char sample5[] = R"(<root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root>)";
    //These crash:
    //char sampleX[] = R"(<root><metadata></metadata></root>)";
    //char sampleY[] = R"(<root><metadata/></root>)";
}

int main() {
    for (char* xml : {sample1, sample2, sample3, sample4, sample5}) {
        std::cout << "\n=== " << xml << " ===\n";
        rapidxml::xml_document<> doc;
        doc.parse<0>(xml);

        auto root      = doc.first_node();
        auto metadata  = root->first_node();
        auto text_node = metadata->first_node();

        text_node->value("changed");
        print(std::cout << " --> ", doc, rapidxml::print_no_indenting);
        std::cout << "\n";
    }
}

Prints

=== <root><metadata>Trying to change this</metadata></root> ===
 --> <root><metadata>changed</metadata></root>

=== <root><metadata><surprise/></metadata></root> ===
 --> <root><metadata><surprise>changed</surprise></metadata></root>

=== <root><metadata>mixed<surprise/>bag</metadata></root> ===
 --> <root><metadata>changed<surprise/>bag</metadata></root>

=== <root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root> ===
 --> <root><metadata><![CDATA[changed]]></metadata></root>

=== <root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root> ===
 --> <root><metadata>changed</metadata></root>

HOW TO MAKE IT ROBUST?

  • Firstly, use queries to find your target. Sadly, rapidxml doesn't support this; see "What XML parser should I use in C++?"

  • Secondly, check the node type before editing.

  • Thirdly, replace the entire node if you can; that makes you independent of what was previously there.

  • Lastly, be sure to actually allocate your new node from the document so you don't get lifetime issues.

     auto root = doc.first_node();
     if (auto* old_meta = root->first_node()) {
         assert(old_meta->name() == std::string("metadata"));
         print(std::cout << "Removing metadata node: ", *old_meta, fmt);
         std::cout << "\n";
         root->remove_first_node();
     }
     auto newmeta = doc.allocate_node(rapidxml::node_element, "metadata", "changed");
     root->prepend_node(newmeta);
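
On the lifetime point: RapidXml nodes store pointers to their name and value strings rather than copies, so the text has to outlive the document. With string literals, as in the snippet above, that is automatically true. It starts to matter once the text comes from a temporary std::string, which is what doc.allocate_string() (part of RapidXml's memory pool) is for. The sketch below models the idea with the standard library only; ToyNode, ToyDocument and set_value_owned are hypothetical names, not RapidXml types.

```cpp
#include <cassert>
#include <cstring>
#include <deque>
#include <string>

// Hypothetical stand-in for a node that only stores a pointer to its text.
struct ToyNode {
    const char* value = "";  // non-owning
};

// Hypothetical stand-in for a document that owns a string pool.
struct ToyDocument {
    ToyNode node;
    // std::deque: push_back never moves existing elements, so pointers
    // handed out earlier stay valid as the pool grows.
    std::deque<std::string> pool;

    // Copy the text into document-owned storage and return a stable pointer.
    const char* allocate_string(const std::string& text) {
        pool.push_back(text);
        return pool.back().c_str();
    }
};

// Safe: the node points into storage owned by the document, so the caller's
// string may be destroyed right after this call.
void set_value_owned(ToyDocument& doc, const std::string& text) {
    doc.node.value = doc.allocate_string(text);
}
```

Writing doc.node.value = temp.c_str() directly would leave a dangling pointer as soon as temp goes out of scope; routing the text through the document's pool avoids that.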

PUTTING IT ALL TOGETHER:

#include <rapidxml/rapidxml.hpp>
#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>
#include <iostream>

namespace {
    std::string cases[] = {
     R"(<root><metadata>Trying to change this</metadata></root>)",
     R"(<root><metadata><surprise/></metadata></root>)",
     R"(<root><metadata>mixed<surprise/>bag</metadata></root>)",
     R"(<root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root>)",
     R"(<root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root>)",
     R"(<root>
  <metadata>Trying to change this</metadata>
  <body>
    <salad>Greek Caesar</salad>
   </body>
</root>)",
     //These no longer crash:
     R"(<root><metadata></metadata></root>)",
     R"(<root><metadata/></root>)",
     // more edge-cases in the predecessor chain
     R"(<root></root>)",
     R"(<root><no-metadata/></root>)",
     R"(<bogus/>)",
    };
}

int main() {
    auto const fmt = rapidxml::print_no_indenting;
    for (auto& xml : cases) {
        std::cout << "Input: " << xml << "\n";

        rapidxml::xml_document<> doc;
        doc.parse<0>(xml.data());

        if (auto root = doc.first_node()) {
            if (root->name() == std::string("root")) {
                if (auto* old_meta = root->first_node()) {
                    if (old_meta->name() == std::string("metadata")) {
                        root->remove_first_node();
                    } else {
                        std::cout << "WARNING: Not removing '" << old_meta->name() << "' element where 'metadata' expected\n";
                    }
                }

                auto newmeta = doc.allocate_node(rapidxml::node_element, "metadata", "changed");
                root->prepend_node(newmeta);
            } else {
                std::cout << "WARNING: '" << root->name() << "' found where 'root' expected\n";
            }
        }
        
        print(std::cout << "Output: ", doc, fmt);
        std::cout << "\n--\n";
    }
}

Prints

Input: <root><metadata>Trying to change this</metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><surprise/></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata>mixed<surprise/>bag</metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root>
  <metadata>Trying to change this</metadata>
  <body>
    <salad>Greek Caesar</salad>
   </body>
</root> ===
Output: <root><metadata>changed</metadata><body><salad>Greek Caesar</salad></body></root>
--
Input: <root><metadata></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata/></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><no-metadata/></root> ===
WARNING: Not removing 'no-metadata' element where 'metadata' expected
Output: <root><metadata>changed</metadata><no-metadata/></root>
--
Input: <bogus/> ===
WARNING: 'bogus' found where 'root' expected
Output: <bogus/>
--

SUMMARY

XML is extensible. It's Markup. It's Language. It's not simple :)
