How do I delete an element from an XML file after it is parsed?

Suppose the XML file is:

<class name=math>
<student>luke1</student>
...
<student>luke8000000</student>
</class>
<class name=english>
<student>mary1</student>
...
<student>mary1000000</student>
</class>

After class=math is parsed, I want to delete this element from the XML file so that when class=english is parsed, Twig will not go through the content of class=math.

The reason I want to do this is that, so far, even if I use TwigRoots => {"class[\@name='english']" => \&counter}, I still need to wait a long time before Twig starts to parse class=english, because it needs to go over each line of class=math (correct me if it does not need to go over each line). In the actual file I run there are several classes, and I do not want Twig to go through each line of class=math before finding the class it is really interested in.

Thanks in advance.

You could use the ignore_elts option when you build the twig:

ignore_elts
   This option lets you ignore elements when building the twig. This is useful
   in cases where you cannot use "twig_roots" to ignore elements, for example 
   if the element to ignore is a sibling of elements you are interested in.

           Example:

             my $twig= XML::Twig->new( ignore_elts => { elt => 1 });
             $twig->parsefile( 'doc.xml');

  This will build the complete twig for the document, except that all "elt" 
  elements (and their children) will be left out.

In this case you could write XML::Twig->new( ignore_elts => { 'class[@name="math"]' => 1 }, ... to skip those elements.
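As a rough illustration of the suggestion above, here is a minimal sketch that combines ignore_elts with a handler for the class of interest. The `<school>` wrapper element, the tiny inline document, and the `@students` array are assumptions made up for the example (the XML in the question has no single root element and no attribute quotes, so it would not parse as given).

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical, well-formed stand-in for the file in the question.
my $xml = <<'XML';
<school>
<class name="math"><student>luke1</student><student>luke2</student></class>
<class name="english"><student>mary1</student><student>mary2</student></class>
</school>
XML

my @students;    # illustration only: collects the names we care about

my $twig = XML::Twig->new(
    # Leave class[@name="math"] (and its children) out of the tree.
    ignore_elts   => { 'class[@name="math"]' => 1 },
    twig_handlers => {
        'class[@name="english"]' => sub {
            my ( $t, $class ) = @_;
            push @students, map { $_->text } $class->children('student');
            $t->purge;    # free what has been processed so far
        },
    },
);
$twig->parse($xml);

print join( ',', @students ), "\n";
```

The ignored elements never reach a handler and never take up memory in the tree, which is where the saving comes from.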

Note that these elements will not be included in the tree, but they will still be parsed. That speeds things up a bit, but not that much (how's that for quantitative data? ;--) In any case, the whole file needs to be parsed.

BTW, the XML in your question is not well-formed: there should be quotes around attribute values.

I haven't used the delete functionality with Twig, but check this link. It has some information about deleting nodes using Twig.

The relevant portion is here:

    }
    else {
        $para->delete;
    }
}

The last part of the paragraph handler deletes the twig from the result tree if the paragraph didn't contain a match for the specified keyword. This ensures that only those paragraphs containing a match will make it into the final output.

$para is an element passed into the handler.
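Applied to the XML in the question, the same idea might look like the sketch below. It is not the code from the linked article: the `<school>` root, the inline document, and the `$out` variable are assumptions for illustration. The handler is called once each class element has been fully parsed, so deleting it there keeps it out of the tree and the output, but it does not save the time spent parsing it.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical, well-formed stand-in for the file in the question.
my $xml = <<'XML';
<school>
<class name="math"><student>luke1</student><student>luke2</student></class>
<class name="english"><student>mary1</student><student>mary2</student></class>
</school>
XML

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once each <class> element has been completely parsed.
        class => sub {
            my ( $t, $class ) = @_;
            # Cut the element from the tree and free its memory.
            $class->delete if $class->att('name') eq 'math';
        },
    },
);
$twig->parse($xml);

# Serialize what is left of the tree; class math is gone.
my $out = $twig->sprint;
print $out, "\n";
```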

