
Find and replace text using XML::LibXML

I want to find text enclosed by tildes (~) and prepend a string to it, e.g. replace ~it~ with ~T1it~ in an XML file, then save the result to another file. I know how to get the text using XPath and how to replace it, but I don't know how to put the replaced text back in place and write it out.

Here's my input XML:

<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p002">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p003">this is<math>~rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>

Here's my Perl script:

use strict;
use warnings;
use XML::LibXML;
#use XML::LibXML::Text;
use Cwd 'abs_path';
my $x_name=abs_path($ARGV[0]);
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1);
my $xpath_expression='/chapter/section/para/math';
my @nodes = $doc->findnodes( $xpath_expression );
foreach my $node(@nodes){
  my $content = $node->textContent;
  $content=~s#\~rom\~#~T1rom~#sg;
  print $content,"\n";
}

Here's my desired output:

<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~T1rom~This is roman~normal~</math>para</para>
<para id="p002">this is<math>~T1rom~This is roman~normal~</math>para</para>
<para id="p003">this is<math>~T1rom~This is roman~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
<para id="p005">this is<math>~rom~This is roman~normal~</math>para</para>
<para id="p006">this is<math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>

One possibility: select the text nodes themselves and use the setData method of XML::LibXML::Text:

#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;    

my $x_name = $ARGV[0];
my $doc = XML::LibXML->load_xml(location => $x_name, no_blanks => 1);
my $xpath_expression = '/chapter/section/para/math/text()';
my @nodes = $doc->findnodes( $xpath_expression );
for my $node (@nodes) {
    my $content = $node->data;    # raw text; toString would return the entity-escaped form
    $content =~ s#\~rom\~#~T1rom~#sg;
    $node->setData($content);
}
$doc->toFile($x_name . '.new', 1);
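To try the approach without touching any files, here is a minimal sketch that runs the same replacement on an inline string. The sample document is shortened, and the `a &amp; b` content is an assumption added for illustration: it shows why the text is read with `data` (unescaped text) and not `toString` (serialized text, which would cause the ampersand to be escaped twice when passed to `setData`).

```perl
use strict;
use warnings;
use XML::LibXML;

# Inline sample standing in for the input file (shortened; the "&amp;" is
# added here only to illustrate entity handling).
my $xml = <<'XML';
<?xml version="1.0"?>
<chapter>
<section>
<para id="p001">this is<math>~rom~a &amp; b~normal~</math>para</para>
</section>
<abstract>
<para id="p004">This is <math>~rom~This is roman~normal~</math>para</para>
</abstract>
</chapter>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The XPath selects only text nodes under /chapter/section, so the
# /chapter/abstract paragraphs are left unchanged.
for my $text ($doc->findnodes('/chapter/section/para/math/text()')) {
    my $content = $text->data;        # unescaped text of the node
    $content =~ s/~rom~/~T1rom~/g;    # prepend T1 inside the tildes
    $text->setData($content);         # write the new text back in place
}

# Serialize the modified document; toFile would write the same content to disk.
my $result = $doc->toString;
print $result;
```

Because the text nodes are modified in place, the rest of the tree (attributes, sibling text such as "this is" and "para") is serialized as it was parsed.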
