
How do I use XML::LibXML to parse XML using SAX?

The only example code I have found so far is so old that it no longer works (it uses deprecated classes). All I need is something basic that demonstrates:

  1. Loading and parsing the XML from a file

  2. Defining the SAX event handler(s)

  3. Reading the attributes or text values of the element passed to the event handler

How about the distribution itself?

Go to the XML::LibXML distribution page and click Browse.

Note the following caution in the documentation:

At the moment XML::LibXML provides only an incomplete interface to libxml2's native SAX implementation. The current implementation is not tested in production environment. It may causes significant memory problems or shows wrong behaviour.

There is also XML::SAX, which comes with nice documentation. I have used it a few times and it worked well for my purposes.

Sinan's suggestion was good, but it didn't connect all the dots. Here is a very simple program that I cobbled together:

File 1: The handlers (MySAXHandler.pm)

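  package MySAXHandler;
  use strict;
  use warnings;
  use base qw(XML::SAX::Base);

  # Called once, before any element events.
  sub start_document {
    my ($self, $doc) = @_;
    # process document start event
  }

  # Called for every opening tag; $el is a hash describing the element.
  sub start_element {
    my ($self, $el) = @_;
    # process element start event
    print "Element: " . $el->{LocalName} . "\n";
  }

  1;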

File 2: The test program (test.pl)

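#!/usr/bin/perl

use strict;
use warnings;
use FindBin;
use lib $FindBin::Bin;   # find MySAXHandler.pm next to this script (Perl 5.26+ no longer searches ".")
use XML::SAX;
use MySAXHandler;

# Let XML::SAX pick an installed parser and attach our handler to it.
my $parser = XML::SAX::ParserFactory->parser(
        Handler => MySAXHandler->new
);

$parser->parse_uri("some-xml-file.xml");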
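The handler above only prints element names. For the "text values" part of the question, here is a rough sketch of how the text inside each element could be collected with the characters and end_element events. The TextCollector package and its stack/seen keys are names I made up for illustration, and the XML is an inline sample rather than a file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler (not part of the original answer) that gathers
# the text found directly inside each element.
package TextCollector;
use base qw(XML::SAX::Base);

sub start_element {
    my ($self, $el) = @_;
    # Start an empty text buffer for the element that just opened.
    push @{ $self->{stack} }, '';
}

sub characters {
    my ($self, $chars) = @_;
    # A parser may deliver one run of text in several chunks, so append.
    $self->{stack}[-1] .= $chars->{Data} if @{ $self->{stack} || [] };
}

sub end_element {
    my ($self, $el) = @_;
    # The element is complete: record its name with the buffered text.
    my $text = pop @{ $self->{stack} };
    push @{ $self->{seen} }, [ $el->{LocalName}, $text ];
}

package main;
use XML::SAX::ParserFactory;

my $handler = TextCollector->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<row><cell>alpha</cell><cell>beta</cell></row>');

for my $pair (@{ $handler->{seen} }) {
    my ($name, $text) = @$pair;
    print "$name: '$text'\n";
}
```

Swapping parse_string for parse_uri("some-xml-file.xml") should give the same behaviour for a file. Text belonging to child elements is recorded with the child, not the parent.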

Note: how to get the value of an element attribute. This was not described anywhere in a way that I could use, and it took me over an hour to figure out the syntax, so here it is. In my XML file, the attribute was ss:Index, and the namespace definition for ss was xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet". The Attributes hash is keyed by the namespace URI in curly braces followed by the local name, not by the prefixed name, so in order to get the silly Index attribute, I needed this:

my $ssIndex = $el->{Attributes}{'{urn:schemas-microsoft-com:office:spreadsheet}Index'}{Value};

That was painful.
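To make that key format easier to discover, a handler can simply print every key in the Attributes hash. The sketch below does that and wraps the lookup in a small helper. AttrDumper, attr_value and the sample XML are hypothetical names and data of mine; only the namespace URI and the ss:Index attribute come from the case above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package AttrDumper;
use base qw(XML::SAX::Base);

use constant SS_NS => 'urn:schemas-microsoft-com:office:spreadsheet';

# Hypothetical helper: build the "{namespace-uri}local-name" key and
# return the attribute value, or undef if the attribute is absent.
sub attr_value {
    my ($el, $ns, $local) = @_;
    my $attr = $el->{Attributes}{ '{' . $ns . '}' . $local };
    return defined $attr ? $attr->{Value} : undef;
}

sub start_element {
    my ($self, $el) = @_;
    # Print every attribute key as the parser reports it.
    for my $key (sort keys %{ $el->{Attributes} }) {
        my $attr = $el->{Attributes}{$key};
        print "$el->{Name}: $key = $attr->{Value}\n";
    }
    # Remember any ss:Index value found on this element.
    my $idx = attr_value($el, SS_NS, 'Index');
    push @{ $self->{indexes} }, $idx if defined $idx;
}

package main;
use XML::SAX::ParserFactory;

my $xml = <<'XML';
<Row xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Cell ss:Index="3" kind="text"/>
</Row>
XML

my $handler = AttrDumper->new;
XML::SAX::ParserFactory->parser(Handler => $handler)->parse_string($xml);
print "ss:Index values: @{ $handler->{indexes} || [] }\n";
```

Attributes without a namespace, such as kind here, should show up under a key with empty braces ("{}kind"). Whether the xmlns:ss declaration itself is listed as an attribute may depend on which parser XML::SAX selects, so treat the printed key list as diagnostic output rather than something to rely on.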

XML::LibXML::SAX implements the Perl SAX interface, and it is well documented.
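As a minimal sketch of what that could look like, assuming XML::LibXML is installed, the parser class can be constructed directly with a handler instead of going through the factory. ElementCounter and the inline XML are made up for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler that counts how often each element name occurs.
package ElementCounter;
use base qw(XML::SAX::Base);

sub start_element {
    my ($self, $el) = @_;
    $self->{count}{ $el->{LocalName} }++;
}

package main;
use XML::LibXML::SAX;

my $handler = ElementCounter->new;

# Construct the libxml2-backed SAX parser explicitly.
my $parser = XML::LibXML::SAX->new(Handler => $handler);
$parser->parse_string('<list><item/><item/></list>');

for my $name (sort keys %{ $handler->{count} }) {
    print "$name: $handler->{count}{$name}\n";
}
```

For a file, parse_uri("some-xml-file.xml") takes the place of parse_string. An alternative that keeps the factory call is to set $XML::SAX::ParserPackage to "XML::LibXML::SAX" before calling XML::SAX::ParserFactory->parser. Keep the caution quoted from the documentation in mind for either route.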
