XML parsing in Google Apps Script

I have a problem with the XmlService.parse function in Google Apps Script. I am writing a script that needs to parse the emails in my inbox. I sent myself several test emails whose body has this format:

<div dir="ltr">test 1<div><br></div></div>

but if I use this line

var doc = XmlService.parse(messages[j].getBody());

I get this error

Error on line 1: The element type "br" must be terminated by the matching end-tag "</br>". (line 18, file "Code")

This is understandable, because the message contains only <br> with no closing tag. Is there a way to solve this problem, or do I have to parse the message some other way? Thank you in advance.

edit: I have the same problem with the img tag

Error Occured: Error on line 38: The element type "img" must be terminated by the matching end-tag "</img>".

I need to parse the text inside the red frame in the email (image: email to parse).

The old script used this function:

Xml.parse(messag.getBody(),true)

However, this function is deprecated. I tried to use

XmlService.parse(messages.getBody());

as mentioned above, but I get errors for unpaired HTML tags. The message returned by .getBody() is here (link: getbody email).

Could someone help me? Thanks once more.

XmlService cannot parse HTML; it only parses well-formed XML. There are, however, HTML parsing libraries for Node.js. You can take one of those modules, run it through browserify, make a minor modification to the generated source, and get an Apps Script library that parses HTML.

https://github.com/fb55/htmlparser2

My generated library:

1TLbGgQBCztnB0lOhcTYKg2UpXtpdDwocvfcx44w1tqFnHDJC5ZXy_BDo
https://github.com/Spencer-Easton/Apps-Script-htmlparser2-library
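As an aside to the library approach: the errors in the question come from HTML void elements (br, img and similar) that have no end tag. For very simple bodies like the test message, one possible stopgap is to rewrite those tags as self-closing before calling XmlService.parse. The sketch below is not part of the original answer; closeVoidTags and VOID_TAGS are hypothetical names, and the regular expression assumes that attribute values never contain a ">" character. Real email bodies often contain other constructs that are not valid XML (for example &nbsp; or unquoted attributes), so this may still fail where a real HTML parser would not.

```javascript
// Hypothetical helper, not part of XmlService: rewrites HTML void elements
// such as <br> and <img ...> as self-closing tags (<br/>, <img .../>).
// Assumption: attribute values do not contain a '>' character.
var VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
                 'link', 'meta', 'source', 'track', 'wbr'];

function closeVoidTags(html) {
  // Matches <tag>, <tag attrs>, <tag/> and <tag attrs /> for the tags above.
  var pattern = new RegExp(
      '<(' + VOID_TAGS.join('|') + ')\\b([^>]*?)\\s*/?>', 'gi');
  return html.replace(pattern, function (match, name, attrs) {
    return '<' + name + attrs + '/>';
  });
}

// Assumed usage in Apps Script (only for simple, otherwise well-formed bodies):
//   var doc = XmlService.parse(closeVoidTags(messages[j].getBody()));
```

Tags that are already self-closing are left equivalent, so running the function twice gives the same result as running it once.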

Example code modified from the htmlparser2 readme:

function myFunction() {   
  var htmlparser = htmlparser2.init();
  var parser = new htmlparser.Parser({
    onopentag: function(name, attribs){
      if(name === "div"){
        Logger.log("found div");
      }
    },
    ontext: function(text){
      Logger.log("-->" + text);
    },
    onclosetag: function(tagname){
      if(tagname === "div"){
        Logger.log("End Div");
      }
    }
  }, {decodeEntities: true});
  parser.write('<div dir="ltr">test 1<div><br></div></div>');
  parser.end();  
}
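The example above only logs what the parser finds. To actually extract the text of an email, the same callbacks can collect text instead. The following is a minimal sketch, not part of the original answer: createTextCollector is a hypothetical helper, and the choice of which tags produce a line break (div, p, li, tr and br) is an assumption to adjust for your emails. It relies only on the onopentag, ontext and onclosetag callbacks used above.

```javascript
// Hypothetical helper: builds htmlparser2-style callbacks that collect text
// nodes instead of logging them, and returns the text one line per block.
function createTextCollector() {
  // Assumption: closing one of these tags ends a line of text.
  var BLOCK_TAGS = ['div', 'p', 'li', 'tr'];
  var chunks = [];
  return {
    handlers: {
      onopentag: function (name) {
        if (name === 'br') {
          chunks.push('\n');
        }
      },
      ontext: function (text) {
        // The parser may deliver one text node in several pieces,
        // so pieces are stored unchanged and joined later.
        chunks.push(text);
      },
      onclosetag: function (name) {
        if (BLOCK_TAGS.indexOf(name) !== -1) {
          chunks.push('\n');
        }
      }
    },
    getText: function () {
      // Join the pieces, trim each line, and drop empty lines.
      return chunks.join('')
        .split('\n')
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line.length > 0; })
        .join('\n');
    }
  };
}

// Assumed usage in Apps Script, with the library above added under the
// identifier htmlparser2:
//   var collector = createTextCollector();
//   var htmlparser = htmlparser2.init();
//   var parser = new htmlparser.Parser(collector.handlers, {decodeEntities: true});
//   parser.write(messages[j].getBody());
//   parser.end();
//   Logger.log(collector.getText());
```

Keeping the handlers in a separate object also makes them easy to exercise without the parser, by calling the callbacks directly.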
