
Difference between XmlService and IMPORTXML

When trying to parse HTML as XML in Google Apps Script, this code:

var yahoo = 'http://finance.yahoo.com/q?s=aapl';
var xml = UrlFetchApp.fetch(yahoo).getContentText(); 
var document = XmlService.parse(xml);

will return an error like this:

Error on line 20: The entity name must immediately follow the '&' in the entity reference. (line 13, file "")

Presumably this is because the HTML is not XML-compliant in some way on line 20. What surprises me is that when you do the same thing in Google Sheets and also supply an XPath, the HTML is parsed as XML without problems:

=IMPORTXML("http://finance.yahoo.com/q?s=aapl","//div[@class='title']")

will return "Apple Inc. (AAPL)". I assume that the Sheets function has some way of cleaning the HTML to make it XML-compliant.

  • Do you think that could be the case?
  • If yes, do you have an idea how I could adapt the XML parser in Apps Script so that I can fetch HTML from Yahoo Finance and treat it as XML?

Thanks in advance!

The new XmlService cannot do a lenient parse, so there is no way to do this with it right now. However, you can still use the old Xml service, which supports lenient parsing (perhaps IMPORTXML uses it as well). Code that works:

var yahoo = 'http://finance.yahoo.com/q?s=aapl';
var xml = UrlFetchApp.fetch(yahoo).getContentText(); 
var document = Xml.parse(xml, true);

There is also an issue report about the missing lenient-parse option in the new XmlService: https://code.google.com/p/google-apps-script-issues/issues/detail?id=3727

So I suggest you use the old way and keep an eye on this issue.
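
For the specific error quoted in the question (a bare '&' that does not start an entity reference), a pre-cleaning step is one possible workaround when only the strict parser is available. The following is a minimal sketch, not a general HTML-to-XML fixer: `escapeBareAmpersands` is a hypothetical helper name, not part of Apps Script, and it only rewrites ampersands that are not followed by something shaped like `&name;`, `&#123;` or `&#x1F;`. Named HTML entities such as `&nbsp;` are left untouched, and a strict XML parser may still reject those, as well as unclosed tags and other HTML-only constructs.

```javascript
// Hypothetical helper (an assumption for illustration, not an Apps Script API).
// Escapes every '&' that does not begin a well-formed entity reference,
// so that a strict XML parser no longer trips over it.
function escapeBareAmpersands(markup) {
  // The negative lookahead skips references of the form
  //   &name;   &#123;   &#x1F;
  // and lets every other '&' be replaced by '&amp;'.
  var bareAmpersand = /&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)/g;
  return markup.replace(bareAmpersand, '&amp;');
}
```

In Apps Script the cleaned string could then be handed to the strict parser, for example `XmlService.parse(escapeBareAmpersands(xml))`. Whether that is enough for a given page depends on what else in the HTML is not well-formed XML.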
