Only showing values when scraping data using RegExp

I am trying to scrape data from the following website: Morningstar (https://www.morningstar.nl/nl/funds/snapshot/snapshot.aspx?id=F00000QIPC)

I want to scrape the EUR 20,66, but only display the '20,66'. I use the following code:

function import1() {
  var html, content = '';
  var response = UrlFetchApp.fetch("https://www.morningstar.nl/nl/funds/snapshot/snapshot.aspx?id=F00000QIPC");

  if (response) {
    html = response.getContentText();
    if (html) content = html.match(/<td class="line text">(.*?)<\/td>/)[1];
    Logger.log(content);
  }
  return content;
}

The displayed value is:

EUR 20,66

I tried adding \d to show only the number:

if (html) content = html.match(/<td class="line text">(\d.*?)<\/td>/)[1];

but somehow it displays a different value: 1,09%

It doesn't seem to recognize the 20,66 as a value. I tried different things that would display EUR20, but never found a way to remove the EUR (sadly, I cannot reproduce that scenario).

Any help solving this issue would be greatly appreciated!

It seems like you want to match EUR and then a number after a variable amount of whitespace.
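As for why the \d attempt returned 1,09%: the text of the first cell starts with "EUR", so a pattern that requires a digit right after the opening tag cannot match there, and the regex engine moves on to the next cell whose text does start with a digit. A minimal sketch of that behaviour, assuming a hypothetical two-cell fragment (sampleHtml is made up for illustration, not copied from the live page):

```javascript
// Hypothetical fragment standing in for the real page markup (an assumption).
var sampleHtml =
  '<td class="line text">EUR 20,66</td>' +
  '<td class="line text">1,09%</td>';

// Original pattern: captures the whole content of the first matching cell.
var firstCell = sampleHtml.match(/<td class="line text">(.*?)<\/td>/);

// With \d the first cell cannot match (its text starts with "E"),
// so the match is found at the next cell that starts with a digit.
var digitCell = sampleHtml.match(/<td class="line text">(\d.*?)<\/td>/);

console.log(firstCell[1]); // EUR 20,66
console.log(digitCell[1]); // 1,09%
```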

You may use

\s*EUR\s+(\d[\d,]*)

in place of the (.*?) group.

Details

  • \s* - 0 or more whitespace characters
  • EUR - the literal text EUR
  • \s+ - 1 or more whitespace characters
  • (\d[\d,]*) - Capturing group 1: a digit followed by 0 or more commas or digits.
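Put together, the pattern could be used roughly as follows. This is only a sketch: extractEurValue is a hypothetical helper name, sampleCells is made-up markup rather than the real page source, and the null check is an addition so that a missing match returns an empty string instead of throwing.

```javascript
// The pattern from above, placed inside the original <td> expression.
var EUR_CELL = /<td class="line text">\s*EUR\s+(\d[\d,]*)<\/td>/;

// Hypothetical helper: returns the captured number text, or '' if nothing matches.
function extractEurValue(html) {
  var m = html.match(EUR_CELL);
  return m ? m[1] : '';
}

// Made-up sample markup (an assumption, not the live page source).
var sampleCells =
  '<td class="line text">EUR 20,66</td>' +
  '<td class="line text">1,09%</td>';

console.log(extractEurValue(sampleCells));     // 20,66
console.log(extractEurValue('<td>none</td>')); // prints an empty line
```

In the Apps Script function from the question, the same helper could be called on the result of response.getContentText(), with Logger.log in place of console.log.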

