Google Apps Script stopped scraping data from Yahoo Finance

This Google Apps Script code, which scrapes historical data from Yahoo Finance, stopped working yesterday. It suddenly throws the error "No data." (data.length == 0).

I think the bug is on line 8 of the script, where the JSON is retrieved, but I don't have the necessary skill to fix it.

I would appreciate your help with this issue.

function Scrapeyahoo(symbol) {
  // Read historical data from Yahoo Finance
  const s = encodeURI(symbol); // so that it works with a url
  // turn it into an URL and call it
  const url = 'https://finance.yahoo.com/quote/' +s +'/history?p=' +s;
  const res = UrlFetchApp.fetch(url, { muteHttpExceptions: true }).getContentText();
  const $ = Cheerio.load(res);
  const data = $('script').toArray().reduce((ar, x) => {
    const c = $(x).get()[0].children;
    if (c.length > 0) {
      const d = c[0].data.trim().match(/({"context"[\s\S\w]+);\n}\(this\)\);/);
      if (d && d.length == 2) {
        ar.push(JSON.parse(d[1]));
      }
    }
    return ar;
  }, []);
  if (data.length == 0) throw new Error("No data.");
  const header = ["date", "open", "high", "low", "close", "adjclose", "volume"];
  
  var key = Object.entries(data[0]).find(([k]) => !["context", "plugins"].includes(k))[1];
  if (!key) return;
  const cdnjs = "https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js";
  eval(UrlFetchApp.fetch(cdnjs).getContentText());
  const obj1 = data[0];
  const obj2 = JSON.parse(CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj1.context.dispatcher.stores, key)));
  const ar = obj2.HistoricalPriceStore.prices.map(o => header.map(h => h == "date" ? new Date(o[h] * 1000) : (o[h] || "")));
  // ---

  return ar
}

The original code was modified in December according to this solution after it stopped working, but I can't find a solution to the current issue.

It seems that the specification for retrieving the key has changed. As a result, var key = Object.entries(data[0]).find(([k]) => !["context", "plugins"].includes(k))[1]; no longer returns the correct key, and an error occurs at CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj1.context.dispatcher.stores, key)).

Having looked at this script, I arrived at the following modified script for the current stage.

Modified script:

function Scrapeyahoo(symbol) {
  const s = encodeURI(symbol);
  const url = 'https://finance.yahoo.com/quote/' +s +'/history?p=' +s;

  // Extract the JSON assigned to root.App.main directly from the page HTML.
  var html = UrlFetchApp.fetch(url).getContentText().match(/root.App.main = ([\s\S\w]+?);\n/);
  if (!html || html.length == 1) return;
  var obj = JSON.parse(html[1].trim());
  // Build the key by joining the last four values whose property names are not "context" or "plugins".
  var key = [...new Map(Object.entries(obj).filter(([k]) => !["context", "plugins"].includes(k)).splice(-4)).values()].join("");
  if (!key) return;
  // Load crypto-js at run time.
  const cdnjs = "https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js";
  eval(UrlFetchApp.fetch(cdnjs).getContentText());
  // Decrypt the stores with the key and parse the result as JSON.
  const obj1 = JSON.parse(CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj.context.dispatcher.stores, key)));
  const header = ["date", "open", "high", "low", "close", "adjclose", "volume"];
  // Convert each price record to a row in header order; the date value is multiplied by 1000 to get milliseconds.
  const ar = obj1.HistoricalPriceStore.prices.map(o => header.map(h => h == "date" ? new Date(o[h] * 1000) : (o[h] || "")));
  return ar
}
  • In this case, Cheerio is not used.
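
The key step in the modified script can be looked at in isolation. The following is only a minimal sketch: the helper name deriveKey and the mock object are hypothetical and were made up for illustration, and the assumption that the key is the concatenation of the last four non-"context", non-"plugins" values reflects the page as it behaved when this answer was written.

```javascript
// Hypothetical helper (not part of the original script): isolates the
// key-derivation step so it can be tried without fetching the page.
function deriveKey(obj) {
  // Keep every top-level entry except "context" and "plugins".
  const candidates = Object.entries(obj)
    .filter(([k]) => !["context", "plugins"].includes(k));
  // Take the last four remaining values and concatenate them.
  return candidates.slice(-4).map(([, v]) => v).join("");
}

// Mock object for illustration only; the real page uses different,
// changing property names and values.
const mock = {
  context: { dispatcher: { stores: "ciphertext-placeholder" } },
  plugins: {},
  a1: "ab",
  b2: "cd",
  c3: "ef",
  d4: "gh",
};

console.log(deriveKey(mock)); // prints "abcdefgh"
```

Here slice(-4) is used instead of splice(-4) so that the intermediate array is not mutated; for this purpose the result is the same.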

Note:

  • In this sample, crypto-js is loaded with eval(UrlFetchApp.fetch(cdnjs).getContentText()). If you don't want to use it, you can instead copy and paste the script at https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js into the script editor of Google Apps Script. This also reduces the process cost.

  • I can confirm that this method works in the current situation (January 27, 2023). However, if the specification of the data or the HTML changes in a future update on the server side, this script might stop working. Please be careful about this.
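
Because the page specification can change again, it may help to make the extraction step fail with a descriptive message instead of silently returning nothing. The following is a rough sketch under assumptions: the helper name extractAppMain and the sample HTML are hypothetical, and the check on context.dispatcher.stores simply mirrors the property path used in the modified script.

```javascript
// Hypothetical helper (not part of the original script): extracts and parses
// the JSON assigned to root.App.main, and reports which step failed.
function extractAppMain(html) {
  const m = html.match(/root\.App\.main = ([\s\S]+?);\n/);
  if (!m) {
    throw new Error("root.App.main not found; the page layout may have changed.");
  }
  let obj;
  try {
    obj = JSON.parse(m[1].trim());
  } catch (e) {
    throw new Error("root.App.main is not valid JSON: " + e.message);
  }
  // Assumption: the encrypted data is still at context.dispatcher.stores.
  const stores = obj && obj.context && obj.context.dispatcher && obj.context.dispatcher.stores;
  if (!stores) {
    throw new Error("context.dispatcher.stores is missing; the data specification may have changed.");
  }
  return obj;
}

// Made-up sample for illustration; the real page is far larger.
const sampleHtml = '<script>\nroot.App.main = {"context":{"dispatcher":{"stores":"abc"}},"k":"v"};\n}(this));\n</script>';
console.log(Object.keys(extractAppMain(sampleHtml)).join(",")); // prints "context,k"
```

In Google Apps Script, the string passed to this helper would be the result of UrlFetchApp.fetch(url).getContentText().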
