Google Apps Script stopped scraping data from Yahoo Finance
This Google Apps Script code, which scrapes historical data from Yahoo Finance, stopped working yesterday. It now throws the error "No data." (data.length == 0).
I think the bug is in line 8 of the script, where the JSON is extracted, but I don't have the skills to fix it.
I would appreciate your help with this issue.
function Scrapeyahoo(symbol) {
  // Read historical data from Yahoo Finance
  const s = encodeURI(symbol); // so that it works within a URL
  // turn it into a URL and call it
  const url = 'https://finance.yahoo.com/quote/' + s + '/history?p=' + s;
  const res = UrlFetchApp.fetch(url, { muteHttpExceptions: true }).getContentText();
  const $ = Cheerio.load(res);
  const data = $('script').toArray().reduce((ar, x) => {
    const c = $(x).get()[0].children;
    if (c.length > 0) {
      const d = c[0].data.trim().match(/({"context"[\s\S\w]+);\n}\(this\)\);/);
      if (d && d.length == 2) {
        ar.push(JSON.parse(d[1]));
      }
    }
    return ar;
  }, []);
  if (data.length == 0) throw new Error("No data.");
  const header = ["date", "open", "high", "low", "close", "adjclose", "volume"];
  var key = Object.entries(data[0]).find(([k]) => !["context", "plugins"].includes(k))[1];
  if (!key) return;
  const cdnjs = "https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js";
  eval(UrlFetchApp.fetch(cdnjs).getContentText());
  const obj1 = data[0];
  const obj2 = JSON.parse(CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj1.context.dispatcher.stores, key)));
  const ar = obj2.HistoricalPriceStore.prices.map(o => header.map(h => h == "date" ? new Date(o[h] * 1000) : (o[h] || "")));
  // ---
  return ar;
}
The original code was modified in December according to this solution after it stopped working, but I can't find a fix for the current issue.
It seems that the specification for retrieving the key has been changed. In this case,
var key = Object.entries(data[0]).find(([k]) => !["context", "plugins"].includes(k))[1];
doesn't return the correct key. As a result, an error occurs at the following expression:
CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj1.context.dispatcher.stores, key))
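To make the failure concrete, here is a minimal sketch on a mock object. The property names k1 to k4 and their values are hypothetical. The assumption that the key is now split across the last four properties other than "context" and "plugins" is inferred from the modified script in this answer, not from any Yahoo documentation.

```javascript
// Mock of the parsed page object; every name and value here is made up.
const mockPage = {
  context: { dispatcher: { stores: "<encrypted payload>" } },
  plugins: {},
  k1: "aa",
  k2: "bb",
  k3: "cc",
  k4: "dd",
};

// True for entries that are neither "context" nor "plugins".
const isKeyPart = ([k]) => !["context", "plugins"].includes(k);

// Old approach: take the value of the first matching entry only.
const oldKey = Object.entries(mockPage).find(isKeyPart)[1];

// Approach of the modified script: join the values of the last four matching entries.
const newKey = Object.entries(mockPage)
  .filter(isKeyPart)
  .slice(-4)
  .map(([, v]) => v)
  .join("");

console.log(oldKey); // "aa"
console.log(newKey); // "aabbccdd"
```

Under this assumption the old expression yields only one fragment of the key, so the AES decryption cannot produce valid JSON.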
At this stage, based on this script, the modified script is as follows.
function Scrapeyahoo(symbol) {
  const s = encodeURI(symbol);
  const url = 'https://finance.yahoo.com/quote/' + s + '/history?p=' + s;
  // Extract the JSON assigned to root.App.main directly from the HTML text.
  var html = UrlFetchApp.fetch(url).getContentText().match(/root.App.main = ([\s\S\w]+?);\n/);
  if (!html || html.length == 1) return;
  var obj = JSON.parse(html[1].trim());
  // Build the key by joining the values of the last 4 properties other than "context" and "plugins".
  var key = [...new Map(Object.entries(obj).filter(([k]) => !["context", "plugins"].includes(k)).splice(-4)).values()].join("");
  if (!key) return;
  // Load crypto-js at runtime.
  const cdnjs = "https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js";
  eval(UrlFetchApp.fetch(cdnjs).getContentText());
  // Decrypt the stores and parse the result as JSON.
  const obj1 = JSON.parse(CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj.context.dispatcher.stores, key)));
  const header = ["date", "open", "high", "low", "close", "adjclose", "volume"];
  // Convert each price record into a row that follows the header order.
  const ar = obj1.HistoricalPriceStore.prices.map(o => header.map(h => h == "date" ? new Date(o[h] * 1000) : (o[h] || "")));
  return ar;
}
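The last step of the script, which turns price records into rows, can be tried on its own. The sketch below uses made-up sample records; the second record, with no price fields, is a hypothetical example of an entry that lacks some of the header keys.

```javascript
const header = ["date", "open", "high", "low", "close", "adjclose", "volume"];

// Made-up sample records in the shape the script expects.
const samplePrices = [
  { date: 1674777600, open: 10, high: 12, low: 9, close: 11, adjclose: 11, volume: 1000 },
  { date: 1674691200, amount: 0.5, type: "DIVIDEND" }, // hypothetical record without price fields
];

// Same mapping as in the script: "date" (seconds since epoch) becomes a Date,
// and every other column is copied, with missing or falsy values replaced by "".
const toRows = (prices) =>
  prices.map((o) => header.map((h) => (h == "date" ? new Date(o[h] * 1000) : o[h] || "")));

const rows = toRows(samplePrices);
console.log(rows[0][1]); // 10
console.log(rows[1][1]); // ""
```

Because of the `|| ""` fallback, a genuine value of 0 (for example a zero volume) is also turned into an empty string.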
Cheerio is not used. In this sample, crypto-js is loaded with eval(UrlFetchApp.fetch(cdnjs).getContentText()). If you don't want to use eval, you can instead copy the script at https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js and paste it into the script editor of Google Apps Script. This also reduces the processing cost, because the library no longer has to be fetched on every run.
I can confirm that this method works in the current situation (January 27, 2023). However, if the specification of the data or the HTML is changed by a future update on the server side, this script might stop working. Please be careful about this.
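Since the page layout may change again, it can help to fail with a descriptive message instead of silently returning nothing. The following is only a sketch: extractMainJson is a hypothetical helper, not part of the script above, and the sample HTML string is made up. It takes the page text as an argument, so it can be tried without UrlFetchApp.

```javascript
// Hypothetical helper: pulls the root.App.main JSON out of the page text
// and reports which step failed when the layout is not as expected.
function extractMainJson(htmlText) {
  const m = htmlText.match(/root\.App\.main = ([\s\S]+?);\n/);
  if (!m) {
    throw new Error("root.App.main not found; the page layout may have changed.");
  }
  let obj;
  try {
    obj = JSON.parse(m[1].trim());
  } catch (e) {
    throw new Error("root.App.main is not valid JSON: " + e.message);
  }
  if (!obj.context || !obj.context.dispatcher || !obj.context.dispatcher.stores) {
    throw new Error("context.dispatcher.stores is missing; the data layout may have changed.");
  }
  return obj;
}

// Made-up page text for illustration.
const sampleHtml = '<script>root.App.main = {"context":{"dispatcher":{"stores":"abc"}}};\n</script>';
console.log(extractMainJson(sampleHtml).context.dispatcher.stores); // "abc"
```

In Apps Script, the argument would be the text returned by UrlFetchApp.fetch(url).getContentText().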