Grab data from website HTML table and transfer to Google Sheets using Apps Script
OK, I know there are similar questions to mine out there, but so far I have yet to find any answers that work for me. What I am trying to do is gather data from an entire HTML table on the web ( https://www.sports-reference.com/cbb/schools/indiana/2022-gamelogs.html ) and then parse it and transfer it to a range in my Google Sheet. The code below is probably the closest thing I've found so far because at least it doesn't error out, but it only finds one string or value, not the whole table. I've found other answers that use XmlService.parse, but that doesn't work for me, I believe because the HTML has formatting issues that it can't parse. Does anyone have an idea of how to edit what I have below, or a whole new idea that may work for this website?
function SAMPLE() {
  const url = "http://www.sports-reference.com/cbb/schools/indiana/2022-gamelogs.html#sgl-basic?";
  // Get all the static HTML text of the website
  const res = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  // Find the index of the first occurrence of the string we are searching for
  const index = res.search("td class");
  // Take a fixed-offset substring after the match to get the value,
  // skipping the HTML tags and classes (this returns a single cell only)
  const sub = res.substring(index + 92, index + 102);
  Logger.log(sub);
  return sub;
}
I understand that I can use IMPORTHTML natively in a Google Sheet, and that's what I'm currently doing. However, I am doing this for over 350 webpage tables, iterating through each one to load it and then copy the values to another sheet. Apps Script bogs down quite a bit when it is repeatedly waiting on Sheets to load an IMPORTHTML, grab some data, and then do it all over again for another URL. I apologize for any formatting issues in this post or things I've done wrong; this is my first time posting here.
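For reference, here is a minimal sketch of one way to pull every row instead of a single substring, assuming the table can be located with regular expressions in the fetched HTML. The function names (parseHtmlTable, decodeEntities, importTableToSheet), the sheet name argument, and the table id "sgl-basic" (guessed from the URL fragment above) are hypothetical and would need checking against the actual page source. It does not handle nested tables, and cells with colspan will not line up with the columns beneath them.

```javascript
// Decode the handful of HTML entities that commonly appear in table cells.
// '&amp;' is handled last so that '&amp;lt;' is not decoded twice.
function decodeEntities(text) {
  return text
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&');
}

// Pure helper: turn one <table> into a rectangular 2D array of strings.
// If tableId is given, the table with that id is used; otherwise the first
// table in the document. tableId is assumed to contain no regex metacharacters.
function parseHtmlTable(html, tableId) {
  const tablePattern = tableId
    ? new RegExp('<table\\b[^>]*\\sid="' + tableId + '"[^>]*>([\\s\\S]*?)</table>', 'i')
    : /<table\b[^>]*>([\s\S]*?)<\/table>/i;
  const tableMatch = html.match(tablePattern);
  if (!tableMatch) return [];

  const rows = [];
  const rowPattern = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
  let rowMatch;
  while ((rowMatch = rowPattern.exec(tableMatch[1])) !== null) {
    const cells = [];
    const cellPattern = /<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi;
    let cellMatch;
    while ((cellMatch = cellPattern.exec(rowMatch[1])) !== null) {
      // Strip any inner tags (links, spans) and keep only the text.
      const text = cellMatch[1].replace(/<[^>]+>/g, '');
      cells.push(decodeEntities(text).trim());
    }
    if (cells.length > 0) rows.push(cells);
  }

  // setValues() needs every row to have the same length, so pad short rows.
  const width = rows.reduce((max, row) => Math.max(max, row.length), 0);
  return rows.map(row => row.concat(new Array(width - row.length).fill('')));
}

// Apps Script wrapper; this part only runs inside Google Apps Script.
function importTableToSheet(url, sheetName, tableId) {
  const response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  if (response.getResponseCode() !== 200) {
    throw new Error('Fetch failed with HTTP ' + response.getResponseCode());
  }
  const values = parseHtmlTable(response.getContentText(), tableId);
  if (values.length === 0) return 0;
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
  // Write the whole table in one call, starting at A1.
  sheet.getRange(1, 1, values.length, values[0].length).setValues(values);
  return values.length;
}
```

Writing the array with a single setValues call avoids waiting on IMPORTHTML to recalculate for each page. For the 350-page case, the fetches could probably be grouped with UrlFetchApp.fetchAll, though whether the site tolerates that request rate is something to verify.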