Using R to scrape data from a table possibly populated with JavaScript
Hello fellow R fanatics...
I've been using R to scrape data from a variety of websites for a while now; however, this one has me stumped.
I am trying to scrape the data from the following table: http://www.vigimeteo.com/PREV/obs/obs_seul.html?a=07005&b=
However, my efforts thus far have failed.
I have tried the following:
A combination of getURL and readHTMLTable:
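    library(RCurl)  # provides getURL()
    library(XML)    # provides readHTMLTable()

    # Download the raw page source, then parse any tables found in it
    thisURL <- "http://www.vigimeteo.com/PREV/obs/obs_seul.html?a=07005&b="
    theURL <- getURL(thisURL, .opts = list(ssl.verifypeer = FALSE))
    tables <- readHTMLTable(theURL)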
This results in an empty table.
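One likely explanation, which the title already suspects, is that the table rows are written by JavaScript after the page loads, so the source returned by getURL contains the table shell but no data cells. A minimal sketch of how to check for that is shown here. Both HTML strings and the helper name count_data_cells are hypothetical stand-ins for illustration, not the real page's markup.

```r
library(XML)

# Hypothetical markup standing in for the raw source that getURL() returns:
# the table exists, but a script is expected to fill it in later.
static_html <- paste0(
  '<html><body><table id="obs"><tr><th>Heure</th><th>Temp</th></tr></table>',
  '<script>/* fills the table after load */</script></body></html>'
)

# Hypothetical markup standing in for the page after the script has run.
rendered_html <- paste0(
  '<html><body><table id="obs"><tr><th>Heure</th><th>Temp</th></tr>',
  '<tr><td>12:00</td><td>18.5</td></tr></table></body></html>'
)

# Count the <td> cells that sit inside any <table> of an HTML string.
count_data_cells <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  length(getNodeSet(doc, "//table//td"))
}

count_data_cells(static_html)    # 0: nothing for readHTMLTable to extract
count_data_cells(rendered_html)  # 2: the values are present once rendered
```

Running count_data_cells(theURL) on the downloaded source would show whether the values are in the raw HTML at all.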
It appears that R's Selenium package (RSelenium) might offer a solution, but I haven't yet figured out how to use it here, probably because I'm unfamiliar with it.
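For reference, a browser-driven approach could look roughly like the sketch below. It assumes the RSelenium package, a local Firefox install, and a matching driver are available. The helper names get_rendered_source and tables_from_html, and the fixed five-second wait, are illustrative choices rather than anything from the original post.

```r
library(XML)

# Drive a real browser so the page's JavaScript runs, then return the
# rendered HTML. Needs RSelenium plus a local browser; only runs when called.
get_rendered_source <- function(url, browser = "firefox", wait_seconds = 5) {
  driver <- RSelenium::rsDriver(browser = browser, verbose = FALSE)
  on.exit({
    driver$client$close()   # close the browser window
    driver$server$stop()    # shut down the Selenium server
  }, add = TRUE)
  driver$client$navigate(url)
  Sys.sleep(wait_seconds)   # crude wait for the script to fill the table
  driver$client$getPageSource()[[1]]
}

# Parse every <table> in an HTML string into a list of data frames.
tables_from_html <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Intended usage (not run here, since it launches a browser):
# src <- get_rendered_source("http://www.vigimeteo.com/PREV/obs/obs_seul.html?a=07005&b=")
# tables <- tables_from_html(src)
```

A fixed sleep is the simplest way to wait for the script; polling until the table contains cells would be more robust but is left out to keep the sketch short.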
I feel like I'm just missing an essential part here... perhaps due to my lack of knowledge of JS and XML?
UPDATE:
I've noticed that if I right-click on the table element and use Chrome's "inspect", it shows HTML that has all of the table's values in it and would be very scrape-able... I'm still not sure how to get to this point in R though. Does anyone have hints on where to look in the "inspect" screen to help guide my progress?
The solution to this was the following.
Thanks to @XR SC for his answer here: web scraping using Chrome Dev Tools, which provided the basic approach.
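The code from that answer is not reproduced here, so what follows is only a generic sketch of the Dev Tools idea, not the author's actual solution. The usual steps are to open Chrome's Network tab, reload the page, find the request whose response contains the table values, and then request that URL directly from R. The sketch assumes the response is a JSON array of records; the sample body, the placeholder data_url, and the helper name records_to_df are all hypothetical.

```r
library(jsonlite)

# Convert the body of a data request (assumed here to be a JSON array of
# records) into a data frame.
records_to_df <- function(json_text) {
  as.data.frame(fromJSON(json_text), stringsAsFactors = FALSE)
}

# Hypothetical response body, for illustration only.
sample_body <- '[{"time":"12:00","temp":18.5},{"time":"13:00","temp":19.1}]'
obs <- records_to_df(sample_body)

# With a real request URL copied from the Network tab (placeholder below):
# data_url <- "http://example.com/path/found/in/network/tab"
# obs <- records_to_df(RCurl::getURL(data_url))
```

If the request turns out to return an HTML fragment rather than JSON, passing the response text to readHTMLTable would be the analogous step.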