
Webscraping in R, “… does not exist in current working directory” error

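I'm trying to scrape some tables from ESPN.com using the xml2 package. As an example, I'd like to pull the Week 7 fantasy quarterback rankings into R. The URL is: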

http://www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-quarterback-rankings

I'm trying to do this with the "read_html()" function, since that's the one I'm most familiar with. Here are my syntax and the error:

> wk.7.qb.rk = read_html("www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks", which = 1)
Error: 'www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks' does not exist in current working directory ('C:/Users/Brandon/Documents/Fantasy/Football/Daily').

I also tried "read_xml()", only to get the same error:

> wk.7.qb.rk = read_xml("www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks", which = 1)
Error: 'www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-rankings-quarterbacks' does not exist in current working directory ('C:/Users/Brandon/Documents/Fantasy/Football/Daily').

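Why is R looking for this URL in the working directory? I've used this function with other URLs and had some success. What is it about this particular URL that makes R treat it differently from the others? And how can I change that?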
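One likely explanation is that the string passed to `read_html()` starts with `www.` rather than `http://`. Without a URL scheme, xml2 treats the string as a local file path, which would explain why it searches the working directory. Below is a minimal sketch of a workaround, not a definitive fix: `ensure_scheme()` and `read_page()` are hypothetical helper names introduced here for illustration, and `which = 1` is dropped on the assumption that it was carried over from `XML::readHTMLTable()`.

```r
# Hypothetical helper: prepend "http://" when the string has no URL scheme,
# so that it is treated as a URL rather than as a local file path.
ensure_scheme <- function(x) {
  if (grepl("^[A-Za-z][A-Za-z0-9+.-]*://", x)) x else paste0("http://", x)
}

# Hypothetical wrapper: normalise the address, then parse the page with xml2.
read_page <- function(address) {
  xml2::read_html(ensure_scheme(address))
}

# Usage (needs network access, so it is left commented out):
# page <- read_page("www.espn.com/fantasy/football/story/_/page/16ranksWeek7QB/fantasy-football-week-7-quarterback-rankings")
# tables <- xml2::xml_find_all(page, "//table")
```

Writing the full `http://` address directly in the `read_html()` call should have the same effect. The helper is only useful when the addresses come from somewhere else, such as a scraped list of links.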

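I got this error when running read_html in a loop over 20 pages. After page 20 the loop kept running with no URLs left, so it started calling read_html with NAs for the remaining iterations. Hope this helps!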
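The loop problem described above can be guarded against by dropping missing entries before reading. In R, indexing past the end of a vector returns `NA`, so a loop with a hard-coded upper bound can pass `NA` to `read_html()` without any warning. This is a rough sketch under that assumption. `read_pages()` is a hypothetical name, and its `reader` argument exists only so the function can be tried without network access.

```r
# Hypothetical helper: read every usable URL, skipping NA and empty strings.
# `reader` defaults to xml2::read_html but can be swapped for testing.
read_pages <- function(urls, reader = xml2::read_html) {
  valid <- urls[!is.na(urls) & nzchar(urls)]
  lapply(valid, reader)
}

# Usage with a stand-in reader (no network needed):
# read_pages(c("http://example.com/1", NA, ""), reader = toupper)
```

Iterating with `seq_along(urls)` instead of a fixed range such as `1:25` avoids producing the `NA` values in the first place.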
