Web Scraping in R using getURL
Hi, I am trying to read the data on the world's powerful brands from the link "http://www.forbes.com/powerful-brands/list/3/#tab:rank" into a data frame using R.
I am a beginner, so I tried the following code to retrieve the data:
library(XML)
library(RCurl)
# Download the page, then try to parse any HTML tables in it
forbe <- 'http://www.forbes.com/powerful-brands/list/#tab:rank'
data <- getURL(forbe)
data
htmldata <- readHTMLTable(data)
htmldata
Could anyone please help me retrieve the data from the webpage mentioned above?
They use XHR requests to populate the page via JavaScript, so the HTML returned by getURL does not contain the table that readHTMLTable is looking for. Use the browser's Developer Tools to inspect the network requests and grab the JSON directly:
brands <- jsonlite::fromJSON("http://www.forbes.com/ajax/list/data?year=2015&uri=powerful-brands&type=organization")
str(brands)
## 'data.frame': 100 obs. of 10 variables:
## $ position : int 12 44 83 87 13 22 1 39 16 72 ...
## $ rank : int 12 44 83 87 13 22 1 39 16 72 ...
## $ name : chr "AT&T" "Accenture" "Adidas" "Allianz" ...
## $ uri : chr "att" "accenture" "adidas" "allianz" ...
## $ imageUri : chr "att" "accenture" "adidas" "allianz" ...
## $ industry : chr "Telecom" "Business Services" "Apparel" "Financial Services" ...
## $ revenue : num 132400 32800 14900 131600 87500 ...
## $ oneYearValueChange: int 17 14 -14 -6 32 13 17 1 -5 -1 ...
## $ brandValue : num 29100 12000 6800 6600 28100 ...
## $ advertising : num 3272 88 NA NA 3300 ...
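Once the data frame is loaded, a typical next step is to reorder it, since the rows in the output above appear to be sorted by name rather than by rank. The sketch below is only an illustration under stated assumptions: `forbes_list_url` is a hypothetical helper (not part of any package) that rebuilds the URL used above from its query parameters, and `brands_sample` is a four-row stand-in copied from the `str()` output so the code runs without a network connection. Whether the endpoint accepts other parameter values has not been verified.

```r
# Hypothetical helper: rebuilds the endpoint URL used above from its query
# parameters. Support for other years or lists is an unverified assumption.
forbes_list_url <- function(year = 2015, uri = "powerful-brands",
                            type = "organization") {
  sprintf("http://www.forbes.com/ajax/list/data?year=%d&uri=%s&type=%s",
          as.integer(year), uri, type)
}

# Stand-in for the first four rows of `brands`, copied from the str() output.
brands_sample <- data.frame(
  rank     = c(12L, 44L, 83L, 87L),
  name     = c("AT&T", "Accenture", "Adidas", "Allianz"),
  industry = c("Telecom", "Business Services", "Apparel", "Financial Services"),
  revenue  = c(132400, 32800, 14900, 131600),
  stringsAsFactors = FALSE
)

# Order by revenue, highest first, and keep a few readable columns.
by_revenue <- brands_sample[order(-brands_sample$revenue),
                            c("rank", "name", "industry", "revenue")]
print(by_revenue)
```

On the full data frame, `brands[order(brands$rank), ]` would list the brands in rank order in the same way.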