Web scraping a table with R
I am trying to scrape a table from the PitchBook website. Plain HTML scraping does not work because PitchBook loads the data with JavaScript rather than static HTML, so I need to execute the JS in order to extract the data from the JSON file.
This is my code:
library(httr)
library(jsonlite)
library(magrittr)
json=get("https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js") %>%
content(as='text') %>%
fromJSON()
I get this error:
Error in get("https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js") :
  object 'https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js' not found
Whatever data I try to load, it returns the same error. I would appreciate your help :) Thank you :)
You have called base::get and not httr::GET. So it should be:
library(httr)
library(jsonlite)
library(magrittr)
json <- GET(
"https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js"
) %>%
content("text") %>%
fromJSON()
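To see why the original call failed: R is case-sensitive, and base::get looks up an object by its name instead of making an HTTP request. A minimal sketch of that behaviour follows; the names `my_value` and `msg` and the example.com URL are placeholders for illustration, not part of the question.

```r
# base::get() returns the object bound to a name; it never touches the network.
my_value <- 42
get("my_value")

# Given a URL, get() searches for an object with that name and fails,
# which is the "object ... not found" error from the question.
msg <- tryCatch(
  get("https://example.com/data.json"),
  error = function(e) conditionMessage(e)
)
print(msg)
```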
However, I'm not entirely sure that your URL returns valid JSON. The corrected call on its own gives:
lexical error: invalid char in json text.
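One way to guard against this is to check the response and validate the body before parsing it. The sketch below is only an illustration: `parse_json_safely` and `fetch_json` are hypothetical helper names, and it assumes the URL is reachable without a login, which may not be true for PitchBook. A file ending in .js is usually JavaScript source, not JSON, so a NULL result would not be surprising here.

```r
# Hypothetical helper: parse text as JSON only if it validates, else return NULL.
parse_json_safely <- function(txt) {
  is_valid <- isTRUE(as.logical(jsonlite::validate(txt)))
  if (!is_valid) {
    return(NULL)
  }
  jsonlite::fromJSON(txt)
}

# Hypothetical wrapper: fetch a URL, stop on HTTP errors, report the
# content type, then try to parse the body as JSON.
fetch_json <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)  # raises an R error on 4xx/5xx responses
  message("Content type: ", httr::http_type(resp))
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  parse_json_safely(body)
}
```

Calling `fetch_json()` with the URL from the question would then return NULL instead of raising the lexical error if the body turns out not to be JSON, and the printed content type gives a hint about what the server actually sent.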