Web scraping a table with R

Question

I am trying to scrape a table from the PitchBook website. A plain HTML request does not work because PitchBook loads the data with JavaScript rather than serving it in the HTML, so I need to execute the JS in order to extract the info from the JSON file. This is my code:

    library(httr)
    library(jsonlite)
    library(magrittr)
    json = get("https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js") %>%
      content(as = 'text') %>%
      fromJSON()

I get this error:

    Error in get("https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js") :
      object 'https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js' not found

Whatever data I try to load, it returns the same error. I would appreciate your help, thank you :)

Answer 1

You have called base::get, not httr::GET. R is case-sensitive, and base::get looks up an object by name, which is why the error says the object was not found. So it should be:

    library(httr)
    library(jsonlite)
    library(magrittr)
    json <- GET(
      "https://my.pitchbook.com/old/homeContent.64ea0536fd321cc1dd3b.js"
    ) %>%
      content("text") %>%
      fromJSON()
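
For context on the original error, here is a minimal sketch of what base::get does when it is handed a URL string. The variable name `answer` and the example.com URL are placeholders for illustration only; nothing here touches the network.

```r
# base::get() looks up an R object by its name, given as a string.
answer <- 42
value <- get("answer")  # retrieves the object called "answer"

# A URL is not the name of any object in the session, so the lookup fails.
# tryCatch() captures the error message instead of stopping the script.
url <- "https://example.com/data.json"  # placeholder URL, never requested
msg <- tryCatch(
  get(url),
  error = function(e) conditionMessage(e)
)
print(msg)  # an "object ... not found" message, as in the question
```

Because `get` and `GET` differ only in case, loading httr does not change what the lowercase call does.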

However, I'm not entirely sure that this URL returns valid JSON. The corrected code on its own gives:

    lexical error: invalid char in json text.
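
One way to avoid tripping over that error is to check the response before parsing it. The sketch below is only an illustration under assumptions: `parse_if_json` and `fetch_json` are hypothetical helper names, and the inline strings stand in for real responses. Since the URL ends in `.js`, the server probably returns JavaScript source rather than a JSON document, which would explain the lexical error.

```r
library(jsonlite)

# Hypothetical helper: parse text only if it is valid JSON; otherwise
# warn and return NULL instead of raising a lexical error.
parse_if_json <- function(txt) {
  ok <- jsonlite::validate(txt)
  if (!ok) {
    warning("Response is not valid JSON: ", attr(ok, "err"))
    return(NULL)
  }
  jsonlite::fromJSON(txt)
}

# Hypothetical helper: fetch a URL with httr, stop on HTTP errors,
# report the content type, then try to parse the body.
# (Defined here but not called, so the example runs offline.)
fetch_json <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)
  message("Content type: ", httr::http_type(resp))
  txt <- httr::content(resp, as = "text", encoding = "UTF-8")
  parse_if_json(txt)
}

# Offline illustration with inline strings standing in for responses.
parsed <- parse_if_json('{"name": "example", "values": [1, 2, 3]}')
not_json <- suppressWarnings(
  parse_if_json("var homeContent = function() {};")
)
```

If `fetch_json()` returns NULL for the page in question, the data most likely comes from a different request made by the script, or the page needs a tool that actually executes JavaScript.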

Answer 1 (accepted), 2019-05-15 06:56:13
