
Scrape data using R from Javascript pop-up window

I want to scrape the contents of the pop-up window called "Constraints" from this site: https://dataviewer.pjm.com/dataviewer/pages/public/lmp.jsf (the pop-up window appears after clicking the Constraints link on the left side).

I need to get the Constraint, Contingency, and Shadow Price data shown below. Using SelectorGadget, I identified that info as "#frmConstraints\\:tblConstraints_data .col-left".

(Image: SelectorGadget result)

I can see the info I want here (the info with class "col-left"): (Image: developer tools code)
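As a side note on the selector: JSF element ids use a colon as a separator, and an unescaped colon in a CSS selector would start a pseudo-class, so the colon has to be escaped with a backslash. Inside an R or JavaScript string literal that backslash is itself doubled, which is why the selector reads `\\:`. A minimal sketch in JavaScript, where `cssIdSelector` is a hypothetical helper that only handles colons (a general escaper would need to cover other special characters too):

```javascript
// Hypothetical helper: builds an id selector, escaping only the colons.
// '\\:' in source code is the two-character string "\:" at runtime.
function cssIdSelector(id) {
  return '#' + id.replace(/:/g, '\\:');
}

var selector = cssIdSelector('frmConstraints:tblConstraints_data') + ' .col-left';
console.log(selector);
// prints: #frmConstraints\:tblConstraints_data .col-left
```

The printed string is what the CSS engine sees; the doubled backslash only exists in the source code.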

I ran this R code, to no avail. const_info returned nothing.

library(rvest)
library(stringr)
library(plyr)
library(dplyr)
library(ggvis)
library(knitr)
options(digits = 4)

session <-
  rvest::html_session('https://dataviewer.pjm.com/dataviewer/pages/public/lmp.jsf')

constraints_page <-
  rvest::follow_link(x = session, css = '#formLeftPanel\\:constraintLink')

constraints_html <- xml2::read_html(constraints_page)

const_info <- constraints_html %>%
  rvest::html_nodes('#frmConstraints\\:tblConstraints_data .col-left') %>%
  rvest::html_text()

I also ran PhantomJS to turn it into an HTML page, but the info I want is not there.

(Image: phantom_html page)

To get the above, I ran the following code using PhantomJS.

// scrape_dataviewer.js

var webPage = require('webpage');
var page = webPage.create();

var fs = require('fs');
var path = 'dataviewer.html';

page.open('https://dataviewer.pjm.com/dataviewer/pages/public/lmp.jsf', function (status) {
  var content = page.content;
  fs.write(path, content, 'w');
  phantom.exit();
});
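One possible reason the saved page lacks the table: the script above writes `page.content` as soon as the page loads, without ever clicking the Constraints link. Below is a hedged sketch of a variant that clicks the link and polls until the table body has rows. The element ids are taken from the CSS selectors earlier in this post; the output file name, the polling interval, and the `waitFor` helper are assumptions for illustration, and none of this has been verified against the live site.

```javascript
// scrape_constraints.js (hypothetical name); run with: phantomjs scrape_constraints.js

var PAGE_URL = 'https://dataviewer.pjm.com/dataviewer/pages/public/lmp.jsf';
var LINK_ID = 'formLeftPanel:constraintLink';         // id from the question's link selector
var TABLE_ID = 'frmConstraints:tblConstraints_data';  // id from the question's table selector

// Call check() every intervalMs until it returns true or maxTries is reached,
// then call onDone with the last result.
function waitFor(check, onDone, intervalMs, maxTries) {
  var tries = 0;
  var timer = setInterval(function () {
    tries += 1;
    var ok = check();
    if (ok || tries >= maxTries) {
      clearInterval(timer);
      onDone(ok);
    }
  }, intervalMs);
}

// The browser part only runs under PhantomJS, where `phantom` is a global.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var fs = require('fs');

  page.open(PAGE_URL, function (status) {
    if (status !== 'success') {
      console.log('Page failed to load');
      phantom.exit(1);
      return;
    }

    // Dispatch a click event on the link from inside the page context.
    var clicked = page.evaluate(function (id) {
      var link = document.getElementById(id);
      if (!link) { return false; }
      var evt = document.createEvent('MouseEvents');
      evt.initMouseEvent('click', true, true, window, 1, 0, 0, 0, 0,
                         false, false, false, false, 0, null);
      link.dispatchEvent(evt);
      return true;
    }, LINK_ID);

    if (!clicked) {
      console.log('Constraints link not found');
      phantom.exit(1);
      return;
    }

    // Poll every 500 ms, up to 40 times, for rows in the table body.
    waitFor(function () {
      return page.evaluate(function (id) {
        var body = document.getElementById(id);
        return !!body && body.getElementsByTagName('tr').length > 0;
      }, TABLE_ID);
    }, function (found) {
      if (found) {
        fs.write('constraints.html', page.content, 'w');
      } else {
        console.log('Constraints table did not appear in time');
      }
      phantom.exit(found ? 0 : 1);
    }, 500, 40);
  });
}
```

If this writes `constraints.html`, the saved file could then be parsed in R with `xml2::read_html()` and the same `.col-left` selector as before.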

I am familiar with R and rvest, and even PhantomJS. I see I might need the R package V8. But at the end of the day, I cannot get this info scraped.

I couldn't get to a full answer, but I've taken it as far as I could without additional research. This gets you to the table with the data you want, but I can only return the dates. I believe I need to select each date in the HTML session and then pull out the data associated with each date. Below is my code:

session <- rvest::html_session('https://dataviewer.pjm.com/dataviewer/pages/public/lmp.jsf')
constraints_page <- rvest::follow_link(x=session,css='#formLeftPanel\\:constraintLink')
constraints_html <- xml2::read_html(constraints_page)
constraints_html %>% 
  rvest::html_nodes('#frmConstraints') %>% 
  rvest::html_text()

I was going to add this as a comment, but I don't have enough reputation points. Sorry it's not a full answer!
