
get url table into a `data.frame` R-XML-RCurl

I'm trying to read the table from a URL into a `data.frame`. In other examples I found, the following code worked:

library(XML)
library(RCurl)
theurl <- "https://es.finance.yahoo.com/q/cp?s=BEL20.BR"
tables <- readHTMLTable(theurl)

As the warning says, the content doesn't seem to be XML:

Warning message: XML content does not seem to be XML: 'https://es.finance.yahoo.com/q/cp?s=BEL20.BR'

Alternatively, getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R") works, but I don't know how to extract the table. Any help would be appreciated.

EDIT: thanks to @har07, table <- readHTMLTable(getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R"))$yfncsumtab gives the output, but it still has to be filtered.
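Tables scraped this way typically mix real quote rows with blank rows and repeated header labels, so a filtering pass is needed. Here is a minimal sketch of that cleanup, using a small hand-made stand-in for the scraped table (the real yfncsumtab column names and row layout may differ):

```r
# Stand-in for the scraped table; the real one has extra
# non-data rows (all-NA rows, repeated header labels, etc.)
tab <- data.frame(
  Symbol = c(NA, "ABI.BR", "Symbol", "ACKB.BR"),
  Price  = c(NA, "52.30",  "Price",  "105.10"),
  stringsAsFactors = FALSE
)

# Drop rows that are entirely NA and rows that repeat the header labels
keep  <- !apply(is.na(tab), 1, all) & tab$Symbol != "Symbol"
clean <- tab[which(keep), ]

# Scraped columns come back as character; convert prices to numeric
clean$Price <- as.numeric(clean$Price)
clean
```

The same idea applies to the real table: identify the junk rows by a pattern (all NA, or a cell equal to its own column name) and subset them away before converting columns to numeric.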

You can get the table if you use getURL to fetch the document content first. readHTMLTable sometimes has trouble retrieving content directly from an HTTPS URL; in those cases, fetching the page with getURL and passing the result to readHTMLTable usually works:

> library(XML)
> library(RCurl)
> URL <- getURL("https://es.finance.yahoo.com/q/cp?s=BEL20.BR")
> rt <- readHTMLTable(URL, header = TRUE)
> rt

You might need to adjust the header argument and possibly others, but the tables are there.
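readHTMLTable also accepts an already-parsed document, which makes the effect of the header argument easy to see on a small self-contained example (the table below is made up for illustration, not taken from the Yahoo page):

```r
library(XML)

html <- "<html><body><table>
  <tr><th>Symbol</th><th>Price</th></tr>
  <tr><td>BEL20.BR</td><td>4300</td></tr>
</table></body></html>"

doc <- htmlParse(html)

# header = TRUE: the <th> row supplies the column names
with_header <- readHTMLTable(doc, header = TRUE,
                             stringsAsFactors = FALSE)[[1]]

# header = FALSE: the header row is kept as an ordinary data row
no_header <- readHTMLTable(doc, header = FALSE,
                           stringsAsFactors = FALSE)[[1]]
```

If the column names come out wrong on the real page, toggling header (and checking names(rt) to find the table you want, such as yfncsumtab) is usually enough.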
