
R readHTMLTable function not working

I have the following R code, with which I would like to obtain some names from this particular webpage.

library(RCurl)
library(XML)
x <- getURL("http://www.encyclopedia-titanica.org/titanic-passengers-crew-lived/country-17/england.html")
x_2 <- htmlParse(x)
x_3 <- readHTMLTable(x_2) 

However, whenever I look at the contents of x_3, I get the following:

x_3
named list()

It seems as though the readHTMLTable function is not able to obtain the tables. Can anyone help me obtain the names of the passengers from this web page without having to copy and paste? Much appreciated.
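
One way to narrow this down is to check whether the downloaded string contains any table markup at all before it is handed to the parser. The sketch below uses only base R. The helper name `count_table_tags` and the sample strings are made up for illustration and are not part of RCurl or XML.

```r
# Hypothetical helper: count opening <table> tags in a raw HTML string.
count_table_tags <- function(html_text) {
  hits <- gregexpr("<table\\b", html_text, ignore.case = TRUE, perl = TRUE)[[1]]
  # gregexpr() returns -1 when there is no match, so count only real positions.
  sum(hits > 0)
}

# Made-up inputs standing in for what getURL() might return.
with_table    <- "<html><body><table><tr><td>a</td></tr></table></body></html>"
without_table <- ""

print(count_table_tags(with_table))
print(count_table_tags(without_table))
```

Running `count_table_tags(x)` on the string from the question shows which side the problem is on: if it returns 0, the downloaded text had no tables in it, so the download step would be the place to look rather than readHTMLTable.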

library(rvest)
library(dplyr)

base <- "http://www.encyclopedia-titanica.org/titanic-passengers-crew-lived/country-17/england.html"

# I use an older version of rvest; in newer versions `html()` is replaced by `read_html()`. Link to the git repo below:
# https://github.com/hadley/rvest/blob/7d65d84e013b1bb3827ae0a2e05ddaed4875c112/R/parse.R
data_df <- (html(base) %>% html_table)[[1]]

knitr::kable(summary(data_df))

    |   |    Name         |    Age          | Class/Dept      |   Ticket        |   Joined        |    Job          |Boat [Body]      |             |
    |:--|:----------------|:----------------|:----------------|:----------------|:----------------|:----------------|:----------------|:------------|
    |   |Length:1190      |Length:1190      |Length:1190      |Length:1190      |Length:1190      |Length:1190      |Length:1190      |Mode:logical |
    |   |Class :character |Class :character |Class :character |Class :character |Class :character |Class :character |Class :character |NA's:1190    |
    |   |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Mode  :character |Mode  :character |NA           |
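
Since the question asks for the passenger names specifically, here is a minimal sketch of pulling out just the `Name` column with the newer `read_html()` spelling. To keep it runnable without network access it parses a small inline HTML string. That string and its rows ("Passenger A", "Passenger B") are invented placeholders and not data from the site. For the real page, `read_html(base)` would take the place of `read_html(sample_html)`, assuming the page still serves the same table.

```r
library(rvest)

# Hypothetical stand-in for the downloaded page: a tiny table with made-up rows.
sample_html <- '
<html><body>
<table>
  <tr><th>Name</th><th>Age</th><th>Class/Dept</th></tr>
  <tr><td>Passenger A</td><td>30</td><td>3rd Class</td></tr>
  <tr><td>Passenger B</td><td>25</td><td>Deck Crew</td></tr>
</table>
</body></html>'

page   <- read_html(sample_html)   # parse the HTML text
tables <- html_table(page)         # one data frame per <table> element

# Keep only the Name column of the first table, as a plain character vector.
passenger_names <- as.character(tables[[1]]$Name)
print(passenger_names)
```

The same `[[1]]` indexing as in the answer above picks the first table; if the page holds several tables, `length(tables)` shows how many there are to choose from.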
