
Error when using the html_table function from the rvest package

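I am trying to do some web scraping with rvest on a page that sits behind a login. I have connected to the page successfully and can access its HTML. (For those interested, I am scraping fantasy rugby player statistics.)

I am trying to pass the data into a data frame with this code: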

loginsession %>% 
   read_html() %>% 
   html_elements('.general') %>%
   html_table(fill = T) %>% 
   data.frame()

But I get this error:

Error in matrix(unlist(values), ncol = width, byrow = TRUE) : 'data' must be of a vector type, was 'NULL'

The HTML content looks like this:

 [1] <div class="item hider general club" style="text-align: left"><strong>Club</strong></div>\n
 [2] <div class="item hider general nationality" style="text-align: left"><strong>Nat</strong></div>\n
 [3] <div class="item hider general salary"><strong>Salary</strong></div>\n
 [4] <div class="item hider general points"><strong>Points</strong></div>\n
 [5] <div class="item hider general selectionCount"><strong>Selected</strong></div>\n
 [6] <div class="item hider general internationalCaps"><strong>Caps</strong></div>\n
 [7] <div class="item hider general age"><strong>Age</strong></div>\n
 [8] <div class="item hider general recommendation scout-report"><strong>Recm</strong></div>\n
 [9] <div class="item hider general form-display" style="text-align: left"><strong>Form</strong></div>\n
[10] <div class="item hider general averageRating"><strong>Avg</strong></div>\n
[11] <div class="item hider general minutesPlayed"><strong>Mins</strong></div>\n
[12] <div class="item hider general pointsPerGame"><strong>Pts/80</strong></div>\n
[13] <div class="item hider general attackingPointsPerGame"><strong>Att/80</strong></div>\n
[14] <div class="item hider general defensivePointsPerGame"><strong>Def/80</strong></div>\n
[15] <div class="item hider general kickingPointsPerGame"><strong>K/80</strong></div>\n
[16] <div class="item hider general club" style="text-align: left">\n<div class="logo full"><a class="popup" data-url="/fant ...
[17] <div class="item hider general nationality" style="text-align: left">\n<i class="flag GB-ENG"></i><span class="hide-for ...
[18] <div class="item hider general salary">\n                                                                               ...
[19] <div class="item hider general points">\n                                                                               ...
[20] <div class="item hider general selectionCount">\n                                                            13%\n      ...

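html_table expects an HTML table definition (with the tags TABLE, TH, TR, TD, ...). Your HTML is a series of divisions (DIV), presumably styled with CSS (class = "...") to look like an HTML table.

You can try using CSS selectors to extract the desired elements as vectors instead. Example: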
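One way to confirm this is to check the tag names of the nodes the selector returns. The sketch below is only an illustration: it parses a small hypothetical HTML fragment (not the real page, and the club name is made up) and compares the `.general` nodes with a genuine table.

```r
library(rvest)

# Hypothetical fragment: one real table and two table-like divs
page <- read_html('
  <table><tr><th>Club</th></tr><tr><td>Bath</td></tr></table>
  <div class="item hider general club"><strong>Club</strong></div>
  <div class="item hider general points"><strong>Points</strong></div>
')

# The .general selector matches <div> nodes, not <table> nodes
div_tags <- page %>% html_elements(".general") %>% html_name()

# A genuine <table> node is what html_table is designed to parse
tbl <- page %>% html_element("table") %>% html_table()
```

If `div_tags` contains only "div", html_table has no rows or cells to work with, which is consistent with the error in the question.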

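# html_elements() returns every match; html_node() would return only the first
loginsession %>% read_html() %>%
  html_elements("div.item.hider.general > strong") %>%
  html_text2()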

See a tutorial on CSS selectors for more details.
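To get from those vectors to a data frame, one possible approach is to read the header labels and the cell texts separately and reshape the cells row by row. This is a minimal sketch under stated assumptions: the fragment is hypothetical, the player values are invented, and it assumes (as in the output shown in the question) that all header divs come before the value divs and that every row has one cell per header.

```r
library(rvest)

# Hypothetical fragment mimicking the page: header divs, then value divs
page <- read_html('
  <div class="item hider general club"><strong>Club</strong></div>
  <div class="item hider general points"><strong>Points</strong></div>
  <div class="item hider general club">Bath</div>
  <div class="item hider general points">42</div>
  <div class="item hider general club">Leicester</div>
  <div class="item hider general points">37</div>
')

# Header labels are the <strong> children of the styled divs
headers <- page %>%
  html_elements("div.item.hider.general > strong") %>%
  html_text2()

# Text of every cell, headers included, in document order
cell_text <- page %>%
  html_elements("div.item.hider.general") %>%
  html_text2()

# Assumption: the first length(headers) cells are the header row
values <- cell_text[-seq_along(headers)]

# Fill the matrix row by row, one column per header
stats <- as.data.frame(
  matrix(values, ncol = length(headers), byrow = TRUE),
  stringsAsFactors = FALSE
)
names(stats) <- headers
```

On the real page the value cells contain nested markup (logos, flags, padding whitespace), so the extracted text will probably need further cleaning, and all columns come back as character and would need converting, for example with `as.numeric()`.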
