Error when using the html_table function from the rvest package
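I am trying to use rvest to do some web scraping on a page behind a login. I have successfully connected to the web page and can access the HTML. (For those interested, I am scraping player statistics for fantasy rugby.)
I am trying to use this code to get the data into a data frame: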
loginsession %>%
read_html() %>%
html_elements('.general') %>%
html_table(fill = T) %>%
data.frame()
But I get this error:
Error in matrix(unlist(values), ncol = width, byrow = TRUE) : 'data' must be of a vector type, was 'NULL'
The HTML content looks like this:
[1] <div class="item hider general club" style="text-align: left"><strong>Club</strong></div>\n
[2] <div class="item hider general nationality" style="text-align: left"><strong>Nat</strong></div>\n
[3] <div class="item hider general salary"><strong>Salary</strong></div>\n
[4] <div class="item hider general points"><strong>Points</strong></div>\n
[5] <div class="item hider general selectionCount"><strong>Selected</strong></div>\n
[6] <div class="item hider general internationalCaps"><strong>Caps</strong></div>\n
[7] <div class="item hider general age"><strong>Age</strong></div>\n
[8] <div class="item hider general recommendation scout-report"><strong>Recm</strong></div>\n
[9] <div class="item hider general form-display" style="text-align: left"><strong>Form</strong></div>\n
[10] <div class="item hider general averageRating"><strong>Avg</strong></div>\n
[11] <div class="item hider general minutesPlayed"><strong>Mins</strong></div>\n
[12] <div class="item hider general pointsPerGame"><strong>Pts/80</strong></div>\n
[13] <div class="item hider general attackingPointsPerGame"><strong>Att/80</strong></div>\n
[14] <div class="item hider general defensivePointsPerGame"><strong>Def/80</strong></div>\n
[15] <div class="item hider general kickingPointsPerGame"><strong>K/80</strong></div>\n
[16] <div class="item hider general club" style="text-align: left">\n<div class="logo full"><a class="popup" data-url="/fant ...
[17] <div class="item hider general nationality" style="text-align: left">\n<i class="flag GB-ENG"></i><span class="hide-for ...
[18] <div class="item hider general salary">\n ...
[19] <div class="item hider general points">\n ...
[20] <div class="item hider general selectionCount">\n 13%\n ...
html_table requires an HTML table definition (with the tags TABLE, TH, TR, TD ...). Your HTML is a series of divisions (DIV), presumably styled with CSS (class = "...") so that they look like an HTML table.
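For contrast, here is a minimal sketch of the kind of markup html_table is designed to parse. The page content is a made-up sample built with minimal_html (not the site from the question), and rvest 1.0.0 or later is assumed, since the question already uses html_elements.

```r
library(rvest)

# Hypothetical sample: a real <table> with header and data cells.
# The values are invented for illustration only.
page <- minimal_html('
  <table>
    <tr><th>Club</th><th>Points</th></tr>
    <tr><td>Club A</td><td>42</td></tr>
  </table>
')

# html_table() understands <table>/<tr>/<th>/<td> markup,
# so it can turn this node into a data frame (a tibble in rvest >= 1.0.0).
tbl <- page %>%
  html_element("table") %>%
  html_table()

tbl
```

The div-based markup in the question has none of these tags, which is why html_table has nothing to build rows and columns from.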
You can try using CSS selectors to extract the desired elements as vectors. Example:
loginsession %>%
  read_html() %>%
  html_elements("div.item.hider.general > strong") %>%
  html_text2()
See a tutorial on CSS selectors for more details.
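Building on the selector approach, the following is a rough sketch of how the div cells might be reshaped into a data frame. It runs against a small made-up sample rather than the real page, and it rests on two assumptions that may not hold for the actual site: header cells are the only ones wrapping their label in a strong tag, and every data row has exactly as many cells as there are headers.

```r
library(rvest)

# Hypothetical sample mimicking the div-based layout in the question.
# The cell values are invented for illustration only.
page <- minimal_html('
  <div class="item hider general club"><strong>Club</strong></div>
  <div class="item hider general salary"><strong>Salary</strong></div>
  <div class="item hider general points"><strong>Points</strong></div>
  <div class="item hider general club">Club A</div>
  <div class="item hider general salary">10</div>
  <div class="item hider general points">42</div>
  <div class="item hider general club">Club B</div>
  <div class="item hider general salary">12</div>
  <div class="item hider general points">37</div>
')

# Header labels: the cells that wrap their text in <strong>
headers <- page %>%
  html_elements("div.item.hider.general > strong") %>%
  html_text2()

# All cells in document order: header cells first, then data cells
cells <- page %>%
  html_elements("div.item.hider.general") %>%
  html_text2()

# Drop the header cells, then fill a matrix row by row
# (assumes each row has exactly length(headers) cells)
values <- cells[-seq_along(headers)]
stats <- as.data.frame(matrix(values, ncol = length(headers), byrow = TRUE))
names(stats) <- headers

stats
```

All columns come back as character vectors here; converting numeric columns (for example with as.numeric) would be a separate step once the real cell contents are known.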