I'm doing a college project in R. How can I extract the text " | 20 de Novembro de 2015 " using the rvest package? I tried selecting the class "widget-info", but the result also includes the text of the nested "widget-author" element.
<div class="home-list-content">
<span class="widget-info">
<span class="widget-author">
Rúben Campanacho
</span>
| 20 de Novembro de 2015
</span>
<h2>
LG Pay é o sistema de pagamentos móveis da LG
</h2>
</div>
My code:
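library(rvest)  # provides read_html(), html_nodes(), html_text() and the %>% pipe

pagina <- read_html("http://www.tecnologia.com.pt")
data <- pagina %>%
  html_nodes(".widget-info") %>%  # text of nested .widget-author is included too
  html_text() %>%
  as.data.frame()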
The result:
Rúben Campanacho | 20 de Novembro de 2015
I want only: | 20 de Novembro de 2015
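You can strip the author name from the extracted text with a regular expression. The pattern below removes the first two whitespace-separated words, so it assumes the author name has exactly two words:
txt <- 'Rúben Campanacho | 20 de Novembro de 2015'
gsub('^((\\w+)[[:space:]]){2}', '', txt)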
Returns:
"| 20 de Novembro de 2015"
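Because the pattern above depends on the author name having exactly two words, here is a hedged sketch of two alternatives. The first uses only base R and drops everything before the first "|", whatever the length of the name. The second is optional and runs only if rvest is installed: it selects the text nodes that belong directly to the "widget-info" span with XPath, so the nested "widget-author" text is skipped. The variable names (date_part, date_only, snippet, own_text) and the inline HTML string are illustrative assumptions, not taken from the site.

```r
txt <- "Rúben Campanacho | 20 de Novembro de 2015"

# Remove every character before the first "|" (the pipe itself is kept)
date_part <- sub("^[^|]*", "", txt)
date_part
# "| 20 de Novembro de 2015"

# Remove the pipe as well and trim the surrounding whitespace
date_only <- trimws(sub("^[^|]*\\|", "", txt))
date_only
# "20 de Novembro de 2015"

# Optional: take only the span's own text nodes, skipping the nested author span
if (requireNamespace("rvest", quietly = TRUE)) {
  snippet <- paste0(
    '<span class="widget-info">',
    '<span class="widget-author">Rúben Campanacho</span>',
    ' | 20 de Novembro de 2015</span>'
  )
  nodes <- rvest::html_nodes(
    rvest::read_html(snippet),
    xpath = '//span[@class="widget-info"]/text()'
  )
  own_text <- trimws(rvest::html_text(nodes))
  own_text <- own_text[nzchar(own_text)]  # drop whitespace-only text nodes
  print(own_text)
}
```

On the live page the same XPath expression could be passed to html_nodes() in place of the ".widget-info" CSS selector, assuming the markup matches the fragment shown in the question.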