
Rvest to manipulate and extract value from HTML

Using R, I have getBrandlist as HTML:

 <div>\n  <span class="txt edittext">BrandName1 </span>\n  <span 
 class="cnt" data-val="116">(42)</span>\n</div>
 <div>\n  <span class="txt edittext">BrandName2 </span>\n  <span 
 class="cnt" data-val="116">(62)</span>\n</div> 
 ......

Now I have the number 62, and I want to extract the brand name that corresponds to it (BrandName2). I tried html_node(getBrandlist, css = '.cnt') %>% html_attr(), but that did not get me there. How do I go about this? Any help will be greatly appreciated.

You can do

library(rvest)
doc <- read_html('<div>\n  <span class="txt edittext">BrandName1 </span>\n  <span 
 class="cnt" data-val="116">(42)</span>\n</div>
 <div>\n  <span class="txt edittext">BrandName2 </span>\n  <span 
 class="cnt" data-val="116">(62)</span>\n</div> ')
# Find the span whose text is "(62)", then take the span that precedes it inside the same div
html_node(doc, xpath = "//span[text()='(62)']/preceding-sibling::span") %>% html_text()
# [1] "BrandName2 "
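
If the count is not always 62, the same XPath idea can be wrapped in a small function. The following is only a sketch: brand_for_count is a hypothetical helper name (not part of rvest), and it assumes every count span has class "cnt" and sits right after its brand span, as in the markup above. It also trims the trailing space from the brand name.

```r
library(rvest)

# Same structure as the markup in the question, reduced to two brands.
doc <- read_html('<div><span class="txt edittext">BrandName1 </span><span class="cnt" data-val="116">(42)</span></div>
<div><span class="txt edittext">BrandName2 </span><span class="cnt" data-val="116">(62)</span></div>')

# Hypothetical helper: return the brand name(s) whose count span reads "(<count>)".
brand_for_count <- function(doc, count) {
  # normalize-space(.) ignores stray whitespace around the "(62)" text;
  # preceding-sibling::span[1] picks the span immediately before the count.
  xp <- sprintf(
    "//span[@class='cnt'][normalize-space(.)='(%s)']/preceding-sibling::span[1]",
    count
  )
  # trimws() drops the trailing space that the original markup carries.
  trimws(html_text(html_nodes(doc, xpath = xp)))
}

brand_for_count(doc, 62)
# [1] "BrandName2"
```

Because html_nodes() is used rather than html_node(), a count that appears for several brands returns all of them, and a count that does not appear returns an empty character vector instead of NA.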


 