I'm currently using the XML package in R, together with the POST and xpathSApply functions, to do web crawling. When two or more values satisfy the search criteria, I'd like to take just the first value.
In the image, I'd like to extract only the "짜증 나" part, located between <li> and </li>. Currently, I'm using the following command:
tdReplace = xpathSApply(html, "//td[@class='tdReplace']/ul/li[2]/a", xmlValue)
without success. How should I go about fixing this?
Consider using rvest instead. It includes a function, html_node(), which returns the first instance of the matching node.
Without seeing your HTML it is difficult to test, but to parse HTML from the URL my_url, something like this should work:
library(rvest)
# my_url is a placeholder for the address of the page to scrape
my_url %>%
  read_html() %>%
  html_node("td.tdReplace ul li a") %>%  # first matching <a> only
  html_text()
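If you would rather stay with the XML package, the first match can probably be obtained in one of two ways: wrap the XPath expression in parentheses and append [1], or collect every match and index the result in R. Note that li[2] in the original expression selects the second li inside each ul, not the first match overall. The sketch below is untested against the real page; sample_html and its contents are made up to stand in for the HTML shown in the image.

```r
library(XML)

# Hypothetical HTML standing in for the page in the question.
sample_html <- "<html><body><table><tr><td class='tdReplace'><ul>
  <li><a href='#'>first</a></li>
  <li><a href='#'>second</a></li>
</ul></td></tr></table></body></html>"

doc <- htmlParse(sample_html, asText = TRUE, encoding = "UTF-8")

# Option 1: parenthesise the path so [1] applies to the whole node set,
# i.e. the first match in document order.
first_xpath <- xpathSApply(doc, "(//td[@class='tdReplace']/ul/li/a)[1]", xmlValue)

# Option 2: collect every match, then keep the first element in R.
all_values <- xpathSApply(doc, "//td[@class='tdReplace']/ul/li/a", xmlValue)
first_r <- all_values[1]

print(first_xpath)
print(first_r)
```

The parentheses matter: without them, a[1] would mean "the first a inside each li", which can still return several values when the list has several items.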