I am having trouble scraping news article titles and descriptions from the following website: https://www.hrdive.com/. The code I tried did not work. Can someone help me fix it?
for i in data.xpath("//li[@class='row feed__item']"):
    title= i.xpath('//h3/a/text()')
    article = i.xpath('//p[@class="feed__description"]/text()')
    print(title, article)
The element you are targeting is still nested in several tags (div > h3 > a), so you need to use // to find it.
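for i in data.xpath("//li[@class='row feed__item']"):
    title = i.xpath(".//h3/a/text()")
    article = i.xpath(".//p[@class='feed__description']/text()")
    print(title, article)
Notice the .// at the beginning of the inner expressions: the double slash matches descendants at any depth, and the leading dot keeps the search inside the current li instead of restarting from the document root. The attribute value is single-quoted inside a double-quoted string so that the quotes do not clash.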
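To illustrate how the scope of the inner expressions behaves in lxml, here is a minimal sketch on an inline HTML sample. The markup is hypothetical and only imitates the structure described above (li > div > h3 > a plus a p.feed__description); the real page may differ. The helper name extract_items is also made up for this example.

```python
from lxml import html

# Hypothetical markup imitating the feed structure; the live page may differ.
SAMPLE = """
<html><body><ul>
  <li class="row feed__item">
    <div><h3><a href="#"> First title </a></h3></div>
    <p class="feed__description">First description</p>
  </li>
  <li class="row feed__item">
    <div><h3><a href="#">Second title</a></h3></div>
    <p class="feed__description">Second description</p>
  </li>
</ul></body></html>
"""

data = html.fromstring(SAMPLE)


def extract_items(tree):
    """Return one (title, description) pair per feed item."""
    items = []
    for li in tree.xpath("//li[@class='row feed__item']"):
        # normalize-space() yields a single trimmed string for the first
        # matching node, or an empty string when nothing matches.
        title = li.xpath("normalize-space(.//h3/a)")
        description = li.xpath(
            "normalize-space(.//p[@class='feed__description'])"
        )
        items.append((title, description))
    return items


items = extract_items(data)

first = data.xpath("//li[@class='row feed__item']")[0]
absolute_titles = first.xpath("//h3/a/text()")   # searches the whole document
relative_titles = first.xpath(".//h3/a/text()")  # searches inside this li only

for title, description in items:
    print(title, "|", description)
```

With an expression that starts with //, every iteration returns the titles of all items, which is a common reason for seeing the same long list printed repeatedly. The .// form returns only the match inside the current li.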
TIP:
You can test your XPath in the browser console. In your case, go to https://www.hrdive.com/, open the developer tools console, and use $x:
$x("//li[@class='row feed__item']//p[@class='feed__description']/text()")
// or
$x("//li[@class='row feed__item']//p[@class='feed__description']")[0].innerText
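The two console variants above differ in what they return: /text() selects only the text nodes directly under the element, while innerText gives the element's full visible text. A rough Python counterpart, assuming lxml and a hypothetical snippet whose description contains a nested tag, could look like this:

```python
from lxml import html

# Hypothetical snippet: the description contains a nested <em> tag.
snippet = html.fromstring(
    "<html><body>"
    "<p class='feed__description'>Hiring <em>slows</em> in Q3</p>"
    "</body></html>"
)
p = snippet.xpath("//p[@class='feed__description']")[0]

direct_text = p.xpath("text()")  # only text nodes directly under <p>
full_text = p.text_content()     # all nested text, roughly like innerText

print(direct_text)
print(full_text)
```

If the descriptions on the page contain inline markup, text_content() is probably the more convenient choice, since /text() would split the sentence into fragments.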