Web Scraping News Articles

Question

I am having trouble scraping news article titles and article descriptions from the following website: https://www.hrdive.com/ . The code I tried did not work. Can someone help me fix it?

   for i in data.xpath("//li[@class='row feed__item']"):
       title = i.xpath('//h3/a/text()')
       article = i.xpath('//p[@class="feed__description"]/text()')
       print(title, article)

Answer 1

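The elements you are targeting are nested several levels below each li (div > h3 > a), so you need a descendant search (//) rather than a direct child step. Because the expression is evaluated on each li, it should also start with a dot (.//): in XPath a bare leading // searches from the document root, so every iteration would return the matches for the whole page instead of the current item.

   for i in data.xpath("//li[@class='row feed__item']"):
       title = i.xpath('.//h3/a/text()')
       article = i.xpath('.//p[@class="feed__description"]/text()')
       print(title, article)

Notice the .// at the beginning of each inner expression. Also note that the quotes around feed__description must differ from the quotes that delimit the Python string; otherwise the line is a syntax error.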
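To see how the leading part of the expression changes the result, here is a minimal sketch that runs the loop against a small inline document instead of the live site. The markup in SAMPLE is hypothetical: it only imitates the li > div > h3 > a structure described above, and the real page may differ. It assumes lxml is installed and that data is an lxml element, as the data.xpath(...) calls suggest. The inner queries start with .// so that they stay relative to the current li.

```python
from lxml import html

# Hypothetical markup imitating the structure described in the answer;
# the real page at hrdive.com may be organised differently.
SAMPLE = """
<ul>
  <li class="row feed__item">
    <div><h3><a href="/a"> First title </a></h3></div>
    <p class="feed__description"> First description </p>
  </li>
  <li class="row feed__item">
    <div><h3><a href="/b"> Second title </a></h3></div>
    <p class="feed__description"> Second description </p>
  </li>
</ul>
"""

data = html.fromstring(SAMPLE)

rows = []
for item in data.xpath("//li[@class='row feed__item']"):
    # ".//" keeps the search inside the current <li>.
    title = item.xpath(".//h3/a/text()")
    article = item.xpath('.//p[@class="feed__description"]/text()')
    # xpath() returns a list, so guard against an empty result.
    rows.append((
        title[0].strip() if title else None,
        article[0].strip() if article else None,
    ))

# Compare an absolute and a relative query evaluated on the first <li>.
first_item = data.xpath("//li[@class='row feed__item']")[0]
absolute_count = len(first_item.xpath("//h3/a/text()"))   # whole document
relative_count = len(first_item.xpath(".//h3/a/text()"))  # this <li> only

for row in rows:
    print(row)
print(absolute_count, relative_count)
```

On this sample the absolute query finds both titles even though it is called on the first li, while the relative query finds only that item's title.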

TIP:

You can test an XPath expression in the browser console. In your case, open https://www.hrdive.com/, right-click the page and choose Inspect, switch to the Console tab, and use $x :

$x("//li[@class='row feed__item']//p[@class='feed__description']/text()")

// or

$x("//li[@class='row feed__item']//p[@class='feed__description']")[0].innerText
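The same two checks can be tried from Python when a browser is not at hand. The sketch below is only an approximation of $x: the helper x and the SNIPPET markup are hypothetical, and it assumes lxml is installed. It uses text_content() as a rough counterpart to innerText, which matters when the paragraph contains nested tags.

```python
from lxml import html


def x(tree, expr):
    """Evaluate an XPath expression on a parsed tree, loosely like $x(expr)."""
    return tree.xpath(expr)


# Hypothetical snippet; the description contains a nested <b> on purpose.
SNIPPET = """
<ul>
  <li class="row feed__item">
    <p class="feed__description">Some <b>bold</b> summary</p>
  </li>
</ul>
"""

tree = html.fromstring(SNIPPET)

# text() returns only the text nodes that are direct children of <p>.
text_nodes = x(
    tree,
    "//li[@class='row feed__item']//p[@class='feed__description']/text()",
)

# Selecting the element and calling text_content() also includes nested text.
paragraph = x(
    tree,
    "//li[@class='row feed__item']//p[@class='feed__description']",
)[0]
full_text = paragraph.text_content()

print(text_nodes)
print(full_text)
```

Here text() skips the word inside the b tag, whereas text_content() returns the full sentence, so the second form is usually the safer choice for descriptions that may contain inline markup.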


Answer 1 (accepted), 2020-03-27 03:01:10
