
How to IMPORTXML into Google Sheets from site with next-click/load-more pagination

I'm trying to scrape a list of news stories for each story's topic, author, timestamp and headline. The site lists the 10 most recently published stories on a URL that ends in /all-stories, with the next 10 stories on /all-stories/page/2, the next 10 on /all-stories/page/3, and so on.

I have three IMPORTXML formulas that capture the data I need from the first page:

=IMPORTXML("https://www.example.org/all-stories", "//div[@class='post-item-river__content___2Ae_0']/a")

=IMPORTXML("https://www.example.org/all-stories","//li[@class='post-item-river__wrapper___2c_E- with-image']/div/div")

=IMPORTXML("https://www.example.org/all-stories","//li[@class='post-item-river__wrapper___2c_E- with-image']/div/h3")

How do I replicate this on page/2, page/3 and so on?

I haven't seen any way to do this in Google Sheets -- the closest I found was a kinda-similar question whose answer involved adding &=ROW() to the URL in the formula. But when I tried that, Sheets interpreted it as part of the URL and (rightly) returned nothing.

Try a simple array like this:

={IMPORTXML("https://www.sciencenews.org/all-stories", "//div[@class='post-item-river__content___2Ae_0']");
  IMPORTXML("https://www.sciencenews.org/all-stories/page/2", "//div[@class='post-item-river__content___2Ae_0']");
  IMPORTXML("https://www.sciencenews.org/all-stories/page/3", "//div[@class='post-item-river__content___2Ae_0']")}

