Web scraping but not scraping changes

Question

I'm trying to monitor changes on this page: at5.nl/zoek/pijp. Here "pijp" is the search keyword. The page shows a list of articles with the latest on top.

When I scrape this page with curl or wget (example attached), the resulting file doesn't change over time or with different keywords. Examining the file, I find nothing related to the content I see in my browser, only a lot of JavaScript. My first goal is simply to detect from a script when the browser output changes. The script would check every 5 minutes and send an e-mail when something changes.

As you might have guessed, I am definitely no web developer. Any pointers on how I could scrape the changes I'm after? (I'm fairly proficient with bash.)

Here's a link to the file I get with cURL:

https://drive.google.com/file/d/1-QzoTgbqc_m96YOx6qBh1eIBDyD5HfW_/view?usp=sharing

Answer 1

As @James pointed out, you could query the API URL directly and parse the resulting JSON to your liking. The JSON parser xidel can help you out:

$ xidel -s \
  -d '{{"searchTerm":"pijp"}}' \
  "https://ditisdesupercooleappapi.at5.nl/api/search" \
  -e '$json/(articles)()[created gt (current-dateTime() - dateTime("1970-01-01T00:05:00Z")) div dayTimeDuration("PT1S")]'

"pijp" (as a value in a JSON object) is sent (POST-request) to the API-url, after which the resulting JSON is parsed in such a way that it will only return those articles that have a created attribute whose value (an Epoch timestamp ) is only 5 minutes old.
