I'm trying to monitor changes on this page: at5.nl/zoek/pijp ("pijp" is the search keyword here). It shows a list of articles with the latest on top.
When I fetch this page with curl or wget (example linked below), the resulting file doesn't change over time or when I use different keywords. Looking at the file, it contains nothing of the content I see in my browser, just a lot of JavaScript. My first goal is simply to detect, from a script, when the browser output changes. The script would check every 5 minutes and send an email when something has changed.
As you might have guessed, I am definitely no web developer. Any pointers on how I could scrape the changes I'm after? (I'm fairly proficient with bash.)
Here's a link to the file I get with cURL:
https://drive.google.com/file/d/1-QzoTgbqc_m96YOx6qBh1eIBDyD5HfW_/view?usp=sharing
As @James pointed out, you could use the API-url and parse the resulting JSON to your liking. The JSON-parser xidel can help you out:
$ xidel -s \
    -d '{{"searchTerm":"pijp"}}' \
    "https://ditisdesupercooleappapi.at5.nl/api/search" \
    -e '$json/(articles)()[created gt (current-dateTime() - dateTime("1970-01-01T00:05:00Z")) div dayTimeDuration("PT1S")]'
"pijp" (as a value in a JSON object) is sent in a POST request to the API-url. The resulting JSON is then filtered so that only those articles are returned whose created attribute (an Epoch timestamp, compared here in seconds) is at most 5 minutes old.
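To tie this back to the "check every 5 minutes and send an email" goal, here is a minimal bash sketch that compares the current article list with the one saved on the previous run. All names in it (STATE_FILE, RECIPIENT, fetch_articles, snapshot_changed, check_once) are made up for illustration. It assumes that xidel is installed, that dropping the time filter from the query above still returns the article list, that the list is stable between runs when nothing new is published, and that a working `mail` command is available on the machine.

```shell
#!/usr/bin/env bash
# Sketch only. STATE_FILE, RECIPIENT and the function names are hypothetical.

STATE_FILE="${STATE_FILE:-${HOME:-/tmp}/.at5-pijp.last}"
RECIPIENT="${RECIPIENT:-you@example.com}"   # placeholder address

# Print the current article list. Same request as in the answer,
# but without the five-minute filter (assumption: this still works).
fetch_articles() {
  xidel -s \
    -d '{{"searchTerm":"pijp"}}' \
    "https://ditisdesupercooleappapi.at5.nl/api/search" \
    -e '$json/(articles)()'
}

# Read text on stdin and compare it with the copy in the given state file.
# Returns 0 and overwrites the state file when the text differs (or when no
# state file exists yet, so the very first run counts as a change).
# Returns 1 when nothing changed.
snapshot_changed() {
  local state_file="$1" new
  new="$(cat)"
  if [ -f "$state_file" ] && [ "$new" = "$(cat "$state_file")" ]; then
    return 1
  fi
  printf '%s\n' "$new" > "$state_file"
  return 0
}

# One monitoring round: fetch, compare, mail the new list on change.
check_once() {
  local articles
  articles="$(fetch_articles)" || return 1   # fetch failed: skip this round
  [ -n "$articles" ] || return 1             # empty answer: skip this round
  if printf '%s\n' "$articles" | snapshot_changed "$STATE_FILE"; then
    printf '%s\n' "$articles" | mail -s "at5.nl: results for pijp changed" "$RECIPIENT"
  fi
}

# check_once   # uncomment when the script is run from cron
```

With the last line uncommented, a crontab entry such as `*/5 * * * * /path/to/check-at5.sh` (the path is a placeholder) would run the check every 5 minutes. Comparing against a saved copy, rather than relying only on the 5-minute filter, means a delayed or skipped cron run does not cause an article to be missed.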