I am fairly new to web scraping, but as part of a project I am working on I'm trying to scrape details of classes from this timetable: https://www101.dcu.ie/timetables/feed.php?prog=case&per=2&week1=19&week2=30&day=7&hour=1-20&template=Studprog I'm going to try to use jsoup, but I am not sure how exactly to parse the data so that only the relevant information is returned. Any help or insight would be greatly appreciated.
You can use iconv-lite and cheerio.
I made a functional example for you to see:
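const rp = require('request-promise');
const iconv = require('iconv-lite');
const cheerio = require('cheerio');

// Fetch a URL as raw bytes, decode it as ISO-8859-1 and load it into cheerio.
const getRequestDefault = (method) => (url) =>
  rp({
    encoding: null, // return the body as a Buffer instead of a string
    method: method,
    uri: url,
    rejectUnauthorized: false // skips TLS certificate validation
  })
    .then(html => {
      const $ = cheerio.load(
        iconv.decode(
          Buffer.from(html), 'ISO-8859-1' // Buffer.from replaces the deprecated new Buffer()
        )
      );
      return $;
    });

const getRows = () =>
  getRequestDefault('GET')(`https://www101.dcu.ie/timetables/feed.php?prog=case&per=2&week1=19&week2=30&day=7&hour=1-20&template=Studprog`)
    .then($ => {
      // Example: print the text of every table row
      $('table tbody tr')
        .toArray()
        .forEach(
          a => {
            console.log($(a).text());
          }
        );
    });

// Log request or parsing errors instead of leaving the rejection unhandled
getRows().catch(console.error);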
This prints the text of every tr row of every table on the page.
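The raw row text will probably contain a lot of blank lines and padding, so you may want to tidy it up before using it. Here is a rough sketch of how that could look. The helper names splitRowText and cleanRows are just ones I made up, the sample strings are invented (not taken from the real timetable), and it assumes each cell ends up on its own line, which depends on how the page's HTML is laid out.

```javascript
// Hypothetical helper: turn the raw text of one <tr> into a list of fields.
// Assumes each cell's text sits on its own line in the row text.
const splitRowText = (rowText) =>
  rowText
    .split(/\r?\n/)                                  // one candidate field per line
    .map(line => line.replace(/\s+/g, ' ').trim())   // collapse runs of whitespace
    .filter(line => line.length > 0);                // drop blank lines

// Hypothetical helper: clean many rows and skip rows that end up empty.
const cleanRows = (rowTexts) =>
  rowTexts.map(splitRowText).filter(fields => fields.length > 0);

// Invented sample input standing in for the $(a).text() values.
const sample = ['\n  CA123 \n  Lecture \n', '   \n \n', 'Mon\u00a0 9:00\n'];
console.log(cleanRows(sample));
```

In the scraper you could collect the $(a).text() values into an array instead of logging them and pass that array to cleanRows.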
You can use this as a starting point. Just copy the code into a .js file, install the dependencies and use: node file.js
To install the dependencies: npm install cheerio iconv-lite request request-promise
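If you would rather install one package fewer, Node's built-in Buffer can decode ISO-8859-1 on its own under the encoding name 'latin1', so the decoding step could probably be done without an extra library. This is only a small sketch: decodeLatin1 is a name I made up and the bytes are sample data, not the real page.

```javascript
// Hypothetical helper: decode ISO-8859-1 bytes with Node's built-in Buffer support.
const decodeLatin1 = (bytes) => Buffer.from(bytes).toString('latin1');

// 0xE9 is "é" in ISO-8859-1; the same lone byte is not valid UTF-8.
const sampleBytes = Buffer.from([0x43, 0x61, 0x66, 0xe9]);

console.log(decodeLatin1(sampleBytes));               // prints "Café"
console.log(sampleBytes.toString('utf8') === 'Café'); // prints false
```

This is why the request is made with encoding: null: you get the raw bytes and choose the decoding yourself instead of letting them be read as UTF-8.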
I hope it helps.