Web Scraping with Node.js
I was wondering if someone could give me an example of how to scrape multiple pages with Node.js. I have found examples online, but I want to scrape a search results page. If there is a "next" button that leads to more results, I want to move to the next page and scrape it as well.
Has anyone done something similar to this?
Thanks!
I managed to get something like this working with nightmare.js. It lets you call click('#someElement') and wait('#someElement'), and read the page content between those actions using evaluate. It only works on websites that let you navigate that way. Note that you may need a while loop driven by nightmare.exists, or a for loop driven by the page count. To get that count, use a query selector that returns all matching elements, for example document.querySelectorAll('.nextPageElement').length. Lift variables to an outer scope when you need them across steps and avoid falling into callback hell, and nightmare.js will do the job.
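A minimal sketch of the while-style loop described above, not a definitive implementation. It assumes `browser` is a Nightmare instance that has already navigated to the first results page. The function name `scrapeAllPages`, the selector options, and the `maxPages` safety cap are hypothetical names introduced for illustration.

```javascript
// Sketch: keep scraping and clicking "next" until the button is gone.
// `browser` is assumed to be a Nightmare instance already on the first page.
// `itemSelector` and `nextSelector` are hypothetical placeholders for the
// result items and the "next" button of the target site.
async function scrapeAllPages(browser, { itemSelector, nextSelector, maxPages = 50 }) {
  const results = [];

  // maxPages is a safety cap so a misbehaving site cannot loop forever.
  for (let page = 0; page < maxPages; page++) {
    // Wait until at least one result item is present in the DOM.
    await browser.wait(itemSelector);

    // evaluate() runs the function inside the page; extra arguments are
    // passed through to it, so the selector is available in page context.
    const items = await browser.evaluate((selector) => {
      return Array.from(document.querySelectorAll(selector))
        .map((el) => el.textContent.trim());
    }, itemSelector);
    results.push(...items);

    // exists() resolves to true while a "next" button is still on the page.
    const hasNext = await browser.exists(nextSelector);
    if (!hasNext) break;

    await browser.click(nextSelector);
  }

  return results;
}
```

To use it, you would create the instance with `require('nightmare')`, call `goto()` with the search URL, pass the instance to `scrapeAllPages`, and call `end()` afterwards. One caveat: if the site swaps results in place, `wait(itemSelector)` can resolve against the old page right after the click, so you may need to wait on something that only changes once the new page has loaded.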