
Recursive Facebook Page Webscraper with Selenium & Node.js

What I'm trying to do is loop through an array of Facebook page IDs and return the source code of each event page. Unfortunately, I only get the source of the last page ID in the array, repeated as many times as there are elements in the array. For example, with 3 IDs in the array I get the source of the last page 3 times.

I have already experimented with async/await, but without success.

The expected outcome would be the source of each page. Thank you for any help and examples.

    //Looping through pages
    pages.forEach(function (page) {
        //Creating URL
        let url = "https://mbasic.facebook.com/" + page + "?v=events";
        //Getting URL
        driver.get(url).then(function () {
            //Page loaded
            driver.getPageSource().then(function (result) {
                console.log(result);
            });
        });
    });
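For context, a sequential version of the same loop using async/await would look roughly like the sketch below (assuming the selenium-webdriver Node bindings and a locally installed ChromeDriver; the page IDs are placeholders). Awaiting each navigation before reading the source ensures that the source belongs to the page that was just loaded.

    const { Builder } = require("selenium-webdriver");

    //Hypothetical page IDs - replace with real ones
    const pages = ["pageid1", "pageid2", "pageid3"];

    (async function scrapeEventPages() {
        //Assumes chromedriver is installed and on the PATH
        const driver = await new Builder().forBrowser("chrome").build();
        try {
            for (const page of pages) {
                const url = "https://mbasic.facebook.com/" + page + "?v=events";
                //Wait for navigation to finish before reading the source
                await driver.get(url);
                const source = await driver.getPageSource();
                console.log(source);
            }
        } finally {
            await driver.quit();
        }
    })();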

You are facing the same issue I did when I created a scraper using Python and Selenium. Facebook has countermeasures against manual URL changes, so you cannot switch pages that way; I received the same data again and again even though it was automated. To get a good result you need access to Facebook's Graph API, which provides a complete Facebook Page object along with its pagination URL.
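For reference, a request to the Graph API events edge might look roughly like the sketch below in Node.js. The API version, the page ID, and the access token are placeholders, and the /{page-id}/events edge only returns data with a suitably permissioned token.

    const https = require("https");

    //Hypothetical values - replace with a real page ID and access token
    const pageId = "PAGE_ID";
    const accessToken = "ACCESS_TOKEN";
    const url = "https://graph.facebook.com/v12.0/" + pageId + "/events?access_token=" + accessToken;

    https.get(url, function (res) {
        let body = "";
        res.on("data", function (chunk) { body += chunk; });
        res.on("end", function () {
            const result = JSON.parse(body);
            console.log(result.data);                          //array of event objects
            console.log(result.paging && result.paging.next);  //pagination URL, if present
        });
    }).on("error", console.error);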

The second way I got it to work was to use Selenium's click automation to move on to the next page instead of typing a new URL. It won't work the way you are doing it by changing the URL; I prefer using the Graph API.
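If you stay with Selenium, the click-based approach could look roughly like the sketch below. The link text used to locate the pagination element is a hypothetical placeholder, since the markup on mbasic.facebook.com can change at any time.

    const { By } = require("selenium-webdriver");

    //Hypothetical helper: after driver.get(url), keep clicking a "See More"-style link
    async function clickThroughEvents(driver) {
        while (true) {
            const links = await driver.findElements(By.partialLinkText("See More"));
            if (links.length === 0) break;   //no further pages
            await links[0].click();          //load the next batch of events
            console.log(await driver.getPageSource());
        }
    }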
