
Parsing multiple pages of a website and counting the total items

My script simply gathers the number of reports on a page, then goes to the next page and does the same. The goal is to get the total number of reports across multiple pages.

UPDATED

var casper = require('casper').create({
    clientScripts: ["./lib/jquery-2.1.3.min.js"],
    // verbose: true,
    logLevel: "debug"
});

casper.on('remote.message', function(msg) {
    this.echo('LOG: ' + msg);
});

casper.on('page.error', function (msg, trace) {
    this.echo( 'Error: ' + msg, 'ERROR' );
});

var reportCount, newReportCount, reportPages;

casper.start("reports.html", function() {

    reportPages = this.evaluate(function() {
        return $('#table2 tbody tr td').children('a').length -1;
  });

  //first page of reports
  reportCount = this.evaluate(function() {
      return $('#table1 tbody').first().children('tr').length;
  });

  this.echo('initial count: ' + reportCount);
  this.echo('pages: ' + reportPages);

  //check if more than 1 page and add report count
  if (reportPages > 1) {
    newReportCount = this.thenOpen('reports2.html', function(){
        var newCount = this.evaluate(function(count) {
            add = count + $('#table1 tbody').first().children('tr').length;
            // console.log('new count inside: ' + add);
            return add;
        }, reportCount);
        console.log(newCount); //this shows correct new value 32
    });
    console.log(newReportCount); //this shows [object Casper]

    neoReportCount = this.thenOpen('reports3.html', function(count){
        console.log(newReportCount); //this shows [object Casper]
        //do the same count
    }, newReportCount);
  }
});

casper.run();

Here is the output in the console:

Pages: 3
First count: 15
[object Casper], currently at file:///**/reports.html
32
[object Casper], currently at file:///**/reports3.html

Yes, it is possible, but you use casper.thenOpenAndEvaluate(), which has the word "then" in it. That means the function is asynchronous: it only schedules a step and returns the casper object to enable a builder/promise pattern. The same holds for casper.thenOpen(), which is why logging its return value prints [object Casper]. So you cannot return a value from a function like this. Since it is asynchronous, the scheduled step is executed after the current step ends, which is after console.log(newCount);.

You would need to split the function, for example like this:

//check if more than 1 page and add report count
if (reportPages > 1) {
  // running total, starting with the rows already counted on the first page
  var newCount = reportCount;
  this.thenOpen('reports2.html', function(){
    // add only the rows of this page, so reportCount is not counted twice
    newCount += this.evaluate(function(){
      return $('#table1 tbody').first().children('tr').length;
    });
    console.log(newCount);
  }).thenOpen('reports3.html', function(){
    newCount += this.evaluate(function(){
      return $('#table1 tbody').first().children('tr').length;
    });
    console.log(newCount);
  }).then(function(){
    // all pages have been visited; newCount holds the total
    console.log(newCount);
  });
}

It seems like you want to loop over multiple pages. This is usually done recursively, because CasperJS is asynchronous and you don't know beforehand how many pages you need to open. I suggest you look at this question for some examples: CasperJS loop or iterate through multiple web pages?
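The recursive approach could look roughly like the sketch below. It is only an illustration under assumptions: countAllReports, countRows and findNextUrl are hypothetical helper names, and the 'a.next' selector is a placeholder for whatever link your pager actually uses. Only casper.thenOpen() and casper.evaluate() are real CasperJS calls here, and jQuery is assumed to be injected through clientScripts as in the question.

```javascript
// Runs inside the page: number of report rows in the first tbody of #table1.
function countRows() {
  return $('#table1 tbody').first().children('tr').length;
}

// Runs inside the page: URL of the next page, or null on the last page.
// ASSUMPTION: 'a.next' is a placeholder selector, adapt it to the real pager.
function findNextUrl() {
  var link = $('#table2 a.next');
  return link.length ? link.prop('href') : null;
}

// Hypothetical helper: opens firstUrl, adds up the rows, and keeps
// scheduling a new step for the next page until there is none left.
function countAllReports(casper, firstUrl, done) {
  var total = 0;

  function visit(url) {
    casper.thenOpen(url, function () {
      total += this.evaluate(countRows);
      var nextUrl = this.evaluate(findNextUrl);
      if (nextUrl) {
        visit(nextUrl); // schedules another step after the current one
      } else {
        done(total);    // last page reached, hand the total to the caller
      }
    });
  }

  visit(firstUrl);
}

// Usage inside a CasperJS script:
//   casper.start();
//   countAllReports(casper, 'reports.html', function (total) {
//     casper.echo('total reports: ' + total);
//   });
//   casper.run();
```

The total is delivered through a callback rather than a return value, for the same reason as above: the steps only run after the code that scheduled them has finished.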


 