Counting DOM elements with CasperJS' evaluate gives wrong results

I've run into trouble with CasperJS.

I need to count table rows. There are many tables that share the same structure (let's say a user table), so I run CasperJS evaluate code like this:

var table_rows1 = casper.evaluate(function(it){ return jQuery("#dResult > div:nth-child("+1+") > div > div:nth-child(4) > div:nth-child("+it+") > div:nth-child(1) > span > div").length; }, it);

Here, it is the iterator: it increases until it reaches the number of tables.

So let's say we have 3 tables, so it will be 1, 2, 3. For it = 1 there is no problem: it prints the correct number of rows for table[1]. But for the next values, 2 and 3, it just prints 1 as the number of rows. How can this happen?

This is my CasperJS snippet:

function getNumber(it){
     window.__utils__.echo("it :"+it);
     var query = "#dResult > div:nth-child("+1+") > div > div:nth-child(4) > div:nth-child("+it+") > div:nth-child(1) > span > div";
     return jQuery(query).length;
}
var table_rows1 = casper.evaluate(getNumber, 1);
var table_rows2 = casper.evaluate(getNumber, 2);
var table_rows3 = casper.evaluate(getNumber, 3);

this.echo("table rows #1 :"+table_rows1);
this.echo("table rows #2 :"+table_rows2);
this.echo("table rows #3 :"+table_rows3);

And this is the HTML that I need to scrape.

This is one table; the page contains many blocks like this:

<div class="padd-b-rates">
        <div id="showRateWSMA0511000015CL096-CL124">
            <div class="bd-rate-in">
                <div class="rth1"><b>Room Category </b></div>
                <div class="rth2"><b>Breakfast</b></div>
                <div class="rth3"><b>Total Stay</b></div>
                <div class="rth4"><b>Room Status</b></div>
                <div class="clear"></div>
            </div>
            <span id="RateWSMA0511000015CL096-CL124">
    <div class="bd-rate-row">
        <div class="rtd1"><span class="rmname-rsht-rate">DELUXE (NRF)</span><i></i></div>
        <div class="rtd2"><span>Breakfast</span><i></i></div>
        <div class="rtd3"><span><a href="javascript:sHC.ShowPrice('WSMA0511000015','WSMA140400017', 'BB','CL096-CL124');">2,174,005.00 IDR</a></span><i></i></div>
        <div class="rtd4"><span class="btn-rsht-rate" onclick="sHC.jumpToPaxdetail(this);" hotelcode="WSMA0511000015" suppliercode="CL096-CL124" droomcatg="WSMA140400017-BB" roomstatus="Y" id="bookWSMA0511000015WSMA140400017-BB-CL096-CL124"><img border="0" src="/b2b/images/result-hotels/btnAV-v3.gif"></span></div>
        <div class="clear"></div>
    </div>

    <div class="bd-rate-row">
        <div class="rtd1"><span class="rmname-rsht-rate">EXECUTIVE (NRF)</span><i></i></div>
        <div class="rtd2"><span>Breakfast</span><i></i></div>
        <div class="rtd3"><span><a href="javascript:sHC.ShowPrice('WSMA0511000015','WSMA140400018', 'BB','CL096-CL124');">2,505,557.00 IDR</a></span><i></i></div>
        <div class="rtd4"><span class="btn-rsht-rate" onclick="sHC.jumpToPaxdetail(this);" hotelcode="WSMA0511000015" suppliercode="CL096-CL124" droomcatg="WSMA140400018-BB" roomstatus="Y" id="bookWSMA0511000015WSMA140400018-BB-CL096-CL124"><img border="0" src="/b2b/images/result-hotels/btnAV-v3.gif"></span></div>
        <div class="clear"></div>
    </div>

    <div class="bd-rate-row">
        <div class="rtd1"><span class="rmname-rsht-rate">DELUXE</span><i></i></div>
        <div class="rtd2"><span>Room Only</span><i></i></div>
        <div class="rtd3"><span><a href="javascript:sHC.ShowPrice('WSMA0511000015','WSMA05110034', 'RO','CL096-CL124');">2,743,860.00 IDR</a></span><i></i></div>
        <div class="rtd4"><span class="btn-rsht-rate" onclick="sHC.jumpToPaxdetail(this);" hotelcode="WSMA0511000015" suppliercode="CL096-CL124" droomcatg="WSMA05110034-RO" roomstatus="Y" id="bookWSMA0511000015WSMA05110034-RO-CL096-CL124"><img border="0" src="/b2b/images/result-hotels/btnAV-v3.gif"></span></div>
        <div class="clear"></div>
    </div>

    <div class="bd-rate-row">
        <div class="rtd1"><span class="rmname-rsht-rate">DELUXE</span><i></i></div>
        <div class="rtd2"><span>Breakfast</span><i></i></div>
        <div class="rtd3"><span><a href="javascript:sHC.ShowPrice('WSMA0511000015','WSMA05110034', 'BB','CL096-CL124');">2,847,470.00 IDR</a></span><i></i></div>
        <div class="rtd4"><span class="btn-rsht-rate" onclick="sHC.jumpToPaxdetail(this);" hotelcode="WSMA0511000015" suppliercode="CL096-CL124" droomcatg="WSMA05110034-BB" roomstatus="Y" id="bookWSMA0511000015WSMA05110034-BB-CL096-CL124"><img border="0" src="/b2b/images/result-hotels/btnAV-v3.gif"></span></div>
        <div class="clear"></div>
    </div>
</span>
        </div>
        <div class="bd-rate-bottom">
            <div class="rsht-cxl" id="CancelPolicyWSMA0511000015CL096-CL124" onclick="sHC.rCancel('WSMA0511000015CL096-CL124','CL096-CL124');" title="Click for view cancellation policy">
                <textarea id="PolicyWSMA0511000015CL096-CL124" style="display:none"></textarea>
                Cancellation Policy         </div>
            <div class="rsht-promotion" id="FOCWSMA0511000015CL096-CL124" onmouseover="sHC.popupHotelFOC(this, 'WSMA0511000015CL096-CL124','CL096-CL124');" detail="" style="display:none;">(Special Promotion)</div>
            <div class="rsht-message" id="lyHotelMessageWSMA0511000015CL096-CL124"><blink><b><font color="red">Hotel Message</font> : </b></blink> <span id="HotelMessageWSMA0511000015CL096-CL124" style="word-wrap:break-word;">Complimentary WIFI internet access</span></div>
            <div class="clear"></div>
        </div>
    </div>

If there is another table with the same structure, the evaluate function returns just 1 or even 0. What is the problem in this case?

This is what I get when I run the same jQuery count in the Chrome developer console (Inspect Element):

jQuery("#dResult > div:nth-child("+1+") > div > div:nth-child(4) > div:nth-child("+1+") > div:nth-child(1) > span > div").length;

8

jQuery("#dResult > div:nth-child("+1+") > div > div:nth-child(4) > div:nth-child("+2+") > div:nth-child(1) > span > div").length;

15

Your selectors look fine. There are multiple things that might have happened...

Waiting

The site may not be fully loaded yet (dynamic site/SPA), which means that you're trying to access those elements too early. You could, for example, wait for the first elements to appear and then access all of them:

casper.waitFor(function check(){
    return this.evaluate(getNumber, 1) > 0;
}, function then(){
    var table_rows1 = this.evaluate(getNumber, 1);
    var table_rows2 = this.evaluate(getNumber, 2);
    ...
});

With XPath

PhantomJS has a bug with :nth-child() selectors that only appears in specific configurations. You can try XPath expressions instead.

function getNumber(it){
     var query = "//*[@id='dResult']/*["+1+"]/div/*[4]/*["+it+"]/*[1]/span/div";
     return __utils__.getElementsByXPath(query).length;
}

I use *[x] instead of div[x] because XPath takes the element name into account when selecting by position, whereas CSS :nth-child() counts all sibling elements regardless of name.
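To make the mapping from the CSS selector to the XPath expression easier to reuse, here is a minimal sketch of a helper that builds the query string for a given table. The name buildRowXPath and its two parameters are hypothetical and not part of CasperJS; the path itself is the one from the answer above.

```javascript
// Hypothetical helper (not a CasperJS API): builds the XPath used above.
// groupIndex replaces the hard-coded 1, tableIndex is the iterator "it".
function buildRowXPath(groupIndex, tableIndex) {
    return "//*[@id='dResult']" +
        "/*[" + groupIndex + "]" +   // CSS: > div:nth-child(groupIndex)
        "/div" +                     // CSS: > div
        "/*[4]" +                    // CSS: > div:nth-child(4)
        "/*[" + tableIndex + "]" +   // CSS: > div:nth-child(tableIndex)
        "/*[1]" +                    // CSS: > div:nth-child(1)
        "/span/div";                 // CSS: > span > div
}
```

Since a function passed to casper.evaluate() runs in the page context and cannot see variables from the script's scope, one option is to build the string outside and pass it in as an argument, then call __utils__.getElementsByXPath(query).length inside.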

Different mobile page

Sometimes the server delivers a different page depending on the user agent string or the viewport. PhantomJS has a default viewport size of 400x300. The page's own JavaScript may also have changed the DOM dynamically.

  • Take a screenshot ( casper.capture() ) and check whether the page looks the same as in Chrome.
  • Dump the page source with casper.debugHTML() and compare it with the Chrome version.
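If the screenshot or the dumped source differs from Chrome, one thing to try is a desktop-sized viewport and a desktop user agent. The following is only a sketch: the viewport dimensions and the user agent string are placeholder assumptions, while viewportSize and pageSettings.userAgent are regular CasperJS options.

```javascript
// Placeholder values; adjust them to match the browser you compare against.
var casperOptions = {
    viewportSize: { width: 1280, height: 800 },
    pageSettings: {
        userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) " +
            "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    }
};

// In a CasperJS script, pass the options when creating the instance:
// var casper = require("casper").create(casperOptions);
```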

Generic problems

There may be other problems with the page. Look for errors by registering handlers for the resource.error, page.error and remote.message events, and by setting casper.page.onResourceTimeout.
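A minimal sketch of how those handlers could be attached follows. The function name attachDebugListeners and the log parameter are hypothetical; the event names are the ones listed above. It assumes casper.page may not exist yet (for example before casper.start()), so the timeout callback is only set when a page object is present.

```javascript
// Hypothetical helper: wires logging handlers onto a CasperJS instance.
function attachDebugListeners(casper, log) {
    log = log || function (line) { console.log(line); };

    // A resource (script, stylesheet, XHR, ...) failed to load.
    casper.on("resource.error", function (resourceError) {
        log("resource.error: " + JSON.stringify(resourceError));
    });
    // An uncaught JavaScript error occurred in the page context.
    casper.on("page.error", function (msg, trace) {
        log("page.error: " + msg);
    });
    // The page called console.log() or similar.
    casper.on("remote.message", function (msg) {
        log("remote.message: " + msg);
    });
    // PhantomJS page callback; only set it if the page object exists.
    if (casper.page) {
        casper.page.onResourceTimeout = function (request) {
            log("resource timeout: " + JSON.stringify(request));
        };
    }
    return casper;
}
```

Calling attachDebugListeners(casper) inside the first step, after the page has been created, would also cover the timeout callback.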


 