
Auto click button and web scraping

I am trying to get some data from a web page and put it into an array (PHP or JavaScript) or a database.

The link of the page is: https://pilotweb.nas.faa.gov/PilotWeb/

My problem is that I want the script itself to click the "I Agree" button, then enter "LGGG" in the Locations field, and finally click "View NOTAMs" to get the results.

From the results I need to retrieve the names in bold and some coordinates.

I tried the instructions from the question "Auto-click button element on page load using jQuery", but they didn't work.

Any advice would be helpful!

I found the answer to my question and am posting it in case someone else faces the same problem.

All you need is PHP's cURL (Client URL Library) extension, documented at http://php.net/manual/en/book.curl.php

The code for my case is:

    define("CURL_AGENT", "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");

    $c = curl_init();
    curl_setopt_array($c, array(
        CURLOPT_HEADER         => 0,
        CURLOPT_RETURNTRANSFER => 1,  // return the response as a string instead of printing it
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_USERAGENT      => CURL_AGENT,
        CURLOPT_SSL_VERIFYHOST => 0,  // note: certificate checks are switched off here
        CURLOPT_SSL_VERIFYPEER => 0,
        CURLOPT_COOKIEFILE     => 'NULL',  // cookies are kept in a file literally named "NULL"
        CURLOPT_COOKIEJAR      => 'NULL',
        // the page I want to retrieve data from
        CURLOPT_URL            => 'https://pilotweb.nas.faa.gov/PilotWeb/',
        CURLOPT_COOKIESESSION  => 1
    ));
    // first request: load the start page so the session cookies are set
    $resp = curl_exec($c);

    // the form fields, taken from the page source
    $post = 'formatType=DOMESTIC&retrieveLocId=LGGG&reportType=REPORT&openItems=icaosHeader%2Cicaos%3AicaoHead%2Cicao%3ArightNavSec0%2CrightNavSecBorder0%3A&actionType=notamRetrievalByICAOs&submit=View+NOTAMs';

    curl_setopt_array($c, array(
        CURLOPT_POST       => 1,
        // the target page
        CURLOPT_URL        => 'https://pilotweb.nas.faa.gov/PilotWeb/notamRetrievalByICAOAction.do?method=displayByICAOs',
        CURLOPT_POSTFIELDS => $post
    ));
    // second request: submit the form and receive the results page
    $resp = curl_exec($c);
    curl_close($c);

After the above code runs, $resp holds the HTML of the results page.
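
The $post string in the PHP code is an ordinary URL-encoded form body. As a rough sketch of how it is put together, the JavaScript below rebuilds it with URLSearchParams (available in browsers and Node.js). The field names and values are copied from $post; the function name buildNotamPostBody and its locId parameter are made up for illustration.

```javascript
// Hypothetical helper: builds the form body posted to notamRetrievalByICAOAction.do.
// Field names and values are copied from the $post string in the PHP code;
// only the location identifier is turned into a parameter.
function buildNotamPostBody(locId) {
  const params = new URLSearchParams();
  params.append('formatType', 'DOMESTIC');
  params.append('retrieveLocId', locId);
  params.append('reportType', 'REPORT');
  // Decoded form of icaosHeader%2Cicaos%3A... from the original string
  params.append('openItems', 'icaosHeader,icaos:icaoHead,icao:rightNavSec0,rightNavSecBorder0:');
  params.append('actionType', 'notamRetrievalByICAOs');
  params.append('submit', 'View NOTAMs');
  // toString() applies application/x-www-form-urlencoded encoding
  // (space becomes "+", "," becomes "%2C", ":" becomes "%3A")
  return params.toString();
}
```

Calling buildNotamPostBody('LGGG') should reproduce the $post string, so changing the argument is one way to query a different location, assuming the form accepts it.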

If you are able to inject jQuery code into the page, the lines you need look something like this:

    $(function () {
        // on ready, check which page we are on
        if (window.location.href === 'https://pilotweb.nas.faa.gov/PilotWeb/') {
            // if the disclaimer cookie does not exist --> click "I Agree"
            // (the cookie may appear anywhere in document.cookie, not only at the end)
            if (document.cookie.match(/(?:^|;\s*)PILOTWEB_DISCLAIMER=true(?:;|$)/) === null) {
                $("button:contains('I Agree')").trigger('click');
            }
            // fill in the location; .val() sets the current value of a textarea
            $('textarea[name="retrieveLocId"]').val('LGGG');
            // submit the form
            $('form[action="/PilotWeb/notamRetrievalByICAOAction.do?method=displayByICAOs"] input[value="View NOTAMs"]').trigger('click');
        } else {
            // on the second page, get the results
            if (window.location.href === 'https://pilotweb.nas.faa.gov/PilotWeb/notamRetrievalByICAOAction.do?method=displayByICAOs') {
                $('[id="notamRight"] span strong').each(function (index, element) {
                    var boldValue = $(this).text();
                    // save it
                });
            }
        }
    });
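
For the coordinates mentioned in the question, one option is to run a regular expression over the text of each NOTAM. The sketch below is only a starting point: it assumes coordinates are written as degrees and minutes (optionally seconds) followed by N/S and E/W, for example 3759N02344E or 375430N 0234512E. That format is an assumption and should be checked against the actual results. The names extractCoordinates and toDecimalDegrees are hypothetical.

```javascript
// Hypothetical helper: converts degrees/minutes/seconds to signed decimal degrees.
function toDecimalDegrees(deg, min, sec, hemisphere) {
  const value = Number(deg) + Number(min) / 60 + Number(sec || 0) / 3600;
  // South and West are negative by convention
  return hemisphere === 'S' || hemisphere === 'W' ? -value : value;
}

// Hypothetical helper: finds coordinate pairs in a piece of NOTAM text.
// ASSUMED format: DDMM[SS] + N/S, optional space, DDDMM[SS] + E/W.
function extractCoordinates(text) {
  const pattern = /\b(\d{2})(\d{2})(\d{2})?([NS])\s?(\d{3})(\d{2})(\d{2})?([EW])(?![A-Z])/g;
  const results = [];
  for (const m of text.matchAll(pattern)) {
    results.push({
      raw: m[0],
      lat: toDecimalDegrees(m[1], m[2], m[3], m[4]),
      lon: toDecimalDegrees(m[5], m[6], m[7], m[8]),
    });
  }
  return results;
}
```

The text passed in could be the text of each result element, and the returned objects can then be pushed into an array or sent to the server for storage.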


 