
Facebook Login using cURL and PHP

I am trying to reach the Facebook login page using cURL. My intention is to log in to Facebook and then do some scraping. I am not using the Facebook API because of the latest restrictions: I need to scrape comments on posts, and that is impossible using only the API.

Here is some of my code:

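$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://web.facebook.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
// Return the response as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
curl_close($ch);
echo $response;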

I want this to redirect to the login page. Once the user fills in the login form, I would take the credentials, use them to get to the homepage, and start scraping.

Anyway, this is what I am getting:

I am actually running this in Chrome...

(PS: I am the author of the program.) This program logs into Facebook to send messages. The login code can be found here; the login procedure is done in the constructor function.

The gist of it is that you first need to do a GET request to obtain a cookie, a CSRF token, and some other hidden fields. Parse those out of the login form, then send them back, together with the username and password, in an application/x-www-form-urlencoded POST request to a login URL that is specific to your cookie session. That URL must also be parsed out of the HTML received in the first GET request.
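The parsing step described above can be sketched with PHP's built-in DOM extension. This is only a minimal sketch under stated assumptions, not the linked program's code: the helper name extractLoginForm, the sample HTML, and the hidden field name "token" are all hypothetical. The real field names and action URL must come from whatever the first GET request actually returns.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the action URL and every named <input>
 * (name => value) of the first <form> found in an HTML document.
 */
function extractLoginForm(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid, so suppress libxml warnings.
    @$dom->loadHTML($html);
    $form = $dom->getElementsByTagName('form')->item(0);
    if ($form === null) {
        throw new RuntimeException('no <form> found in the response');
    }
    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return ['action' => $form->getAttribute('action'), 'fields' => $fields];
}

// Made-up stand-in for the HTML returned by the first GET request.
$sampleHtml = '<html><body><form id="login_form" action="/login/?x=1" method="post">'
    . '<input type="hidden" name="token" value="abc123">'
    . '<input type="text" name="email" value="">'
    . '<input type="password" name="pass" value="">'
    . '</form></body></html>';

$login = extractLoginForm($sampleHtml);
// Keep the hidden fields as received and fill in only the credentials.
$login['fields']['email'] = 'user@example.com';
$login['fields']['pass'] = 'secret';
// This string is what would go into CURLOPT_POSTFIELDS.
$body = http_build_query($login['fields']);
```

Passing the result of http_build_query() as a string (rather than an array) to CURLOPT_POSTFIELDS is what makes cURL send the body as application/x-www-form-urlencoded.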

It is also in your best interest to use a user agent implying very poor JavaScript support (because in reality, with PHP, you have none). An example, as used by that code, is 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+' (that is, an old BlackBerry phone).
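With plain ext/curl (instead of the hhb_curl wrapper the program uses), the session setup might look roughly like this. The helper name mobileSessionOptions is hypothetical, and the exact set of options is an assumption. The user agent and accept-language header are the ones quoted in this answer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: cURL options for a cookie-keeping session that
 * presents itself as an old mobile browser.
 */
function mobileSessionOptions(): array
{
    return [
        // Return the body from curl_exec() instead of printing it.
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // An empty string enables the in-memory cookie engine, so cookies
        // from the first GET are sent again on later requests on this handle.
        CURLOPT_COOKIEFILE => '',
        CURLOPT_USERAGENT => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
        CURLOPT_HTTPHEADER => ['accept-language:en-US,en;q=0.8'],
    ];
}

$opts = mobileSessionOptions();
$ch = curl_init();
$ok = curl_setopt_array($ch, $opts);

// The first request would then be something like:
// curl_setopt($ch, CURLOPT_URL, 'https://m.facebook.com/');
// $html = curl_exec($ch);
```

Reusing the same handle for the GET and the later POST is what keeps the cookie session intact between the two requests.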

  • If you use a smartphone user agent, Facebook may sometimes ask you to install a smartphone app, and it will not let you finish logging in until you answer either yes or no. You therefore need code that detects that question and answers it when present. You can detect it with the XPath "//a[contains(@href,'/login/save-device/cancel/')]". A nice way to confirm that you managed to log in is to look for the logout button, which in XPath looks like //a[contains(@href,"/logout.php")]
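The two XPath checks above can be tried in isolation on small HTML fragments. In this sketch the function name classifyPage, its return labels, and the sample fragments are hypothetical. Only the two XPath expressions come from this answer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decide what kind of page a response is, using
 * the two XPath expressions quoted above.
 */
function classifyPage(string $html): string
{
    $dom = new DOMDocument();
    // Suppress libxml warnings caused by imperfect HTML.
    @$dom->loadHTML($html);
    $xp = new DOMXPath($dom);
    if ($xp->query("//a[contains(@href,'/login/save-device/cancel/')]")->length > 0) {
        return 'install-prompt'; // must be answered before login can finish
    }
    if ($xp->query('//a[contains(@href,"/logout.php")]')->length > 0) {
        return 'logged-in'; // a logout link is only shown to logged-in users
    }
    return 'unknown';
}

// Made-up fragments standing in for real responses.
$prompt = classifyPage('<html><body><a href="/login/save-device/cancel/?flow=x">Not now</a></body></html>');
$home   = classifyPage('<html><body><a href="/logout.php?h=abc">Log out</a></body></html>');
$other  = classifyPage('<html><body><p>hello</p></body></html>');
```

Checking for the install prompt first matters, because the login is not complete while that question is still pending.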

The most relevant part of the code is:

function __construct() {
    $this->recipientID = \MsgMe\getUserOption ( 'Facebook', 'recipientID', NULL );
    if (NULL === $this->recipientID) {
        throw new \Exception ( 'Error: cannot find [Facebook] recipientID option!' );
    }
    $this->email = \MsgMe\getUserOption ( 'Facebook', 'email', NULL );
    if (NULL === $this->email) {
        throw new \Exception ( 'Error: cannot find [Facebook] email option!' );
    }
    $this->password = \MsgMe\getUserOption ( 'Facebook', 'password', NULL );
    if (NULL === $this->password) {
        throw new \Exception ( 'Error: cannot find [Facebook] password option!' );
    }
    $this->hc = new \hhb_curl ();
    $hc = &$this->hc;
    $hc->_setComfortableOptions ();
    $hc->setopt_array ( array (
            CURLOPT_USERAGENT => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    $hc->exec ( 'https://m.facebook.com/' );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();
    $domd = new \DOMDocument ();
    @$domd->loadHTML ( $hc->getResponseBody () ); // loadHTML() is an instance method; the static call is an error on PHP 8
    $form = (\MsgMe\tools\getDOMDocumentFormInputs ( $domd, true )) ['login_form'];
    $url = $domd->getElementsByTagName ( "form" )->item ( 0 )->getAttribute ( "action" );
    $postfields = (function () use (&$form): array {
        $ret = array ();
        foreach ( $form as $input ) {
            $ret [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
        }
        return $ret;
    });
    $postfields = $postfields (); // sorry about that, eclipse can't handle IIFE syntax.
    assert ( array_key_exists ( 'email', $postfields ) );
    assert ( array_key_exists ( 'pass', $postfields ) );
    $postfields ['email'] = $this->email;
    $postfields ['pass'] = $this->password;
    $hc->setopt_array ( array (
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => http_build_query ( $postfields ),
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    // \hhb_var_dump ($postfields ) & die ();
    $hc->exec ( $url );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();

    $domd = new \DOMDocument ();
    @$domd->loadHTML ( $hc->getResponseBody () ); // instance call; the static form is an error on PHP 8
    $xp = new \DOMXPath ( $domd );
    $InstallFacebookAppRequest = $xp->query ( "//a[contains(@href,'/login/save-device/cancel/')]" );
    if ($InstallFacebookAppRequest->length > 0) {
        // not all accounts get this, but some do, not sure why, anyway, if this exist, fb is asking "ey wanna install the fb app instead of using the website?"
        // and won't let you proceed further until you say yes or no. so we say no.
        $url = 'https://m.facebook.com' . $InstallFacebookAppRequest->item ( 0 )->getAttribute ( "href" );
        $hc->exec ( $url );
        $domd = new \DOMDocument ();
        @$domd->loadHTML ( $hc->getResponseBody () ); // instance call; the static form is an error on PHP 8
        $xp = new \DOMXPath ( $domd );
    }
    unset ( $InstallFacebookAppRequest, $url );
    $urlinfo = parse_url ( $hc->getinfo ( CURLINFO_EFFECTIVE_URL ) );
    $a = $xp->query ( '//a[contains(@href,"/logout.php")]' );
    if ($a->length < 1) {
        $debuginfo = $hc->getStdErr () . $hc->getStdOut ();
        $tmp = tmpfile ();
        fwrite ( $tmp, $debuginfo );
        $debuginfourl = shell_exec ( "cat " . escapeshellarg ( stream_get_meta_data ( $tmp ) ['uri'] ) . " | pastebinit" );
        fclose ( $tmp );
        throw new \RuntimeException ( 'failed to login to facebook! apparently... cannot find the logout url!  debuginfo url: ' . $debuginfourl );
    }
    $a = $a->item ( 0 );
    $url = $urlinfo ['scheme'] . '://' . $urlinfo ['host'] . $a->getAttribute ( "href" );
    $this->logoutUrl = $url;
    // all initialized, ready to sendMessage();
}


 