Logging in to Facebook with cURL and PHP

Question  Votes: -1  Answers: 1

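I want to reach the Facebook login page with cURL. My goal is to log in to Facebook and then do some scraping. Because of the latest restrictions, I'm not using the Facebook API... I need to scrape the comments on posts, which isn't possible with the API alone.

Here is some of my code: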

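$ch = curl_init(); // create the handle before setting options on it
curl_setopt($ch, CURLOPT_URL, "https://web.facebook.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it directly
$response = curl_exec($ch);
curl_close($ch);
echo $response;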

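I expected this to redirect to the login page; then, once the user fills in the login form, I would take the credentials, use them to get redirected to the home page, and start scraping.

Anyway, this is what I get: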

php facebook curl web-crawler screen-scraping
1 answer
3 votes

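(PS: I'm the author of the program.) This program logs in to Facebook to send messages. The login code can be found here; the login procedure is done in the constructor.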

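The gist of it is that you first need to make a GET request to obtain a cookie, a CSRF token, and a few other values, which you parse out of the login form. You then send those back, together with the username and password, in an application/x-www-form-urlencoded POST request to a login URL that is specific to your cookie session; you have to parse that URL out of the HTML received in the first GET request.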
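The parsing and POST-body steps can be sketched with plain DOMDocument, without the helper libraries the answer's code relies on. This is only a minimal illustration: the function names `extractLoginForm` and `buildLoginBody` and the sample markup (including the `csrf` field) are made up for the example, and the real login form may look different.

```php
<?php
/**
 * Collect the action URL and the name => value pairs of every <input>
 * in the first <form> of $html. Hypothetical helper, not part of the answer's code.
 */
function extractLoginForm(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid, so silence libxml's parse warnings.
    @$dom->loadHTML($html);
    $form = $dom->getElementsByTagName('form')->item(0);
    if ($form === null) {
        throw new RuntimeException('no <form> found in the response');
    }
    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            // Hidden inputs carry the tokens that must be echoed back.
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return ['action' => $form->getAttribute('action'), 'fields' => $fields];
}

/** Overlay the credentials and encode the result as application/x-www-form-urlencoded. */
function buildLoginBody(array $fields, string $email, string $password): string
{
    $fields['email'] = $email;
    $fields['pass'] = $password;
    return http_build_query($fields);
}

// Stand-in for the HTML returned by the first GET request (made-up markup).
$sampleHtml = '<html><body><form id="login_form" action="/login/?session=abc" method="post">'
    . '<input type="hidden" name="csrf" value="token123">'
    . '<input type="text" name="email" value="">'
    . '<input type="password" name="pass" value="">'
    . '</form></body></html>';

$form = extractLoginForm($sampleHtml);
$body = buildLoginBody($form['fields'], 'user@example.com', 'secret');
echo $form['action'], "\n", $body, "\n";
```

In a real run, `$body` would go into CURLOPT_POSTFIELDS and the parsed action would be the URL of the POST request, sent on the same cookie session as the initial GET.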

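It's also in your best interest to use a user agent that implies poor JavaScript support (because with PHP you really don't have any). The one this code uses is 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+' (in other words, an old BlackBerry phone).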
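With the stock cURL extension, a handle set up along these lines might look like the sketch below. The function name `createMobileBrowserHandle` and the cookie file path are hypothetical, and enabling the cookie engine this way is an assumption about what is needed for the session to persist between the GET and the POST; the answer's own code delegates such defaults to its hhb_curl wrapper.

```php
<?php
/**
 * Create a cURL handle that presents itself as an old BlackBerry browser
 * and keeps cookies between requests. Hypothetical helper.
 */
function createMobileBrowserHandle(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_USERAGENT => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
        CURLOPT_HTTPHEADER => ['accept-language:en-US,en;q=0.8'],
        CURLOPT_COOKIEFILE => $cookieFile,  // enables the cookie engine and loads existing cookies
        CURLOPT_COOKIEJAR => $cookieFile,   // cookies are written here when the handle is closed or freed
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_RETURNTRANSFER => true,     // make curl_exec() return the body as a string
    ]);
    return $ch;
}

// No request is sent here; the handle is only configured.
$ch = createMobileBrowserHandle(sys_get_temp_dir() . '/fb_cookies_example.txt');
```

Reusing the same handle for the GET and the following POST keeps the cookies in memory, so the jar file mainly matters if the session should survive across script runs.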

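  • Now, if you use a smartphone user agent, you may sometimes be asked to install the smartphone app. If you run into that, the login won't complete until you answer yes or no, so you need to add code that detects the prompt and answers it if present. You can detect it with the XPath "//a[contains(@href,'/login/save-device/cancel/')]". Pro tip: a good way to confirm that you managed to log in is to look for the logout button, which in XPath looks like //a[contains(@href,"/logout.php")]

The most relevant part of the code is: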
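The two XPath checks above can be combined into one small classifier. This is a sketch only: the function name `classifyLoginResponse`, its return values, and the sample links are made up for illustration.

```php
<?php
/**
 * Classify a response body after the login POST.
 * Returns 'app_prompt', 'logged_in', or 'not_logged_in'. Hypothetical helper.
 */
function classifyLoginResponse(string $html): string
{
    if (trim($html) === '') {
        // DOMDocument::loadHTML() rejects an empty string, so bail out early.
        return 'not_logged_in';
    }
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // silence warnings about invalid markup
    $xp = new DOMXPath($dom);

    // The "install the app?" prompt must be answered before login completes.
    if ($xp->query("//a[contains(@href,'/login/save-device/cancel/')]")->length > 0) {
        return 'app_prompt';
    }
    // A logout link is a good sign that the session is authenticated.
    if ($xp->query('//a[contains(@href,"/logout.php")]')->length > 0) {
        return 'logged_in';
    }
    return 'not_logged_in';
}

echo classifyLoginResponse('<a href="/logout.php?h=abc">Log out</a>'), "\n";
```

When the result is 'app_prompt', the caller would request the cancel link's href and classify the next response again.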

function __construct() {
    $this->recipientID = \MsgMe\getUserOption ( 'Facebook', 'recipientID', NULL );
    if (NULL === $this->recipientID) {
        throw new \Exception ( 'Error: cannot find [Facebook] recipientID option!' );
    }
    $this->email = \MsgMe\getUserOption ( 'Facebook', 'email', NULL );
    if (NULL === $this->email) {
        throw new \Exception ( 'Error: cannot find [Facebook] email option!' );
    }
    $this->password = \MsgMe\getUserOption ( 'Facebook', 'password', NULL );
    if (NULL === $this->password) {
        throw new \Exception ( 'Error: cannot find [Facebook] password option!' );
    }
    $this->hc = new \hhb_curl ();
    $hc = &$this->hc;
    $hc->_setComfortableOptions ();
    $hc->setopt_array ( array (
            CURLOPT_USERAGENT => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    $hc->exec ( 'https://m.facebook.com/' );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();
    $domd = new \DOMDocument (); // loadHTML() is an instance method; calling it statically throws an Error on PHP 8
    @$domd->loadHTML ( $hc->getResponseBody () );
    $form = (\MsgMe\tools\getDOMDocumentFormInputs ( $domd, true )) ['login_form'];
    $url = $domd->getElementsByTagName ( "form" )->item ( 0 )->getAttribute ( "action" );
    $postfields = (function () use (&$form): array {
        $ret = array ();
        foreach ( $form as $input ) {
            $ret [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
        }
        return $ret;
    });
    $postfields = $postfields (); // sorry about that, eclipse can't handle IIFE syntax.
    assert ( array_key_exists ( 'email', $postfields ) );
    assert ( array_key_exists ( 'pass', $postfields ) );
    $postfields ['email'] = $this->email;
    $postfields ['pass'] = $this->password;
    $hc->setopt_array ( array (
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => http_build_query ( $postfields ),
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    // \hhb_var_dump ($postfields ) & die ();
    $hc->exec ( $url );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();

    $domd = new \DOMDocument (); // instance call; the static form fails on PHP 8
    @$domd->loadHTML ( $hc->getResponseBody () );
    $xp = new \DOMXPath ( $domd );
    $InstallFacebookAppRequest = $xp->query ( "//a[contains(@href,'/login/save-device/cancel/')]" );
    if ($InstallFacebookAppRequest->length > 0) {
        // not all accounts get this, but some do, not sure why, anyway, if this exist, fb is asking "ey wanna install the fb app instead of using the website?"
        // and won't let you proceed further until you say yes or no. so we say no.
        $url = 'https://m.facebook.com' . $InstallFacebookAppRequest->item ( 0 )->getAttribute ( "href" );
        $hc->exec ( $url );
        $domd = new \DOMDocument (); // instance call; the static form fails on PHP 8
        @$domd->loadHTML ( $hc->getResponseBody () );
        $xp = new \DOMXPath ( $domd );
    }
    unset ( $InstallFacebookAppRequest, $url );
    $urlinfo = parse_url ( $hc->getinfo ( CURLINFO_EFFECTIVE_URL ) );
    $a = $xp->query ( '//a[contains(@href,"/logout.php")]' );
    if ($a->length < 1) {
        $debuginfo = $hc->getStdErr () . $hc->getStdOut ();
        $tmp = tmpfile ();
        fwrite ( $tmp, $debuginfo );
        $debuginfourl = shell_exec ( "cat " . escapeshellarg ( stream_get_meta_data ( $tmp ) ['uri'] ) . " | pastebinit" );
        fclose ( $tmp );
        throw new \RuntimeException ( 'failed to login to facebook! apparently... cannot find the logout url!  debuginfo url: ' . $debuginfourl );
    }
    $a = $a->item ( 0 );
    $url = $urlinfo ['scheme'] . '://' . $urlinfo ['host'] . $a->getAttribute ( "href" );
    $this->logoutUrl = $url;
    // all initialized, ready to sendMessage();
}