
Log into website using curl failing

I am trying to log into a remote site using curl (before doing some data scraping).

Using the following code I am producing a cookies.txt file that has the following:

# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_www.xxx.com   FALSE   /   TRUE    0   xxxv5   h_r4hXtn-gNAilZwhvHjYdE3Vr4HewhxtGrxja57LbW03-M9MLNqZSeiW7lQ2wRT9lZypNsAiX0gS0Ev1PrvNkGLmwL3B8ZmyOUMLYbTYbSW0y_aPGrIFlEp4skDzh0GJGIGtFHisCmQjEMlu0CJr0UEw2rCT9jbjzg0IyOnFYxNffaMPo229NZWV7HDfCK5M1_y6MPNvW_Kt-h4qTy8YmqGbfBwKxB-bulV78MSXU9ZWz_DVvdu6jXfPiHwCBDMV8FFBLaXm5rqYgNzvbsq8JLe1xkTPn1PNJhyizUa-hlwB6ev8HNwIwBpzs7406l6mL3VgyrDJpay6bHNoMtjh4fLwI7KapFANhFHfn57mg4
#HttpOnly_www.xxx.com   FALSE   /   TRUE    0   ASP.NET_SessionId   txakhdi15oeqxyfq53f44dts

When I manually log into the web site the cookie names are correct. So I think I am creating the login (otherwise the cookies would not be created), but when I output

echo 'HELLO html1 = '.$html1;

I see the page telling me I have entered the wrong username and password.

Code as follows:

ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$username = 'xxx';
$password = 'xxx';
// echo 'STARTING';



//login form action url
$url="https://www.xxxx.com/Login"; 
$postinfo = "username=".$username."&password=".$password;

$cookie_file_path = "cookie.txt";

$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
//set the cookie the site has for certain features, this is optional
curl_setopt($ch, CURLOPT_COOKIE, "cookiename=0");
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS,5); // return into a variable
// curl_setopt($ch, CURLOPT_UPLOAD, true); 
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST" );
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);

// set content length
$headers[] = 'Content-length: 0';
$headers[] = 'Transfer-Encoding: chunked';
curl_setopt($ch, CURLOPT_HTTPHEADER , $headers);

$html1 = curl_exec($ch);
echo 'HELLO html1 = '.$html1;

I cannot show the site for security reasons (which may be a killer).

Can anyone point me in the right direction?

First off, this won't work: ini_set('display_startup_errors', 1); - the startup phase is already finished before the userland PHP code starts to run, so this setting is set too late. It must be set in the php.ini config file. (Not strictly true, but close enough: on Windows you can do crazy registry hacks to enable it, and you can set it with .user.ini files, etc. More info here: http://php.net/manual/en/configuration.php )

Second, an obvious error here is that you don't urlencode $username and $password in $postinfo = "username=".$username."&password=".$password; - if the username OR password contains any characters with special meanings in urlencoded format, you'll send the wrong credentials and won't get logged in (this includes &, =, +, %, spaces, and many other characters). A fixed version would look like $postinfo = "username=".urlencode($username)."&password=".urlencode($password);
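To see why the encoding matters, here is a minimal sketch with made-up credentials (the password 'p&ss=w rd' is purely illustrative). It decodes both request bodies with parse_str(), which splits a query string roughly the way a server-side form parser would; the real site's parser may differ in details.

```php
<?php
// Hypothetical credentials, chosen to contain characters that are
// special in form encoding (&, = and a space).
$username = 'alice';
$password = 'p&ss=w rd';

// Unencoded: the & inside the password starts a new field.
$broken = "username=" . $username . "&password=" . $password;
// Encoded: each value is escaped before being joined.
$fixed = "username=" . urlencode($username) . "&password=" . urlencode($password);

parse_str($broken, $brokenFields);
parse_str($fixed, $fixedFields);

var_dump($brokenFields['password'] === $password); // bool(false)
var_dump($fixedFields['password'] === $password);  // bool(true)
```

In the unencoded case the server only sees the part of the password before the first &, which would explain a "wrong username and password" page even though the credentials in the script are correct.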

Third, don't use CURLOPT_CUSTOMREQUEST for POST requests, just use CURLOPT_POST.

Fourth, your Content-length header is outright lying. The correct length is actually 'Content-length: '.strlen($postinfo) - which with your code is definitely not 0 - but you shouldn't set this header at all. curl will do it for you if you don't, and unlike you, curl won't mess up the code calculating the size, so get rid of the entire line.

Fifth, this code is also wrong: $headers[] = 'Transfer-Encoding: chunked'; your curl code here is NOT using chunked transfers, and if it were, curl would send that header automatically, so get rid of it.
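Taken together, points three to five suggest a much smaller POST setup. The following is a sketch of what that might look like, assuming the URL and field names from the question; build_login_options() is a hypothetical helper introduced here for illustration, not something curl provides.

```php
<?php
// Hypothetical helper: returns the options for a form POST, leaving
// Content-Length and Transfer-Encoding for curl to work out itself.
function build_login_options(string $url, string $postinfo, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,       // no CURLOPT_CUSTOMREQUEST needed
        CURLOPT_POSTFIELDS     => $postinfo,  // curl derives Content-Length from this
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Deliberately no CURLOPT_HTTPHEADER entry for Content-Length
        // or Transfer-Encoding.
    ];
}

$postinfo = "username=" . urlencode('xxx') . "&password=" . urlencode('xxx');
$options  = build_login_options('https://www.xxxx.com/Login', $postinfo, 'cookie.txt');

$ch = curl_init();
curl_setopt_array($ch, $options);
// $html1 = curl_exec($ch); // left commented out: the real login URL is not known here
```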

Sixth, don't just call curl_setopt. If there's an error setting any of your options, curl_setopt will return bool(false); you should watch out for such errors, use curl_error to extract the error message, and throw an exception if such an error occurs, instead of what your code is doing right now, which is silently ignoring any curl_setopt errors. Use something like function ecurl_setopt($ch,int $option, $value){if(!curl_setopt($ch,$option,$value)){throw new \RuntimeException('curl_setopt failed!: '.curl_error($ch));}}
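Laid out over several lines, the wrapper from the previous paragraph and a few calls to it might look like this. The URL is the placeholder from the question, and the three options shown are only examples.

```php
<?php
// Like curl_setopt(), but throws instead of quietly returning false.
function ecurl_setopt($ch, int $option, $value): void
{
    if (!curl_setopt($ch, $option, $value)) {
        throw new \RuntimeException('curl_setopt failed!: ' . curl_error($ch));
    }
}

$ch = curl_init();
ecurl_setopt($ch, CURLOPT_URL, 'https://www.xxxx.com/Login');
ecurl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
ecurl_setopt($ch, CURLOPT_POST, true);
```

Every curl_setopt call in the script can then be swapped for ecurl_setopt, so a rejected option stops the script at the line that caused it.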

If fixing all of these problems is not enough to log in, you're not giving us enough information to help you any further. What does the browser's HTTP login request look like? Or what is the login URL?

It is not as simple as reading the HTML page using curl. You need to supply a POST value for the submit button. If there is any JavaScript that executes before the form's ACTION script is activated, then that has to be looked at as well.
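One way to find out which values the form expects, including the submit button and any hidden fields, is to fetch the login page first and read its input elements. The sketch below assumes a made-up login page and a hypothetical helper extract_form_fields(); the real site's field names are unknown, and any values filled in by JavaScript would not show up this way.

```php
<?php
// Collect name => value for every <input> in the page's first <form>,
// including hidden fields and the submit button.
function extract_form_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $form = $doc->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return $fields;
    }
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up login page, standing in for the HTML fetched with a GET request.
$page = '<html><body><form action="/Login" method="post">'
      . '<input type="hidden" name="token" value="abc123">'
      . '<input type="text" name="username" value="">'
      . '<input type="password" name="password" value="">'
      . '<input type="submit" name="submit" value="Log in">'
      . '</form></body></html>';

$fields = extract_form_fields($page);
$fields['username'] = 'xxx';
$fields['password'] = 'xxx';
echo http_build_query($fields), "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS, reusing the same cookie file for the GET and the POST so that any session cookie set by the login page is sent back.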

Usually you get better results if you use Selenium. See http://www.seleniumhq.org/

EDIT1:

If the server is rejecting your post string, try: curl_setopt($handle, CURLOPT_POSTFIELDS, http_build_query($data));
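http_build_query() takes an array and returns an already URL-encoded string. A small sketch with made-up values in $data follows. Note that handing the array itself to CURLOPT_POSTFIELDS makes curl send multipart/form-data, while a string is sent as application/x-www-form-urlencoded, which is what an ordinary HTML login form submits.

```php
<?php
// Made-up values; the real field names come from the login form.
$data = ['username' => 'alice', 'password' => 'p&ss=w rd'];

// Encodes every key and value and joins them with &.
$body = http_build_query($data);
echo $body, "\n";

$handle = curl_init('https://www.xxxx.com/Login');
curl_setopt($handle, CURLOPT_POST, true);
curl_setopt($handle, CURLOPT_POSTFIELDS, $body);
```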

ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$username = 'xxx';
$password = 'xxx';    
//login form action url
$url="https://www.xxxx.com/Login"; 
$postinfo = array("username"=>$username,"password"=>$password);
$cookie_file_path = "cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_file_path);
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_file_path);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);
$html = curl_exec($ch);
echo $html;

The above code should work fine. If there is still an issue, check the cookie.txt file permissions.

Also, if there is hidden data that needs to be sent with the POST, you can check for it using the Firefox Live HTTP Headers plugin.
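A quick way to rule out the permissions problem is to test the path before handing it to curl. This is a sketch only; cookie_jar_is_writable() is a hypothetical helper, and building the path from __DIR__ is an assumption made to avoid depending on the current working directory.

```php
<?php
// Hypothetical helper: can the cookie jar be created or overwritten at this path?
function cookie_jar_is_writable(string $path): bool
{
    if (file_exists($path)) {
        return is_file($path) && is_writable($path);
    }
    // The file does not exist yet, so the directory must be writable instead.
    return is_writable(dirname($path));
}

// An absolute path avoids surprises when the working directory differs.
$cookie_file_path = __DIR__ . '/cookie.txt';
if (!cookie_jar_is_writable($cookie_file_path)) {
    echo "Cannot write to $cookie_file_path\n";
}
```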
