HTTrack: is it possible to use cookies?
I want to download a page from a URL, which is easy enough. But on the first page I have to log in, as I normally would from a regular browser.
HTTrack keeps downloading the first page, since it can't use my cookies or log in.
Is there any way for me to get around this?
This question was asked in 2013, so I don't know whether HTTrack supported cookies back then (I guess not), but it definitely does now.
Instructions:
Example of a cookies.txt file:
www.httrack.com TRUE / FALSE 1999999999 foo bar
www.example.com TRUE /folder FALSE 1999999999 JSESSIONID xxx1234
www.example.com TRUE /hello FALSE 1999999999 JSESSIONID yyy1234
Reference: http://httrack.kauler.com/help/Cookies
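For anyone who wants to check or generate such lines with a script, here is a minimal Python sketch. It assumes the usual Netscape cookie-file layout of seven fields per line (domain, subdomain flag, path, secure flag, expiry as a Unix timestamp, name, value), normally separated by tabs. The `CookieLine`, `parse_cookie_line` and `format_cookie_line` names are made up for illustration and are not part of HTTrack.

```python
from typing import NamedTuple


class CookieLine(NamedTuple):
    """One entry of a Netscape-style cookie file (field names are illustrative)."""
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int  # Unix timestamp
    name: str
    value: str


def parse_cookie_line(line: str) -> CookieLine:
    """Parse one cookie line; accepts tabs or spaces between the seven fields."""
    fields = line.strip().split(None, 6)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    domain, subdomains, path, secure, expires, name, value = fields
    return CookieLine(domain, subdomains == "TRUE", path,
                      secure == "TRUE", int(expires), name, value)


def format_cookie_line(cookie: CookieLine) -> str:
    """Render the entry with tab separators."""
    return "\t".join([
        cookie.domain,
        "TRUE" if cookie.include_subdomains else "FALSE",
        cookie.path,
        "TRUE" if cookie.secure else "FALSE",
        str(cookie.expires),
        cookie.name,
        cookie.value,
    ])


if __name__ == "__main__":
    entry = parse_cookie_line(
        "www.example.com TRUE /folder FALSE 1999999999 JSESSIONID xxx1234")
    print(format_cookie_line(entry))
```

Round-tripping a line through both helpers is a quick way to catch a missing field or a separator that got mangled when the file was copied from a web page. Note that this sketch rejects cookies with an empty value.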
I tried to get cookies.txt working with my site using the Windows version, but couldn't. I just added the cookie as a header instead, and that worked.
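The comment does not say how the header was passed to HTTrack, so the following Python sketch only shows how a `Cookie:` header line could be assembled from cookies.txt-style entries. The `cookie_header` function is hypothetical, and its host and path matching is deliberately simplified (exact host match, plain path prefix) compared with what a real browser does.

```python
def cookie_header(cookie_lines, host, request_path):
    """Build a 'Cookie: ...' header line from Netscape-style cookie lines.

    Simplified matching: the cookie domain must equal the host (ignoring a
    leading dot) and the cookie path must be a prefix of the request path.
    Returns an empty string when no cookie applies.
    """
    pairs = []
    for line in cookie_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split(None, 6)
        if len(fields) != 7:
            continue  # skip malformed lines
        domain, _flag, path, _secure, _expires, name, value = fields
        if domain.lstrip(".") == host and request_path.startswith(path):
            pairs.append(f"{name}={value}")
    return ("Cookie: " + "; ".join(pairs)) if pairs else ""


if __name__ == "__main__":
    lines = [
        "www.httrack.com TRUE / FALSE 1999999999 foo bar",
        "www.example.com TRUE /folder FALSE 1999999999 JSESSIONID xxx1234",
        "www.example.com TRUE /hello FALSE 1999999999 JSESSIONID yyy1234",
    ]
    print(cookie_header(lines, "www.example.com", "/folder/page.html"))
```

The resulting line can then be supplied wherever your HTTrack build accepts extra request headers.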
Try using cURL in PHP:
http://php.net/manual/en/book.curl.php
There are wrappers for this, like:
http://semlabs.co.uk/journal/object-oriented-curl-class-with-multi-threading
Download the class from the link above, then use options such as:
<?php
require_once( 'CURL.php' ); // Change this to whatever file the class above ships in

$curl = new CURL();
$curl->retry = 2;

$opts = array(
    CURLOPT_USERAGENT => 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20091020 Linux Mint/8 (Helena) Firefox/3.5.3',
    CURLOPT_COOKIEFILE => 'fb.tmp',   // read cookies from this file
    CURLOPT_COOKIEJAR => 'fb.tmp',    // write cookies back to the same file
    CURLOPT_FOLLOWLOCATION => 1,      // follow redirects
    CURLOPT_RETURNTRANSFER => 1,      // return the response instead of printing it
    CURLOPT_SSL_VERIFYHOST => 0,      // disables TLS host name checking (insecure)
    CURLOPT_SSL_VERIFYPEER => 0,      // disables TLS certificate checking (insecure)
    CURLOPT_TIMEOUT => 20             // seconds
);

$post_data = array( ); // Put your login POST data here
$opts[CURLOPT_POSTFIELDS] = http_build_query( $post_data );

$curl->addSession( 'https://www.facebook.com/messages', $opts );
$result = $curl->exec();
$curl->clear();

print_r( $result );
Note that sometimes you need to load a page first, so that the site sets a cookie, before it will let you log in.
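That two-step flow (load a page to pick up a cookie, then post the login form with the same cookie jar) can be sketched in Python with only the standard library. This is an illustration, not a translation of the PHP class above: `make_session` and `prime_then_login` are hypothetical helper names, and the URLs and form field names in the usage example are placeholders you would replace with the real ones from your site's login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def prime_then_login(opener, landing_url, login_url, form_fields, timeout=20):
    """GET landing_url so the server can set cookies, then POST the login form.

    Returns the body of the login response as text.
    """
    # Step 1: a plain GET; any Set-Cookie headers end up in the opener's jar.
    with opener.open(landing_url, timeout=timeout) as response:
        response.read()
    # Step 2: POST the form; cookies from step 1 are sent back automatically.
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    with opener.open(login_url, data=body, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder URLs and field names; adjust to the site you are logging in to.
    opener, jar = make_session()
    page = prime_then_login(
        opener,
        "https://www.example.com/",
        "https://www.example.com/login",
        {"username": "me", "password": "secret"},
    )
    print([cookie.name for cookie in jar])
```

After a successful login the jar holds the session cookies, which could then be written out in cookies.txt form for HTTrack.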