
HTTrack possible using cookies

I want to download a page from a URL, which is easy enough. But the first page requires a login, which I normally complete in a regular browser. HTTrack only downloads that first page, because it can't use my cookies or log in.

Is there any way for me to get around this?

This question was asked in 2013, so I don't know whether HTTrack supported cookies back then (I guess not), but now it definitely does.

Instructions:

  1. Log in to your website using Firefox or Chrome, then look up the login cookie.
  2. Inside the HTTrack folder where you are downloading your website, there should be a file named cookies.txt; if not, create one.
  3. Copy the cookie information from your browser into this file. You might also have to copy your user agent from your browser into the HTTrack config.

Example of a cookies.txt:

www.httrack.com TRUE    /       FALSE   1999999999  foo bar
www.example.com TRUE    /folder FALSE   1999999999  JSESSIONID  xxx1234
www.example.com TRUE    /hello  FALSE   1999999999  JSESSIONID  yyy1234

Reference: http://httrack.kauler.com/help/Cookies
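If typing the file by hand is error-prone, the lines can be generated. The following is a minimal Python sketch, not part of HTTrack. It assumes the columns follow the common Netscape cookie-file layout (domain, subdomain flag, path, secure flag, expiry, name, value) and that the fields are separated by tab characters. The helper names `cookie_line` and `write_cookie_file` are made up for illustration.

```python
# Sketch: build cookies.txt lines in the layout of the example above.
# Assumption: fields are tab-separated, as in the Netscape cookie-file convention.

def cookie_line(domain, path, name, value, secure=False,
                expires=1999999999, include_subdomains=True):
    """Return one cookies.txt line with its seven fields joined by tabs."""
    fields = [
        domain,
        "TRUE" if include_subdomains else "FALSE",
        path,
        "TRUE" if secure else "FALSE",
        str(expires),  # expiry as a Unix timestamp
        name,
        value,
    ]
    return "\t".join(fields)


def write_cookie_file(filename, cookies):
    """Write (domain, path, name, value) tuples to filename, one per line."""
    with open(filename, "w", encoding="utf-8", newline="\n") as handle:
        for domain, path, name, value in cookies:
            handle.write(cookie_line(domain, path, name, value) + "\n")


if __name__ == "__main__":
    # Values copied from the example above; replace with your own cookie.
    write_cookie_file("cookies.txt", [
        ("www.example.com", "/folder", "JSESSIONID", "xxx1234"),
    ])
```

The name and value come from your browser's cookie viewer; the expiry only needs to be far enough in the future that the cookie is not treated as expired.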

I tried to get cookies.txt working with my website using the Windows version, but couldn't. I simply added the cookie as a header instead, and then it worked.
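For the header approach, the value to add is an ordinary `Cookie:` request header. Here is a small Python sketch of how such a line could be assembled from the cookie names and values shown in the browser. The function name `cookie_header` is hypothetical. How the header is handed to HTTrack depends on the version: the command-line help lists an additional-header option (`-%X`, long form `--headers`), but treat that as something to verify against your own installation.

```python
# Sketch: format browser cookies as one Cookie request header line.

def cookie_header(cookies):
    """Join name/value pairs into a single 'Cookie: a=1; b=2' line."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return f"Cookie: {pairs}"


if __name__ == "__main__":
    # Placeholder value taken from the cookies.txt example in this thread.
    print(cookie_header({"JSESSIONID": "xxx1234"}))
```

A header set this way is sent unchanged with every request, so it stops working once the session cookie expires on the server.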

Try using cURL in PHP:

http://php.net/manual/en/book.curl.php

There are wrappers for this, like:

http://semlabs.co.uk/journal/object-oriented-curl-class-with-multi-threading

Use options such as:

EDIT: More specific, not tested

Download the class from:

http://semlabs.co.uk/journal/object-oriented-curl-class-with-multi-threading

<?php
require_once( 'CURL.php' ); // Change this to whatever the class file from the link above is called
$curl = new CURL();
$curl->retry = 2;
$opts = array(
    CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20091020 Linux Mint/8 (Helena) Firefox/3.5.3',
    CURLOPT_COOKIEFILE     => 'fb.tmp', // read previously stored cookies from this file
    CURLOPT_COOKIEJAR      => 'fb.tmp', // save cookies received during the session to the same file
    CURLOPT_FOLLOWLOCATION => 1,        // follow redirects after login
    CURLOPT_RETURNTRANSFER => 1,        // return the response body instead of printing it
    CURLOPT_SSL_VERIFYHOST => 0,        // certificate checks are disabled here; enable them outside of testing
    CURLOPT_SSL_VERIFYPEER => 0,
    CURLOPT_TIMEOUT        => 20
);
$post_data = array(  ); // Put your login POST data here
$opts[CURLOPT_POSTFIELDS] = http_build_query( $post_data );
$curl->addSession( 'https://www.facebook.com/messages', $opts );
$result = $curl->exec();
$curl->clear();
print_r( $result );

Note that sometimes you need to load a page first, to set a cookie, before the site will let you log in.
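That two-step flow (fetch a page to receive the cookie, then post the login form with the same cookie store) can be sketched as follows. This is an illustration in Python's standard library rather than the PHP wrapper above, and it is untested against any real site. The function names, URLs, and form field names are all placeholders.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session(user_agent):
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def login(opener, login_page_url, login_post_url, form_fields):
    """GET the login page first so cookies get set, then POST the form."""
    opener.open(login_page_url, timeout=20).close()
    body = urllib.parse.urlencode(form_fields).encode("ascii")
    with opener.open(login_post_url, data=body, timeout=20) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical URLs and field names; substitute the real login form's values.
    opener, jar = make_session("Mozilla/5.0")
    page = login(opener,
                 "https://www.example.com/login",
                 "https://www.example.com/login",
                 {"username": "me", "password": "secret"})
    print(len(page), "bytes received;", len(jar), "cookies stored")
```

Because both requests go through the same opener, any cookie set by the first response is sent automatically with the login POST.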

