
Log into ThinkStock using cURL

I'm wondering whether it is possible to log into ThinkStock using cURL. My probing and research so far have yielded the following:

  • The auth page https://secure.thinkstockphotos.com/Authentication/SignIn sets cookies that the login requires. If you clear the domain's cookies after visiting that page and then try to log in, you are kept on that page.

  • If I try to get those cookies in one request, without sending the login POST data, the next request is met with a CAPTCHA.

I'm not sure what to do at this point. I have tried matching the browser's headers as closely as possible (referers, a Firefox 4 user agent), but that doesn't seem to be enough. Has anybody run into this before with ThinkStock?

Disclaimer: this is not a scraper. The goal is to build up a database of images from our previous downloads, so that our agency's designers can access them more easily without unnecessarily using up our download limit for the day.

My code for the initial connection:

function __construct(
    $username = THINKSTOCK_USERNAME, $password = THINKSTOCK_PASSWORD){

    // create the cookie file
    $this->cookieFile = tempnam('./cookies', '');
    //******** Connect to host and try and get cookies required for auth *******//
    // initialise the CURL connection
    $curlRequest = curl_init('https://secure.thinkstockphotos.com/Authentication/SignIn');

    // do not attempt to verify the SSL certificate
    curl_setopt($curlRequest, CURLOPT_SSL_VERIFYPEER, false);

    // follow redirects (the redirect cap below is left disabled)
    curl_setopt($curlRequest, CURLOPT_FOLLOWLOCATION, true);
    // curl_setopt($curlRequest, CURLOPT_MAXREDIRS, 5);
    curl_setopt($curlRequest, CURLOPT_REFERER, 'https://secure.thinkstockphotos.com/Authentication/SignIn');
    // time out after five seconds, both for connecting and for the whole request
    curl_setopt($curlRequest, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curlRequest, CURLOPT_TIMEOUT, 5);

    // set the cookie file
    curl_setopt($curlRequest, CURLOPT_COOKIEJAR, $this->cookieFile);

    // send a Firefox 4 user agent (set on $curlRequest, the handle created above)
    curl_setopt($curlRequest, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1');


    // do not output the returned data
    curl_setopt($curlRequest, CURLOPT_RETURNTRANSFER, true);

    // execute the curl request and close the connection
    $response = curl_exec($curlRequest);
    var_export($response);
    curl_close($curlRequest);


    //******** Connect to host with cookies and gogogo auth *******//
    // initialise the CURL connection
    $curlRequest = curl_init('https://secure.thinkstockphotos.com/Authentication/SignIn');

    // do not attempt to verify the SSL certificate
    curl_setopt($curlRequest, CURLOPT_SSL_VERIFYPEER, false);

    // follow redirects (the redirect cap below is left disabled)
    curl_setopt($curlRequest, CURLOPT_FOLLOWLOCATION, true);
    // curl_setopt($curlRequest, CURLOPT_MAXREDIRS, 5);
    curl_setopt($curlRequest, CURLOPT_REFERER, 'https://secure.thinkstockphotos.com/Authentication/SignIn');
    // time out after five seconds, both for connecting and for the whole request
    curl_setopt($curlRequest, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curlRequest, CURLOPT_TIMEOUT, 5);

    // set the cookie file
    curl_setopt($curlRequest, CURLOPT_COOKIEFILE, $this->cookieFile);
    curl_setopt($curlRequest, CURLOPT_COOKIEJAR, $this->cookieFile);

    // send this request as a POST
    curl_setopt($curlRequest, CURLOPT_POST, true);
    // send a Firefox 4 user agent (set on $curlRequest, the handle created above)
    curl_setopt($curlRequest, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1');


    // do not output the returned data
    curl_setopt($curlRequest, CURLOPT_RETURNTRANSFER, true);


    curl_setopt(
        $curlRequest,
        CURLOPT_POSTFIELDS,
        array(
          'userName' => $username,
          'password' => $password,
          'returnUrl' => '/',
          'SignInButton' => ''
        ));

    // execute the curl request and close the connection
    $response = curl_exec($curlRequest);
    curl_close($curlRequest);
    var_export($response);
    // if the log in attempt failed, throw an exception
    if (strpos($response, 'https://secure.thinkstockphotos.com/Authentication/SignIn') !== false){
      throw new Exception('Incorrect log-in details');
    }

}
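
Both requests share most of their options, so one way to keep them in sync is to build the option set once and apply it to each handle. The following is a minimal sketch under stated assumptions: the helper names `buildSharedCurlOptions` and `encodeLoginFields` are hypothetical and not part of the original class. It also encodes the login fields with `http_build_query`, because PHP sends an array passed to `CURLOPT_POSTFIELDS` as multipart/form-data, whereas a URL-encoded string is sent as application/x-www-form-urlencoded, which is what an ordinary HTML form posts by default. Whether either change affects the CAPTCHA behaviour described above is untested.

```php
<?php
// Hypothetical helpers for illustration; names are assumptions, not the asker's API.

/**
 * Options common to the cookie-priming request and the login request.
 * SSL verification is left at cURL's default (enabled) in this sketch.
 */
function buildSharedCurlOptions(string $cookieFile): array
{
    return [
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_CONNECTTIMEOUT => 5,           // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 5,           // seconds allowed for the whole request
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $cookieFile, // read previously stored cookies
        CURLOPT_COOKIEJAR      => $cookieFile, // store cookies when the handle is released
        CURLOPT_REFERER        => 'https://secure.thinkstockphotos.com/Authentication/SignIn',
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
    ];
}

/**
 * Encode the login form fields as an application/x-www-form-urlencoded string.
 * Field names are the ones used in the question's code.
 */
function encodeLoginFields(string $username, string $password): string
{
    return http_build_query([
        'userName'     => $username,
        'password'     => $password,
        'returnUrl'    => '/',
        'SignInButton' => '',
    ]);
}

// Usage: configure a handle for the login request without executing it.
$cookieFile  = tempnam(sys_get_temp_dir(), 'ck');
$curlRequest = curl_init('https://secure.thinkstockphotos.com/Authentication/SignIn');
curl_setopt_array($curlRequest, buildSharedCurlOptions($cookieFile));
// Passing a string body makes cURL send a URL-encoded POST.
curl_setopt($curlRequest, CURLOPT_POSTFIELDS, encodeLoginFields('someUser', 'somePassword'));
```

Because every option is set through the one array on the one handle, a misspelled handle variable cannot silently drop an option such as the user agent.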

ThinkStock's terms of service prohibit what you appear to be trying to do. Even if you have rights to the images because you have paid to download them, the terms do not allow site access via automation. I think you need to contact them for permission, and then ask whether there is an alternative login page you can use. Given the nature of what you want to do, they may relent, since you're not trying to rip off the site, just make it easier to access the content you have licensed.

Use of the Site

This Site and the Thinkstock Content are intended for customers of Thinkstock. You may not use this Site or the Thinkstock Content for any purpose not related to your business with Thinkstock. You are specifically prohibited from: (a) downloading, copying, or re-transmitting any or all of the Site or the Thinkstock Content without, or in violation of, a written license or agreement with Thinkstock; (b) using any data mining, robots or similar data gathering or extraction methods; (c) manipulating or otherwise displaying the Site or the Thinkstock Content by using framing or similar navigational technology; (d) registering, subscribing, unsubscribing, or attempting to register, subscribe, or unsubscribe any party for any Thinkstock product or service if you are not expressly authorized by such party to do so; and (e) using the Site or the Thinkstock Content other than for its intended purpose. Such unauthorized use may also violate applicable laws including without limitation copyright and trademark laws, the laws of privacy and publicity, and applicable communications regulations and statutes.
