
How to Implement Workaround for CURLOPT_RETURNTRANSFER

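I'm using this code http://martinsikora.com/how-to-steal-google-s-did-you-mean-feature to add a "Did you mean" feature to my search, but my hosting provider has open_basedir set and won't let me change it. I've seen several workarounds, but I don't know how to implement them in his code.

Here is the snippet: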

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agents[rand(0, count($agents) - 1)]);
$data = curl_exec($ch);
curl_close($ch);

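What a weird and annoying (and basically undocumented) restriction, the restriction being that CURLOPT_FOLLOWLOCATION cannot be activated while open_basedir is set, especially when it can be worked around so easily. All you need to do is check for a 3xx response code and then read the contents of the Location: header to find the URL you are being redirected to.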
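As a rough illustration of that check, here is a minimal sketch that works on a status code and a raw header string, with no network access. The function name find_redirect_target and the sample headers are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: returns the Location value of a 3xx response, or
// null when the response is not a redirect or carries no Location header.
function find_redirect_target($statusCode, $rawHeaders) {
  if ($statusCode < 300 || $statusCode >= 400) {
    return null;
  }
  // Header names are case-insensitive; the "m" flag lets ^ and $ match
  // at the start and end of each header line
  if (preg_match('/^location:\s*(.+?)$/mi', $rawHeaders, $matches)) {
    // trim() removes the trailing \r left over from the \r\n line ending
    return trim($matches[1]);
  }
  return null;
}

// Made-up sample response headers
$headers = "HTTP/1.1 302 Found\r\nLocation: http://example.com/next\r\nContent-Length: 0";
var_dump(find_redirect_target(302, $headers));
var_dump(find_redirect_target(200, $headers));
```

The first call should yield the redirect target and the second null, because a 200 response is not a redirect even if a Location header happens to be present.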

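This is not quite as trivial as I would like, because many applications violate the RFC and don't send a full URL in the Location header, so you need to do some fudging to work out the correct location.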
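To see why the fudging is needed, this small sketch runs parse_url() on a full URL and on a path-only value. The helper name location_is_full_url and the sample values are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: true when the Location value carries its own scheme
// and can therefore be requested as-is.
function location_is_full_url($location) {
  $parts = parse_url($location);
  return is_array($parts) && !empty($parts['scheme']);
}

// Made-up Location values a server might send
$locations = array(
  'http://example.com/a/b?x=1', // full URL
  '/a/b?x=1',                   // path and query only
);

foreach ($locations as $location) {
  echo $location, ' => ',
    (location_is_full_url($location) ? 'full URL' : 'needs the old scheme and host'),
    "\n";
}
```

A path-only value has no scheme or host of its own, so those parts have to be borrowed from the URL that was just requested.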

Something like this should work for your code (untested):

function make_url_from_location ($oldUrl, $locationHeader) {
  // Takes a URL and a location header and calculates the new URL
  // This takes relative paths (which are non-RFC compliant) into
  // account, which most browsers will do. Requires $oldUrl to be
  // a full URL

  // First check if $locationHeader is a full URL
  $newParts = parse_url($locationHeader);
  if (!empty($newParts['scheme'])) {
    return $locationHeader;
  }

  // We need a path at a minimum. If not, return the old URL.
  if (empty($newParts['path'])) {
    return $oldUrl;
  }

  // Construct the start of the new URL
  $oldParts = parse_url($oldUrl);
  $newUrl = $oldParts['scheme'].'://'.$oldParts['host'];
  if (!empty($oldParts['port'])) {
    $newUrl .= ':'.$oldParts['port'];
  }

  // Build new path
  if ($newParts['path'][0] == '/') {
    $newUrl .= $newParts['path'];
  } else {
    // Relative path: resolve against the directory of the old path, i.e.
    // everything up to and including its last slash ("../" is not collapsed)
    $oldPath = empty($oldParts['path']) ? '/' : $oldParts['path'];
    $newUrl .= substr($oldPath, 0, strrpos($oldPath, '/') + 1).$newParts['path'];
  }

  // Add a query string
  if (!empty($newParts['query'])) {
    $newUrl .= '?'.$newParts['query'];
  }

  return $newUrl;

}

$maxRedirects = 30;

$redirectCount = 0;
$complete = FALSE;

// Get user agent string once at start - array_rand() is tidier
// For these purposes, a single static string will probably be fine
$userAgent = $agents[array_rand($agents)];

do {

  // Make the request
  $ch = curl_init($url);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
  curl_setopt($ch, CURLOPT_TIMEOUT, 10);
  curl_setopt($ch, CURLOPT_HEADER, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
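  $data = curl_exec($ch);
  if ($data === false) {
    // Transport-level failure (DNS, timeout, ...): inspect curl_error($ch)
    break;
  }

  // Get the response code (easier than parsing it from the headers)
  $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

  // Split header from body using the header size reported by cURL, which
  // stays correct even when the response has no body
  $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
  $header = substr($data, 0, $headerSize);
  $data = (string) substr($data, $headerSize);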

  // Check for redirect response codes
  if ($responseCode >= 300 && $responseCode < 400) {

    if (preg_match('/^location:\s*(.+?)$/mi', $header, $matches)) {
      // Get URL for next iteration
      $url = make_url_from_location(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), trim($matches[1]));
    } else {
      // This is an error. If you get here the response was a 3xx code and
      // no location header was set. You need to handle that error here.
      $complete = TRUE;
    }

  } else {

    // Non redirect response code (might still be an error code though!)
    $complete = TRUE;

  }

// Loop until no more redirects or $maxRedirects is reached
} while (!$complete && ++$redirectCount < $maxRedirects);

// Perform whatever error checking is necessary here

// Close the cURL handle
curl_close($ch);
