PHP cURL redirects to localhost

I'm trying to log in to an external webpage using a PHP script with cURL. I'm new to cURL, so I feel like I'm missing a lot of pieces. I found a few examples and modified them to allow access to HTTPS pages. Ultimately, my goal is to log in to the page and download a .csv file by following a specified link once logged in. So far, what I have is a script that tests logging in to the page, shown below:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.websiteurl.com/login');
curl_setopt($ch, CURLOPT_POSTFIELDS,'Email='.urlencode($login_email).'&Password='.urlencode($login_pass).'&submit=1');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
curl_setopt($ch, CURLOPT_REFERER, "https://www.websiteurl.com/login");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$output = curl_exec($ch);

I have a few questions. First, is there a reason this does not redirect on its own? The only way for me to view the contents of the page is to

echo $output

even though CURLOPT_RETURNTRANSFER and CURLOPT_FOLLOWLOCATION are both set to true.

Second, the URL for the page stays at "localhost/folderName/test.php" instead of changing to the actual website. Can anyone explain why this happens? Because the script doesn't actually redirect to a logged-in webpage, I can't seem to do anything that I need to do.

Does my issue have to do with cookies? My cookies.txt file is in the same folder as my .php script (I'm using WampServer, by the way). Should it be located elsewhere?

Once I'm able to fix these two issues, it seems that all I need to do is redirect to the link that starts the download process for the .csv file.

Thanks for any help, much appreciated!

Answering part of your question:

From http://php.net/manual/en/function.curl-setopt.php :

CURLOPT_RETURNTRANSFER: TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.

In other words, it is doing exactly what you described. It returns the response as a string, and you echo it to see it. As requested...
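To make that behaviour concrete, here is a minimal sketch. The helper name `fetch_body()` is my own invention for illustration and is not part of the cURL extension; it simply wraps the option quoted above.

```php
<?php
// Hypothetical helper (not part of the cURL extension): fetch a URL and
// hand the response body back to the caller instead of printing it.
function fetch_body(string $url): string
{
    $ch = curl_init($url);
    // With RETURNTRANSFER on, curl_exec() returns the body as a string
    // (or false on failure). With it off, cURL prints the body itself
    // and curl_exec() only returns true or false.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // FOLLOWLOCATION makes cURL itself request the redirect target;
    // it never changes what the visitor's browser is showing.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}
```

Nothing reaches the browser until the calling script does something like `echo fetch_body($url);`, which matches what you are seeing.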

----- EDIT -----

As for the second part of your question: when I change the last three lines of the script to

$output = curl_exec($ch);
header('Location:'.$website);
echo $output;

The address of the page as displayed changes to $website, which in my case is the variable I use to store my equivalent of your 'https://www.websiteurl.com/login'.

I am not sure that is what you wanted to do, because I'm not sure I understand what your next steps are. If you were getting redirected by the login site, wouldn't the new address be part of the header that is returned? And wouldn't you need to extract that address in order to perform the next request (wget or whatever) to download the file you wanted to get?

To do so, you need to set CURLOPT_HEADER to TRUE.

You can get the URL where you ended up from

$last_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); 

(see "cURL, get redirect url to a variable").
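As a rough sketch of how that call fits into a full request, the hypothetical helper below (the name `fetch_with_final_url()` and the shape of the returned array are my own choices) follows redirects and then reports where the request ended up:

```php
<?php
// Hypothetical helper: follow redirects, then report the final URL.
function fetch_with_final_url(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); // guard against redirect loops

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // curl_getinfo() must be called after curl_exec() on the same handle.
    return [
        'body'      => $body,
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),
    ];
}
```

If the login succeeded and the site redirected, `final_url` should differ from the login URL and `redirects` should be greater than zero.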

The same link also has a useful script for completely parsing the header information (returned when CURLOPT_HEADER == true). It's in the answer by nico limpica.
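This is not the script from that answer, just a minimal sketch of the same idea: pulling every Location header out of the raw header text. The function name `extract_locations()` and the `/account` URL in the sample are made up for illustration.

```php
<?php
// Hypothetical helper: collect every Location header from raw response
// headers. With CURLOPT_HEADER and CURLOPT_FOLLOWLOCATION both on, the
// output contains one header block per hop, so there may be several.
function extract_locations(string $rawHeaders): array
{
    $locations = [];
    // Header lines end in CRLF, but tolerate a bare LF as well.
    foreach (preg_split('/\r?\n/', $rawHeaders) as $line) {
        // Header names are case-insensitive.
        if (preg_match('/^Location:\s*(.+)$/i', $line, $m)) {
            $locations[] = trim($m[1]);
        }
    }
    return $locations;
}

// Made-up sample: a 302 response followed by the final 200 response.
$raw = "HTTP/1.1 302 Found\r\nLocation: https://www.websiteurl.com/account\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$locations = extract_locations($raw);
// $locations is ['https://www.websiteurl.com/account']
```

To separate the header text from the body in a real response, `curl_getinfo($ch, CURLINFO_HEADER_SIZE)` gives the total length of the headers received.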

Bottom line: cURL gets the information that your browser would have received if you had pointed it to a particular site; that doesn't mean your browser behaves as though you pointed it to that site...
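Since the browser is not involved, the script has to make the follow-up request itself. Below is a sketch of what I understand your end goal to be: log in, then fetch the .csv on the same handle so the session cookies carry over. Every parameter is a placeholder, the function name is hypothetical, and I am assuming the site uses a plain form POST and cookie-based sessions, which may not hold for your site.

```php
<?php
// Sketch only: the URLs, form fields, and paths are supplied by the caller
// and are assumptions about the target site, not known values.
function login_and_download_csv(
    string $loginUrl,
    array $credentials,
    string $csvUrl,
    string $savePath,
    string $cookieFile
): bool {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here on close
        CURLOPT_COOKIEFILE     => $cookieFile, // enables the cookie engine
    ]);

    // Step 1: POST the login form.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($credentials));
    if (curl_exec($ch) === false) {
        return false;
    }

    // Step 2: GET the CSV on the same handle; the cookies set during
    // login are still held in memory and are sent automatically.
    curl_setopt($ch, CURLOPT_URL, $csvUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back from POST to GET
    $csv = curl_exec($ch);
    if ($csv === false) {
        return false;
    }

    return file_put_contents($savePath, $csv) !== false;
}
```

When calling it, an absolute cookie path such as `__DIR__ . '/cookie.txt'` avoids any doubt about which working directory the web server uses. The sketch leaves certificate verification at its defaults rather than disabling it.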
