
Get the “redirected to” page

I am trying to parse the home page of a site, but it is only accessible through a redirect from another page, so I can only get the HTML of the redirecting page.

How can I get the HTML of the "redirected to" page?

Here is an example: I can fetch a page a.html, which redirects to b.html when opened in a browser. I want to parse b.html, but opening b.html directly requires POST parameters that a.html sends to b.html during the redirect.

Edit: note that the "redirected to" page has a relative path, so I do the following:

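// Insert the site root right after the opening quote of the redirect target.
// No trailing slash on the root, since the target ("/home") already starts with one.
$needle = 'window.location = "';
$pos = strpos($result, $needle);
$res = ($pos === false) ? $result : substr_replace($result, "https://thecompletepath", $pos + strlen($needle), 0);
echo $res;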

The redirect is done through JavaScript code, as follows:

<script type="text/javascript" charset="utf-8">
    escapeIfModal();
    LoadingScreen.start();
    window.location = "/home";
</script>

You can use cURL to follow redirects as a browser would.

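$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "a.html");
curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP "Location:" redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
$a = curl_exec($ch); // response headers (one set per hop) followed by the HTML of the final page
$finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); // last URL reached, e.g. "b.html"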
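Note that CURLOPT_FOLLOWLOCATION only follows HTTP "Location:" headers. cURL does not run JavaScript, so a `window.location = "/home"` redirect like the one in the question would likely have to be handled by hand. The following is a minimal sketch of that step, not a definitive implementation: `extractJsRedirect` and `resolveAgainst` are hypothetical helper names, the regular expression assumes the target is a plain quoted string, and the resolver handles only absolute URLs, root-relative paths, and simple same-directory paths (no `../` or protocol-relative URLs).

```php
<?php
// Hypothetical helper: pull the target out of a `window.location = "..."`
// (or `window.location.href = "..."`) assignment in the fetched HTML.
function extractJsRedirect(string $html): ?string
{
    $pattern = '/window\.location(?:\.href)?\s*=\s*(["\'])(.*?)\1/';
    if (preg_match($pattern, $html, $m)) {
        return $m[2]; // the text between the quotes
    }
    return null; // no JavaScript redirect found
}

// Hypothetical helper: turn the extracted target into an absolute URL,
// using the URL of the page it was found on as the base.
function resolveAgainst(string $pageUrl, string $target): string
{
    if (preg_match('#^https?://#i', $target)) {
        return $target; // already absolute
    }
    $parts  = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    if ($target !== '' && $target[0] === '/') {
        return $origin . $target; // root-relative, e.g. "/home"
    }
    // Same-directory path: swap out the last segment of the page path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $target;
}

// Example with the script block from the question (example.com is a placeholder).
$html   = '<script>window.location = "/home";</script>';
$target = extractJsRedirect($html);                                   // "/home"
$next   = resolveAgainst('https://www.example.com/a.html', $target);
echo $next, "\n"; // https://www.example.com/home
```

The resulting URL can then be requested with the same cURL handle. If the site relies on session cookies set by the first page, enabling cURL's cookie engine (for example `curl_setopt($ch, CURLOPT_COOKIEFILE, "")`) before the first request may be what lets the second request succeed; whether that is enough for this particular site is an assumption.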

Using file_get_contents:

$context = stream_context_create(
    array(
        'http' => array(
            'follow_location' => true
        )
    )
);

$html = file_get_contents('http://www.example.com/a.html', false, $context);
