PHP - manage curl output

Following up on my last question: I sent a request to a website and it returned output. However, the output is the full web page. I only want to extract some data from the curl output, such as links.

$url = 'http://site1.com/index.php';
$data = ["send" => "Test"];
$ch = curl_init($url);

curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);
var_dump($response);

This code works, but the output contains the full web page. I just want to extract some data and show only that in the output.

You can use preg_match_all with a carefully constructed pattern. This modified version of your code should give you a list of all the image URLs in the HTML that you retrieve:

        $url = 'http://site1.com/index.php';
        $data = ["send" => "Test"];
        $ch = curl_init($url);

        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        $response = curl_exec($ch);
        curl_close($ch);


        $matches = NULL;
        $pattern = '/<img[^>]+src=\"([^"]+)"[^>]*>/';
        $img_count = preg_match_all($pattern, $response, $matches);

        var_dump($matches[1]);

If you'd like to fetch all the links instead, you can change $pattern to this:

        $pattern = '/<a[^>]+href=\"([^"]+)"[^>]*>/';

I have tested this code on an HTML file that looks like this:

<html>
<body>
<div><img src="WANT-THIS"></div>
</body>
</html>

And the output is this:

array(1) {
  [0]=>
  string(9) "WANT-THIS"
}
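
To try both patterns without making a request, the sketch below runs them against an inline string. The sample HTML is invented for illustration and stands in for the curl response; the backslash before the first quote in the original patterns is optional inside a single-quoted PHP string, so it is dropped here.

```php
<?php
// Hypothetical sample standing in for the curl response.
$response = '<html><body>'
    . '<a href="/page1">One</a>'
    . '<div><img src="a.png" alt="A"></div>'
    . '<a class="nav" href="/page2">Two</a>'
    . '</body></html>';

// Same patterns as in the answer, minus the redundant backslash.
$img_pattern  = '/<img[^>]+src="([^"]+)"[^>]*>/';
$link_pattern = '/<a[^>]+href="([^"]+)"[^>]*>/';

preg_match_all($img_pattern, $response, $img_matches);
preg_match_all($link_pattern, $response, $link_matches);

// Index 1 holds the first capture group, i.e. the attribute values.
$images = $img_matches[1];
$links  = $link_matches[1];

print_r($images);
print_r($links);
```

Index 0 of each match array holds the full tag text, which is why the answer dumps `$matches[1]` rather than `$matches`.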

EDIT 2: In response to additional questions from the OP, I have also tried the script on this HTML file:

<html>
<body>
<div1>CODE</div><div2>CODE</div><div3>CODE</div><div4>CODE</div><div5>CODE</div><div6>CODE</div><img src="IMAGE">
</body>
</html>

And it produces this result:

array(1) {
  [0]=>
  string(5) "IMAGE"
}
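
Both test files use lowercase tags and double-quoted attributes, which is all the patterns above match. If the page you fetch mixes quote styles or letter case, a variant along these lines may help. It is a sketch that is not part of the tested script, and the sample HTML is made up.

```php
<?php
// Hypothetical HTML mixing quote styles and letter case.
$html = "<IMG SRC='a.png'><img class=\"x\" src=\"b.png\">";

// Group 1 captures the opening quote, \1 requires the same closing quote,
// and the i flag makes tag and attribute names case-insensitive.
$pattern = '/<img[^>]+src=(["\'])(.*?)\1[^>]*>/i';
preg_match_all($pattern, $html, $matches);

// Group 2 holds the URL, because group 1 is now the quote character.
$urls = $matches[2];
print_r($urls);
```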

If this doesn't solve your problem, you'll need to provide additional detail: an example URL that you are fetching, some HTML that you want to search, or more information about how to identify the image you want to grab. Does it have a special id? Is it always the first image? The second image? Is there any characteristic by which we know which image to grab?
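
If the image can be identified by an id or by its position, a DOM parser may be easier to steer than a regular expression. The sketch below is an alternative to the preg_match_all approach, not the method tested above. It assumes PHP's DOM extension is available, and the sample HTML and the id "logo" are invented for illustration.

```php
<?php
// Hypothetical HTML standing in for the curl response.
$response = '<html><body>'
    . '<img src="first.png">'
    . '<img src="second.png">'
    . '<img id="logo" src="logo.png">'
    . '</body></html>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($response);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the image by id (the id "logo" is an assumption).
$by_id = $xpath->query('//img[@id="logo"]');
$logo_src = $by_id->length > 0 ? $by_id->item(0)->getAttribute('src') : null;

// Select the second image in document order (item() is zero-based).
$all = $xpath->query('//img');
$second_src = $all->length > 1 ? $all->item(1)->getAttribute('src') : null;

var_dump($logo_src, $second_src);
```

Both lookups fall back to null when nothing matches, so the caller can tell a missing image apart from an empty src.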
