
How to submit and retrieve data with cURL?

I am trying cURL in PHP for the first time because I want to scrape results from this page: http://www.lldj.com/pastresult.php. The site has posted weekly lotto results since 2002 and has a simple submit form (Date).

The form has a submit button (name = Button, value = Submit) and a select drop-down (name = Draw, options 1 to 1097, each representing a draw number).

I could go through it manually, but I thought I would use a simple script to make it easier, since I am also interested in learning how to submit data with PHP/cURL and retrieve the results.

I have used PHP DOM for scraping and I am comfortable with its syntax. I wonder whether I should use cURL and DOM together, or whether this can be achieved with cURL alone.

What I have so far:

include'dom.php';
$post_data['draw'] = '1097';
$post_data['button'] = 'Submit';

//traverse array and prepare data for posting (key1=value1)
foreach ( $post_data as $key => $value) {
$post_items[] = $key . '=' . $value;
}

//create the final string to be posted using implode()
$post_string = implode ('&', $post_items);

//create cURL connection
$curl_connection = 
curl_init('http://www.lldj.com/pastresult.php');

//set options
curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curl_connection, CURLOPT_USERAGENT, 
curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 1);
//set data to be posted
curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string);

 //perform our request
$result = curl_exec($curl_connection);

 //show information regarding the request
 print_r(curl_getinfo($curl_connection));
echo curl_errno($curl_connection) . '-' . 
            curl_error($curl_connection);

After submitting the data, scrape the result:

$t = $curl_connection->find('table',0); // ?? usually refers to the file_get_contents variable
$data = $t->find('tr');

foreach($data as $n) {
$tds = $n->find('td');

$dataRows = array();

$dataRows['num'] =  $tds[0]->find('img',0)->href;

var_dump($dataRows);
}

Can someone point out whether this is correct? How can I automatically increment the submitted value and repeat the process (e.g., submit draw = 1, then draw = 2, etc.)? Thanks.

<?php
include 'dom.php'; // assumed to be Simple HTML DOM, as in the question

for ($i = 1; $i < 5000; $i++) {

    $post_data = array();
    $post_data['draw'] = $i; // changes on every pass: 1, 2, 3, 4, ...
    $post_data['button'] = 'Submit';

    // traverse array and prepare data for posting (key1=value1)
    $post_items = array(); // reset, otherwise pairs from earlier passes pile up
    foreach ($post_data as $key => $value) {
        $post_items[] = $key . '=' . $value;
    }

    // create the final string to be posted using implode()
    $post_string = implode('&', $post_items);

    // create cURL connection
    $curl_connection = curl_init('http://www.lldj.com/pastresult.php');

    // set options
    curl_setopt($curl_connection, CURLOPT_CONNECTTIMEOUT, 30);
    // the user-agent value is cut off in the original post; 'Mozilla/5.0' is only a placeholder
    curl_setopt($curl_connection, CURLOPT_USERAGENT, 'Mozilla/5.0');
    curl_setopt($curl_connection, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl_connection, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl_connection, CURLOPT_FOLLOWLOCATION, 1);
    // set data to be posted
    curl_setopt($curl_connection, CURLOPT_POSTFIELDS, $post_string);

    // perform our request
    $result = curl_exec($curl_connection);

    // show information regarding the request
    print_r(curl_getinfo($curl_connection));
    echo curl_errno($curl_connection) . '-' . curl_error($curl_connection);

    // release the handle before the next pass
    curl_close($curl_connection);

    if ($result === false) {
        continue; // request failed, nothing to parse
    }

    // start your scrape: parse the response body, not the cURL handle
    $html = str_get_html($result);
    $t = $html ? $html->find('table', 0) : null;
    if (!$t) {
        continue; // no table in this response
    }
    $data = $t->find('tr');

    foreach ($data as $n) {
        $tds = $n->find('td');
        if (!isset($tds[0])) {
            continue; // row without cells, e.g. a header row
        }

        $img = $tds[0]->find('img', 0);

        $dataRows = array();
        $dataRows['num'] = $img ? $img->src : null; // <img> carries src, not href

        var_dump($dataRows);
    }

} // for loop ends here
?>

This is just a skeleton showing how to call cURL repeatedly with a changing ID; adapt it to your needs.

Also, make sure to clear your variables after fetching the data, for example:

...
curl_close($ch);
unset($fields_string);
...
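
As a rough sketch of the same idea, the loop can also reuse a single cURL handle and let http_build_query() build the POST body instead of concatenating strings by hand. The function names build_post_body() and fetch_draws() are made up for illustration; the field names draw and button come from the question, and the one-second pause between requests is only an assumption about what is polite to the server.

```php
<?php
// Build the URL-encoded POST body for one draw number.
// The field names 'draw' and 'button' are taken from the form in the question.
function build_post_body(int $draw): string
{
    return http_build_query(['draw' => $draw, 'button' => 'Submit'], '', '&');
}

// Hypothetical helper: POST every draw number from $first to $last to $url
// on one shared handle and return the response bodies keyed by draw number.
// A failed request is stored as null.
function fetch_draws(string $url, int $first, int $last): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);

    $pages = [];
    for ($draw = $first; $draw <= $last; $draw++) {
        // Setting CURLOPT_POSTFIELDS makes the request a POST.
        curl_setopt($ch, CURLOPT_POSTFIELDS, build_post_body($draw));
        $body = curl_exec($ch);
        $pages[$draw] = ($body === false) ? null : $body;
        sleep(1); // pause between requests (assumed to be a reasonable delay)
    }

    curl_close($ch);
    return $pages;
}

// Usage (performs real network requests, so it is left commented out):
// $pages = fetch_draws('http://www.lldj.com/pastresult.php', 1, 3);
```

Because $pages is rebuilt inside the function and the handle is closed at the end, nothing from one run leaks into the next.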

Load the page

For a plain GET request, the simplest way to grab remote content is file_get_contents() (it requires allow_url_fopen to be enabled). Use:

$html = file_get_contents('http://www.lldj.com/pastresult.php');

That's it.
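
Since the question is about submitting a form, here is a minimal sketch of how file_get_contents() could send a POST request through a stream context. The helper name make_post_context() is hypothetical, and the field names are the ones given in the question; whether the site accepts them exactly like this has not been verified.

```php
<?php
// Build a stream context that makes file_get_contents() send a
// URL-encoded POST request with the given form fields.
function make_post_context(array $fields)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => http_build_query($fields, '', '&'),
            'timeout' => 30,
        ],
    ]);
}

// Usage (performs a real network request, so it is left commented out):
// $context = make_post_context(['draw' => 1097, 'button' => 'Submit']);
// $html = file_get_contents('http://www.lldj.com/pastresult.php', false, $context);
```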


Get content from the page

To get content from the page you will usually use DOMDocument and DOMXPath :

$doc = new DOMDocument();
@$doc->loadHTML($html);
$selector = new DOMXpath($doc);

// xpath query
$result = $selector->query('YOUR QUERY');
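
To make the query step concrete, here is a small sketch that pulls the src attribute of every image in the first table of a page. The function name extract_image_sources() and the sample markup are invented for illustration; the real page layout may differ, so the XPath expression would likely need adjusting.

```php
<?php
// Return the src attribute of every <img> found in a table cell
// of the first <table> in $html.
function extract_image_sources(string $html): array
{
    libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $sources = [];
    foreach ($xpath->query('(//table)[1]//td//img/@src') as $attr) {
        $sources[] = $attr->nodeValue;
    }
    return $sources;
}

// Made-up sample markup standing in for a fetched result page.
$sample = '<html><body><table><tr>'
        . '<td><img src="ball_07.gif"></td>'
        . '<td><img src="ball_21.gif"></td>'
        . '</tr></table></body></html>';

print_r(extract_image_sources($sample));
```

Using libxml_use_internal_errors() is an alternative to the @ operator for silencing warnings about malformed HTML.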

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 