Retrieving Barometric and Other Climate Data Using simple_html_dom.php

I want to periodically (about once a day) collect the barometric pressure reading for various US weather stations. Using simple_html_dom.php I can scrape an entire page from this site (for example, https://www.localconditions.com/weather-alliance-nebraska/69301/). However, I don't know how to narrow the result down to just the barometric pressure reading, which in this case is "30.26".

Here's the code that grabs all the HTML. Obviously, the find('Barometer') call isn't working.

<?php
// example of how to use a basic selector to retrieve HTML contents
include('simple_html_dom.php');

// get DOM from URL or file
$html = file_get_html('https://www.localconditions.com/weather-alliance-nebraska/69301/');

// print every <strong> element, separated by horizontal rules
foreach($html->find('strong') as $e)
    echo $e->outertext . '<HR>';

// attempt to find the barometer reading (this is the call that does not work)
$element = $html->find("Barometer");

echo $e->outertext . '<br>';

// extract text from HTML
echo $html->plaintext;
?>

Any advice on how to parse this?

Thanks!

As mentioned by @bato3 in his comment, queries like this are far better handled with XPath. Unfortunately, neither DOMDocument nor SimpleXML (which I usually use to parse XML/HTML) could digest the HTML of this site, at least not when I tried. So we have to do it with simple_html_dom and resort to (somewhat inelegant) CSS selectors and string manipulation:

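// $html is the object returned by file_get_html() in the question
$dest = $html->find("//div[class='col-sm-6 col-md-6'] > p:has(> strong)");
foreach ($dest as $e) {
    $target = $e->innertext;
    // keep only the paragraph that mentions the barometer
    if (strpos($target, "Barometer") !== false) {
        // split on double spaces; the reading is the third piece
        $pressure = explode("  ", $target);
        echo $pressure[2];
    }
}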

Output:

30.25 inHg.
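
The explode() call above depends on the exact whitespace in the page's markup, so it may break if the spacing changes. A slightly more tolerant sketch, assuming the paragraph text looks roughly like "Barometer 30.25 inHg." (an assumption based on the output above, not checked against the live page), pulls the number out with a regular expression instead. The helper name extractPressure and the sample fragment are hypothetical.

```php
<?php
// Hypothetical helper: return the first decimal number that follows the
// word "Barometer" in a fragment of HTML or text, or null if none is found.
// Assumes the reading is written as digits, a dot, then digits (e.g. 30.25).
function extractPressure(string $fragment): ?float
{
    // Strip tags such as <strong> so only the visible text remains.
    $text = strip_tags($fragment);

    // Match "Barometer", skip any non-digit characters, capture the number.
    if (preg_match('/Barometer\D*(\d+\.\d+)/i', $text, $m)) {
        return (float) $m[1];
    }
    return null; // no reading found
}

// Sample fragment made up for illustration; the real markup may differ.
$sample = '<strong>Barometer</strong>  30.25 inHg.';
var_dump(extractPressure($sample)); // float(30.25)
```

Inside the foreach loop, this could replace the explode() step by calling extractPressure($e->innertext) and checking the result for null. Returning a float rather than the raw string also makes it easier to store daily readings for comparison.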
