
Regular expression in PHP to search for a particular set of data

I want to extract a paragraph from my website. There are more than 20 paragraph tags on the index page. The key difference is that the style18 class is used once and the style19 class three times in each tag. I want to find the paragraph by the content of its style18 span, e.g. "the main content".


<p class="margin">
    <span class="style18">*the main content*</span>
      » <a href="https://example1.html">
        somthing</a>

        <span class="style19">[somthing]</span>
         » <a href="https://example1.html">Town</a>

         <span class="style19">[somthing]</span>
          » <a href="https://example1.html">somthing</a>

    <span class="style19">[somthing]</span> »
    <a href="https://www.example.html">somthing</a>

    <span class="style19">[somthing]</span>

</p>

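<?php
  // Fetch the page; file_get_contents() returns false on failure.
  $data = file_get_contents('https://www.example.net/index.php');
  if ($data === false) {
      exit('Could not fetch the page.');
  }

  // Extract the page title, falling back to an empty string if there is none.
  $title = preg_match('/<title>([^<]+)<\/title>/i', $data, $matches) ? $matches[1] : '';

  // Try to match the whole paragraph line by line, anchored on the style18 span.
  $found = preg_match('/(<p)\s.+\n.+(style18).+Single\sTrack(.+)\n(.+)\n(.+)\n(.+)\n.+(style19).+\n(.+)\n(.+)\n.+(style19).+\n(.+)\n(.+)\n.+(style19).+\n(.+)\n(.+)\n.+(style19).+\n\n<\/p>/i', $data, $matches);

  // Group 1 is only the literal "<p"; group 0 holds the whole matched paragraph.
  $paragraph = $found ? $matches[0] : '';

  echo $title."<br>\n";
  echo $paragraph;
  ?>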

Welcome to the community @Aerro.

If I understood your question correctly, you want to extract the inner content of a span that is surrounded by other spans, according to certain rules. While this could easily break your fingers with regular expressions, a tree or graph query language such as XPath would be a good approach to solving it.

Have a look at, e.g., http://php.net/manual/en/simplexmlelement.xpath.php
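
As a rough illustration of the XPath idea, here is a minimal sketch. It uses DOMDocument and DOMXPath rather than SimpleXML, on the assumption that the page is ordinary HTML and not well-formed XML. The helper name extractParagraphs, the sample markup and the search text are all made up for the example, and the query assumes the style18 span is a direct child of the paragraph, as in the question.

```php
<?php
// Sample markup modelled on the paragraph in the question (trimmed, ASCII only).
$html = <<<'HTML'
<html><body>
<p class="margin">
    <span class="style18">Other content</span>
    <span class="style19">[x]</span>
</p>
<p class="margin">
    <span class="style18">the main content</span>
    <a href="https://example1.html">Town</a>
    <span class="style19">[a]</span>
    <span class="style19">[b]</span>
    <span class="style19">[c]</span>
</p>
</body></html>
HTML;

/**
 * Hypothetical helper: find every <p> whose style18 span contains $needle
 * and return the style18 text plus the style19 texts of that paragraph.
 * Assumes $needle contains no double quote (XPath 1.0 cannot escape it).
 */
function extractParagraphs(string $html, string $needle): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = '//p[span[@class="style18" and contains(., "' . $needle . '")]]';

    $results = [];
    foreach ($xpath->query($query) as $p) {
        $labels = [];
        // Relative query: only the style19 spans inside this paragraph.
        foreach ($xpath->query('span[@class="style19"]', $p) as $span) {
            $labels[] = trim($span->textContent);
        }
        $results[] = [
            'main'   => trim($xpath->evaluate('string(span[@class="style18"])', $p)),
            'labels' => $labels,
        ];
    }
    return $results;
}

$found = extractParagraphs($html, 'main content');
print_r($found);
```

For the live page, the string returned by file_get_contents() could be passed in place of $html. Unlike the line-by-line pattern, this does not depend on where the line breaks fall inside the paragraph.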
