
Extract text from string in PHP

I want to extract any text or string between <p><b> and <div id="t" class="t">. Here is my example, which I can't get to work:

$st = '<p><b>Auburn</b> is a city in <a href="/my/id/ala" title="auburn">Lee County</a>, <a href="/my/Alabama" title="Alabama">Alabama</a>, <a href="/my/ph" title="PH">United States</a>. It is the largest city in eastern Alabama with a 2012 population of 56,908.<sup id="test" class="test"><a href="#tst"><span>[</span>2<span>]</span></a></sup> It is a principal city of the <a href="/my/tst" title="Auburn-Opelika Metropolitan Area" class="cs">Auburn-Opelika Metropolitan Area</a>. The <a href="/my/st" title="Auburn-Opelika, AL MSA" class="vf">Auburn-Opelika, AL MSA</a> with a population of 140,247, along with the <a href="/myu/sc" title="Columbus, GA-AL MSA" class="Xd">Columbus, GA-AL MSA</a> and <a href="/my/fd" title="Tuskegee, Alabama">Tuskegee, Alabama</a>, comprises the greater <a href="/my/cdA" title="Columbus-Auburn-Opelika, GA-AL CSA" class="se">Columbus-Auburn-Opelika, GA-AL CSA</a>, a region home to 456,564 residents.</p>
<p>Auburn is a <a href="/my/te" title="College town">college town</a> and is the home of <a href="/my/As" title="Auburn University">Auburn University</a>. Auburn has been marked in recent years by rapid growth, and is currently the fastest growing metropolitan area in Alabama and the nineteenth-fastest growing metro area in the United States since 1990.<sup class="fd" style="white-space:nowrap;">[<i><a href="/my/d" title="fda"><span title="fad (August 2011)">citation needed</span></a></i>]</sup> U.S. News ranked Auburn among its top ten list of best places to live in United States for the year 2009.<sup id="d3" class="f"><a href="3"><span>[</span>3<span>]</span></a></sup> The city`s unofficial nickname is “The Loveliest Village On The Plains,” taken from a line in the poem <i><a href="/my/da" title="The Deserted Village">The Deserted Village</a></i> by <a href="/my/fs" title="Oliver Goldsmith">Oliver Goldsmith</a>: “Sweet Auburn! loveliest village of the plain...”<sup id="ds" class="dsa"><a href="dd"><span>[</span>4<span>]</span></a></sup></p>
<div id="t" class="t">';

preg_match_all('/<p><b>(.*?)<div id="t" class="t">/U', $st, $output);
$result = $output[0];
print_r($output);
echo $result;

There's no need for a regular expression here, since we're working with literal strings. Just use strpos with an offset:

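<?php
    // Return the text between $searchStart and $searchEnd, searching from $offset.
    function str_between($string, $searchStart, $searchEnd, $offset = 0) {
        $startPosition = strpos($string, $searchStart, $offset);
        if ($startPosition !== false) {
            // The wanted text begins right after the start marker.
            $contentStart = $startPosition + strlen($searchStart);
            // Look for the end marker only after the start marker.
            $endPosition = strpos($string, $searchEnd, $contentStart);
            if ($endPosition !== false) {
                // substr() takes a length, not an end position.
                return substr($string, $contentStart, $endPosition - $contentStart);
            }
            // No end marker: return everything after the start marker.
            return substr($string, $contentStart);
        }
        // No start marker: return the input unchanged.
        return $string;
    }

    var_dump(str_between($st, '<p><b>', '<div id="t" class="t">'));
?>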
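If the markers can occur more than once, the offset argument of strpos lets the search resume where the previous match ended. The following is only a sketch of that idea: `str_between_all` is a hypothetical helper name, not part of the answer above, and it assumes both markers are non-empty strings.

```php
<?php
// Hypothetical helper (not from the original answer): collect every
// substring found between $searchStart and $searchEnd.
// Assumes both markers are non-empty.
function str_between_all(string $string, string $searchStart, string $searchEnd): array
{
    $results = [];
    $offset = 0;
    $startLength = strlen($searchStart);

    while (($startPosition = strpos($string, $searchStart, $offset)) !== false) {
        // The wanted text begins right after the start marker.
        $contentStart = $startPosition + $startLength;
        $endPosition = strpos($string, $searchEnd, $contentStart);
        if ($endPosition === false) {
            break; // no closing marker left, stop searching
        }
        $results[] = substr($string, $contentStart, $endPosition - $contentStart);
        // Resume the search right after the closing marker.
        $offset = $endPosition + strlen($searchEnd);
    }

    return $results;
}

print_r(str_between_all('x<b>one</b> y <b>two</b>', '<b>', '</b>'));
```

For the sample string this collects "one" and "two". An opening marker with no closing marker after it is skipped here, which differs from str_between above, where the rest of the string is returned in that case.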


If you would still rather use a regex than h2ooooooo's answer, a small change to your regular expression will help:

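The s modifier (PCRE_DOTALL) makes . match newline characters as well. Your $st contains newlines, and without this modifier . stops matching at the first line break, so the pattern never reaches the closing <div id="t" class="t">.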

Use the following:

preg_match_all('/<p><b>(.*?)<div id="t" class="t">/sU', $st, $output);
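To see the effect of the s modifier in isolation, here is a minimal sketch. The `$st` below is a shortened stand-in for the string in the question, not the original data, and the sketch uses preg_match rather than preg_match_all because only one match is expected. It also leaves out the U modifier: U (PCRE_UNGREEDY) inverts greediness, so together with it `.*?` becomes greedy. With a single closing marker, as in the question, the result is the same either way.

```php
<?php
// Shortened stand-in for the question's $st; note the newlines between paragraphs.
$st = "<p><b>Auburn</b> is a city.</p>\n<p>Auburn is a college town.</p>\n<div id=\"t\" class=\"t\">";

$pattern = '<p><b>(.*?)<div id="t" class="t">';

// Without "s", "." cannot cross the newline, so nothing matches.
$withoutS = preg_match('/' . $pattern . '/', $st, $m1);

// With "s" (PCRE_DOTALL), "." also matches newlines.
$withS = preg_match('/' . $pattern . '/s', $st, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)

// $m2[0] is the whole match including both markers;
// $m2[1] is only the captured text between them.
echo $m2[1];
```

With preg_match_all, as in the question, `$output[0]` is an array of full matches, so `echo $result` prints only "Array". The captured text of the first match would be in `$output[1][0]`.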
