PHP Regex match all HTML tags

Question

I am reading the contents of an HTML page to extract some details. I'm searching for every occurrence of a string that appears within a tag, and I want to read just that string.

Example:

<a href="http://www.example.com/search?la=en&q=javascript">javascript</a>
<a href="http://www.example.com/search?la=en&q=PHP">PHP</a>

I just want to read the TEXT of every tag whose href attribute contains this ( http://www.example.com/search?la=en&q= ).

Any idea?

Answer 1

SimpleHtmlDom example (isn't it pretty?):

// Requires the Simple HTML DOM library (simple_html_dom.php)
// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all links
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
    echo $element->plaintext; // the link text: this is what you want
}

Answer 2

If the HTML page you're reading is very regular (for instance, machine-generated according to predictable patterns), something like this would work:

// preg_match_all finds every occurrence; the link texts end up in $matches[1]
preg_match_all('|<a\s+href="http://www\.example\.com/search\?la=en&q=(\w+)"\s*>\1</a>|', $page, $matches);

But if it gets any more complicated than that, regular expressions probably won't be enough for the job - you'd be better off using a full HTML parser to extract the links and check them one-by-one to find the text you want.
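For what it's worth, here is a minimal sketch of the parser route using PHP's built-in DOM extension rather than a third-party library. The sample markup, the variable names, and the helper linkTextsWithPrefix() are made up for illustration, and the sample assumes the ampersand in each href is written as &amp; in the page source, so treat it as a starting point rather than a finished solution.

```php
<?php
// Sample input modelled on the question; the third link should be ignored.
$page = <<<'HTML'
<a href="http://www.example.com/search?la=en&amp;q=javascript">javascript</a>
<a href="http://www.example.com/search?la=en&amp;q=PHP">PHP</a>
<a href="http://www.example.com/about">About</a>
HTML;

// Hypothetical helper: returns the text of every <a> whose href starts with $prefix.
function linkTextsWithPrefix(string $html, string $prefix): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not spam output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        // getAttribute() returns the decoded value, so &amp; arrives as &.
        if (strpos($link->getAttribute('href'), $prefix) === 0) {
            $texts[] = trim($link->textContent);
        }
    }
    return $texts;
}

$texts = linkTextsWithPrefix($page, 'http://www.example.com/search?la=en&q=');
print_r($texts);
```

Checking the href as a plain string prefix keeps the matching logic out of regex territory entirely, so attribute order, extra attributes, or whitespace inside the tag no longer matter.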

solution1
4 ACCEPTED 2009-08-17 08:43:07

solution2
0 2009-08-17 08:44:59