
Split an HTML string at its link tags

I have this HTML code in a string:

  Hello world
  <img src="mypicture.png" />
  <p>Some text in a tag</p>
  <a href="http://www.google.fr">Link to google</a> Some Text <a href="http://www.yahoo.fr">Link to yahoo</a> End of line
  <p>Some text in a tag</p>
  <a attribute="some value" href="http://www.apple.com">Link to apple</a>
  Some text

I want to convert this string into this array:

  0 => Hello world
  <img src="mypicture.png" />
  <p>Some text in a tag</p>
  <a href="

  1 => http://www.google.fr

  2 => ">Link to google</a> Some Text <a href="

  3 => http://www.yahoo.fr

  4 => ">Link to yahoo</a> End of line
  <p>Some text in a tag</p>
  <a attribute="some value" href="

  5 => http://www.apple.com

  6 => ">Link to apple</a>
  Some text

I have tried this regexp. It works fine for extracting the links, but I can't work out how to build my array from it:

  <a (.*?)href=(.*?)\"(.+?)\"(.*?)>

You can add a leading group to your pattern that captures everything before each link (the backslashes are doubled here, as they would be inside a string literal):

  ([\\W\\w]*?)(?:(<a .*?href=.*?\\")(.+?)(?=\\")|$)

  • Capture every character, lazily, until either...
  • a link is found:
    • capture the tag up to and including the opening quote of the href value (basically your pattern)
    • capture the characters up to the next quote (the URL)
  • or the end of the text is reached

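Then you just need to step through the matches in order. For each one, append the preceding text plus the opening part of the link tag to the array as one element, then append the URL as a separate element. The last match has no link, so it contributes only the trailing text.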
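
The thread doesn't say which language is in use, so here is a minimal sketch of that loop in Python. The function name `split_on_links` and the variable names are made up for illustration. The pattern is the one above with single backslashes, except that `\Z` stands in for `$` (a change made for this sketch) so that a trailing newline stays attached to the last piece.

```python
import re

# The pattern from the answer, written as a Python raw string.
# Assumption of this sketch: \Z (absolute end of string) replaces $.
LINK_PATTERN = re.compile(r'([\W\w]*?)(?:(<a .*?href=.*?")(.+?)(?=")|\Z)')


def split_on_links(html):
    """Split html into alternating pieces: text up to an href value, then the URL."""
    parts = []
    for match in LINK_PATTERN.finditer(html):
        before, opening, url = match.groups()
        if url is None:
            # End-of-text branch: no link matched, keep any remaining text.
            if before:
                parts.append(before)
        else:
            # Link branch: preceding text plus the tag up to the opening quote,
            # then the URL as its own element.
            parts.append(before + opening)
            parts.append(url)
    return parts


sample = (
    'Hello world\n'
    '<img src="mypicture.png" />\n'
    '<p>Some text in a tag</p>\n'
    '<a href="http://www.google.fr">Link to google</a> Some Text '
    '<a href="http://www.yahoo.fr">Link to yahoo</a> End of line\n'
    '<p>Some text in a tag</p>\n'
    '<a attribute="some value" href="http://www.apple.com">Link to apple</a>\n'
    'Some text'
)

pieces = split_on_links(sample)
for index, piece in enumerate(pieces):
    print(index, '=>', piece)
```

With the sample string from the question, the URLs land at the odd indices and joining all the pieces back together reproduces the original string. Note that the pattern only recognises double-quoted href values; for less regular HTML, a real parser is likely to be sturdier.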
