
Regular expression to extract a link tag from an HTML string on the server side

I have the source of an HTML page as a string on the server side.

I need to extract every <link rel="icons" ... > tag from the string and push each extracted string to an array. There can be multiple links with the same starting tag.

The <link rel="icons" ... > tag can contain anything between the start and the closing >. I have defined startTag and endTag in the code below.

  var startTag = '<link rel="icons"';
  var endTag = '>';
  const re = new RegExp('(' + startTag + ')(.|\n)+?(' + endTag + ')', 'g');

However, when I log the value of re to the console, it is not what I expect.

Desired output

['<link rel="icons" href="icons1.png"', '<link rel="icons" href="icons2.png"', '<link rel="icons" href="icons3.png"']

Thanks in advance.

I think you're looking for something like this (the replace is just to remove extra whitespace):

  const data = `
    <link rel="icons" href="icons1.png" >
    <link rel="icons" href="icons2.png" >
    <link rel="icons" href="icons3.png" >
  `;
  // The s flag lets . match newlines, so tags that span several lines are still found.
  const links = data.match(/<link.*?>/gs)
    .map(link => link.replace(/\s+/g, ' '));
  console.log(links);

If you're in an environment that doesn't support the s flag, you could use /<link[^]*?>/g instead.
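If you only want the tags that begin with the exact startTag from the question, here is a minimal sketch of one way to do it. The helper names escapeRegExp and extractTags are made up for this example, and the sample HTML is invented. It uses [^>]* rather than the s flag, which assumes no attribute value contains a literal > character.

```javascript
// Hypothetical helper: escape regex metacharacters so a literal string
// can be embedded safely in a RegExp pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return every tag that begins with startTag and
// ends at the next '>'.
function extractTags(html, startTag) {
  // [^>]* matches any character except '>', newlines included,
  // so the s flag is not needed.
  const re = new RegExp(escapeRegExp(startTag) + '[^>]*>', 'g');
  // match() returns null when nothing matches, so fall back to [].
  const matches = html.match(re) || [];
  // Collapse runs of whitespace (including line breaks) to one space.
  return matches.map(tag => tag.replace(/\s+/g, ' '));
}

// Invented sample input: one unrelated link and two icon links,
// one of which spans two lines.
const html = `
  <link rel="stylesheet" href="main.css">
  <link rel="icons"
        href="icons1.png" >
  <link rel="icons" href="icons2.png" >
`;

const icons = extractTags(html, '<link rel="icons"');
console.log(icons);
```

Note that the array comes from calling match on the string; logging the RegExp object itself only prints the pattern. For anything beyond simple, well-formed tags, a real HTML parser is likely to be more reliable than a regular expression.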
