Using JavaScript, how can I loop through regex matches and split a string into an array of pieces divided by hyperlinks?

I have a string that I retrieved from an API that looks like this:

"If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>."

I'm trying to create an array that looks like this:

[
 "If you "
 <a ... </a>
 " then "
 <a ... </a>
 "."
]

Basically, I want to render it as intended without resorting to dangerouslySetInnerHTML. I already have my regex matches; I'm just trying to figure out the smartest way to loop over them and build this array. I typed up the code below but realized after seeing the output that it's obviously flawed: I need to know where to start each substring based on the previous match, but I can't work out how to approach this. Any guidance is appreciated.

  let noticeTextArr: (string | JSX.Element)[] = [];
  if(notice.label !== undefined) {
    const reg = /<a.+?href="(.+?)".*?>(.+?)<\/a>/g;
    let result;
    while((result = reg.exec(notice.label)) !== null) {
      if(result.index > 0) {
        noticeTextArr.push(notice.label.substring(0, result.index))
      }
      noticeTextArr.push(<a href={result[1]}>{result[2]}</a>);      
    }
  }

Here is a slightly ugly regex that works quite well. It's basically the same approach you took, with a few enhancements.

function convertToJSX(text: string) {
  // [^"']* keeps the href capture from swallowing later attributes such as class='...'
  const regex = /<\s*a[^>]*href=["']([^"']*)["'][^>]*>(.*?)<\s*\/\s*a>/g;

  const matches = text.matchAll(regex);

  const noticeTextArr: (string | JSX.Element)[] = [];

  let lastIndex = 0;

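  for (const match of matches) {
    const [fullMatch, href, content] = match;
    // match.index is typed as possibly undefined in some TypeScript versions
    const start = match.index ?? 0;

    // Skip the empty string when a link starts the text or directly follows another link
    if (start > lastIndex) {
      noticeTextArr.push(text.substring(lastIndex, start));
    }
    // React expects a key on elements rendered from an array
    noticeTextArr.push(<a key={start} href={href}>{content}</a>);

    lastIndex = start + fullMatch.length;
  }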

  if (lastIndex < text.length) {
    noticeTextArr.push(text.substring(lastIndex));
  }

  return noticeTextArr;
}
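
To check the index bookkeeping outside React, here is a minimal sketch of the same idea that returns plain objects instead of JSX. The Segment type and the splitByAnchors name are made up for this example, it uses an exec loop rather than matchAll, and the href capture assumes the URL contains no quote characters.

```typescript
// Hypothetical segment type: plain text, or the parts of a parsed link.
type Segment = string | { href: string; content: string };

function splitByAnchors(input: string): Segment[] {
  // Accepts single or double quotes around the href value.
  const regex = /<\s*a[^>]*href=["']([^"']*)["'][^>]*>(.*?)<\s*\/\s*a>/g;
  const segments: Segment[] = [];
  let lastIndex = 0;
  let match: RegExpExecArray | null;

  while ((match = regex.exec(input)) !== null) {
    // Text between the end of the previous match and the start of this one.
    if (match.index > lastIndex) {
      segments.push(input.substring(lastIndex, match.index));
    }
    segments.push({ href: match[1] ?? "", content: match[2] ?? "" });
    lastIndex = match.index + match[0].length;
  }

  // Trailing text after the last link.
  if (lastIndex < input.length) {
    segments.push(input.substring(lastIndex));
  }
  return segments;
}

const sampleLabel =
  "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
console.log(splitByAnchors(sampleLabel));
```

Each object can then be mapped to an <a> element at render time, which keeps the parsing step free of React.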

You can try this:

const text = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";

const array = text.split(/(<a.+?href=["'].+?["'].*?>.+?<\/a>)/g);

When the whole pattern is wrapped in a capturing group, split() also includes the captured text in the returned array. Each inner group would add its own entries as well, so I removed the inner groups from the regex.
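
To make the effect of the groups concrete, here is a small sketch comparing the two patterns on the string from the question. The variable names are made up for this example, and the final filter step is an addition of mine, not part of the answer above.

```typescript
const noticeLabel =
  "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";

// One outer capturing group: split() keeps each matched link in the result.
const withOuterGroup = noticeLabel.split(/(<a.+?href=["'].+?["'].*?>.+?<\/a>)/g);
// 5 items: "If you ", first link, " then ", second link, "."

// Inner groups as well: split() also emits the href and the link text
// as separate items after each link.
const withInnerGroups = noticeLabel.split(/(<a.+?href=["'](.+?)["'].*?>(.+?)<\/a>)/g);
// 9 items: the 5 above plus an href and a link text for each of the 2 links

// Drop empty strings, which appear when the text starts or ends with a link.
const pieces = withOuterGroup.filter((piece) => piece !== "");
console.log(pieces);
```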

const data = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
const c = data.split(' ');
let i = 0;
let res = '';
let arr = [];
while (i < c.length) {
  if (c[i] === '<a') {
    arr.push(res);
    res = c[i];
    i++;
    while (!c[i].includes('</a>')) {
      res += " " + c[i];
      i++;
    }
    res += " " + c[i++];
    arr.push(res);
    res = '';
  } else {
    res += " " + c[i++];
  }
}
console.log(arr);

Use split with a regular expression having capturing group:

const text = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
console.log(text.split(/(<a\s[^>]*>[^<]*<\/a>)/));

Explanation

                         EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    <a                       '<a'
--------------------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
    [^>]*                    any character except: '>' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
    [^<]*                    any character except: '<' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    <                        '<'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    a>                       'a>'
--------------------------------------------------------------------------------
  )                        end of \1
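
The breakdown above implies two boundaries that are easy to miss, so here is a short sketch of them. The input strings and variable names are invented for illustration.

```typescript
const anchorPattern = /(<a\s[^>]*>[^<]*<\/a>)/;

// A plain link is split out as its own item.
const simplePieces = "Go <a href='/x'>here</a> now".split(anchorPattern);
// ["Go ", "<a href='/x'>here</a>", " now"]

// \s after "<a" means other tags that start with "a" are left alone.
const abbrPieces = "See <abbr title='x'>HTML</abbr> docs".split(anchorPattern);
// no match: a single item containing the unchanged input

// [^<]* stops at the first "<", so a link whose text contains markup is not matched.
const nestedPieces = "Go <a href='/x'><b>here</b></a> now".split(anchorPattern);
// no match: a single item containing the unchanged input

console.log(simplePieces, abbrPieces, nestedPieces);
```

If the API can return links with nested markup, this pattern will silently leave them as plain text.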

Because HTML is hard to parse reliably with a regex, I would suggest using document.createElement() and letting the browser parse and split your text:

var txt = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
var el = document.createElement('html');
el.innerHTML = txt;
var result = Array.from(el.querySelector('body').childNodes).map(function (ele) {
  return ele.nodeType == Node.TEXT_NODE ? ele.textContent : ele.outerHTML;
});
console.log(result);


 