
Using JavaScript, how can I loop through regex matches and split a string into an array of pieces divided by hyperlinks?

I have a string, retrieved from an API, that looks like this:

"If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>."

I'm trying to create an array that looks like this:

[
 "If you "
 <a ... </a>
 " then "
 <a ... </a>
 "."
]

Basically, I want to render it as intended without resorting to the dangerouslySetInnerHTML approach. I already have my regex matches; I'm just trying to figure out the smartest way to loop over them and build this array. I just typed up the code below, but after seeing the output I realized it's obviously flawed: I need to know where to start each substring based on the last match, and I can't work out how to approach this. Any guidance is appreciated.

  let noticeTextArr: (string | JSX.Element)[] = [];
  if(notice.label !== undefined) {
    const reg = /<a.+?href="(.+?)".*?>(.+?)<\/a>/g;
    let result;
    while((result = reg.exec(notice.label)) !== null) {
      if(result.index > 0) {
        noticeTextArr.push(notice.label.substring(0, result.index))
      }
      noticeTextArr.push(<a href={result[1]}>{result[2]}</a>);      
    }
  }

Here is a slightly ugly but well-working regex. It is basically the same approach you took, with a few enhancements.

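function convertToJSX(text: string) {
  // Capture the href value (group 1) and the link text (group 2).
  // [^"'>]* stops at the closing quote, so attributes after href are not swallowed.
  const regex = /<\s*a[^>]*href=["']([^"'>]*)["'][^>]*>(.*?)<\s*\/\s*a>/g;

  const noticeTextArr: (string | JSX.Element)[] = [];

  let lastIndex = 0;

  for (const match of text.matchAll(regex)) {
    const [fullMatch, href, content] = match;
    const start = match.index ?? 0;

    // Text between the previous match and this one (skip empty pieces).
    if (start > lastIndex) {
      noticeTextArr.push(text.substring(lastIndex, start));
    }
    // React expects a key on elements rendered from an array.
    noticeTextArr.push(<a key={start} href={href}>{content}</a>);

    lastIndex = start + fullMatch.length;
  }

  // Trailing text after the last link.
  if (lastIndex < text.length) {
    noticeTextArr.push(text.substring(lastIndex));
  }

  return noticeTextArr;
}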
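
For readers who want to check the index bookkeeping outside a React build, the same lastIndex walk can be sketched in plain JavaScript. The function name `tokenizeLinks` and the `{ type, href, text }` token shape are hypothetical, introduced only for this illustration; each link token could later be mapped to an `<a>` element.

```javascript
// Sketch only: tokenizeLinks and the token shape are illustrative assumptions.
function tokenizeLinks(text) {
  // Quote-aware href capture; \1 requires the closing quote to match the opening one.
  const regex = /<a\s[^>]*?href=(["'])(.*?)\1[^>]*>(.*?)<\/a>/g;
  const tokens = [];
  let lastIndex = 0;

  for (const match of text.matchAll(regex)) {
    const [fullMatch, , href, content] = match;
    // Plain text between the previous match and this one (skipped when empty).
    if (match.index > lastIndex) {
      tokens.push({ type: "text", text: text.slice(lastIndex, match.index) });
    }
    tokens.push({ type: "link", href, text: content });
    lastIndex = match.index + fullMatch.length;
  }

  // Whatever follows the last link.
  if (lastIndex < text.length) {
    tokens.push({ type: "text", text: text.slice(lastIndex) });
  }
  return tokens;
}

const sample =
  "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
console.log(tokenizeLinks(sample));
```

Keeping the tokens as plain objects separates the parsing from the rendering, so the parsing step can be tested without JSX.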

You can try this:

const text = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";

const array = text.split(/(<a.+?href=["'].+?["'].*?>.+?<\/a>)/g);

When you split with the whole regex wrapped in a capturing group, JavaScript's split also returns the captured text in the result array. That is why I removed the inner groups from the regex: otherwise their captures would be added to the array as well.
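
The effect of the inner groups can be checked with a short comparison. This is only a sketch with made-up variable names (`withInnerGroups`, `outerOnly`); it splits the same string once with the inner capturing groups still in place and once with only the outer group.

```javascript
const input =
  "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";

// Inner groups left in: split() inserts every captured group into the result,
// so each link contributes three entries (whole tag, href, link text).
const withInnerGroups = input.split(/(<a.+?href=["'](.+?)["'].*?>(.+?)<\/a>)/);

// Only the outer group: each link contributes exactly one entry.
const outerOnly = input.split(/(<a.+?href=["'].+?["'].*?>.+?<\/a>)/);

console.log(withInnerGroups.length); // 9
console.log(outerOnly.length); // 5
console.log(outerOnly);
```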

const data = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
const c = data.split(' ');
let i = 0;
let res = '';
let arr = [];
while (i < c.length) {
  if (c[i] === '<a') {
    arr.push(res);
    res = c[i];
    i++;
    while (!c[i].includes('</a>')) {
      res += " " + c[i];
      i++;
    }
    res += " " + c[i++];
    arr.push(res);
    res = '';
  } else {
    res += " " + c[i++];
  }
}
console.log(arr);

Use split with a regular expression that has a capturing group:

const text = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
console.log(text.split(/(<a\s[^>]*>[^<]*<\/a>)/));

See how the regex works

Explanation

                         EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    <a                       '<a'
--------------------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
    [^>]*                    any character except: '>' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
    [^<]*                    any character except: '<' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    <                        '<'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    a>                       'a>'
--------------------------------------------------------------------------------
  )                        end of \1
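
Because the pattern has exactly one capturing group, the array returned by split alternates between plain text (even indices) and whole anchor tags (odd indices). A minimal sketch of turning that array into typed pieces follows; `classifyPieces` and the `{ type, value }` shape are hypothetical names introduced here for illustration.

```javascript
// Sketch only: classifyPieces and its return shape are illustrative assumptions.
function classifyPieces(text) {
  const parts = text.split(/(<a\s[^>]*>[^<]*<\/a>)/);
  return parts
    // Odd indices hold the captured anchor tags, even indices the text around them.
    .map((value, i) => ({ type: i % 2 === 1 ? "link" : "text", value }))
    // split() yields "" when the string starts or ends with a link; drop those.
    .filter((piece) => piece.value !== "");
}

const pieces = classifyPieces("<a href='https://example.com'>Click here</a> then go.");
console.log(pieces);
```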

Because it's hard to parse an HTML element with a regex, I would suggest using document.createElement() to let the browser parse and split your text:

var txt = "If you <a href='https://example.com'>Click here</a> then <a href='https://example.net'>Click here</a>.";
var el = document.createElement('html');
el.innerHTML = txt;
var result = Array.from(el.querySelector('body').childNodes).map(function (ele) {
  return ele.nodeType == Node.TEXT_NODE ? ele.textContent : ele.outerHTML;
});
console.log(result);
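
The snippet above returns each link as an outerHTML string, which would still have to be turned into an element before rendering. One possible variation, sketched here on the assumption that the nodes come from the childNodes collection built above, reads the href attribute and the text of each anchor instead. `describeNodes` and the returned object shape are hypothetical; the numbers 3 and 1 are the standard values of Node.TEXT_NODE and Node.ELEMENT_NODE, written out so the function can also be exercised outside a browser.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE

// Sketch only: describeNodes and its return shape are illustrative assumptions.
function describeNodes(childNodes) {
  return Array.from(childNodes).map((node) => {
    if (node.nodeType === TEXT_NODE) {
      return node.textContent;
    }
    if (node.nodeType === ELEMENT_NODE && node.nodeName === "A") {
      return { href: node.getAttribute("href"), text: node.textContent };
    }
    // Any other node type falls back to its text content.
    return node.textContent ?? "";
  });
}

// In a browser: describeNodes(el.querySelector("body").childNodes)
```

Each returned object carries only the href and the link text, so a renderer can build its own anchor element from them rather than injecting markup.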


 