
Regex substitution: Replace texts, not codes

I have been trying to solve a regex quiz for days and still can't get it right. I am close, but my attempt still doesn't pass.

The task:

In an HTML page, replace the text micro with &micro;. Oh, and don't screw up the code: don't replace inside <the tags> or &entities;

Replace

  • micro -> &micro;
  • abc micro -> abc &micro;
  • micromicro -> &micro;&micro;
  • &micro;micro -> &micro;&micro;

Don't touch

  • <tag micro /> -> <tag micro />
  • &micro; -> &micro;
  • &abcmicro123; -> &abcmicro123;
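The seven cases above can be written down as a small test table, which makes it easy to check any candidate solution. The sketch below is only illustrative: `failingCases` and `naive` are hypothetical names that do not appear in the question, and the naive replace is shown just to demonstrate why a plain global substitution is not enough.

```javascript
// The task's seven cases as [input, expected] pairs.
const cases = [
  ["micro", "&micro;"],
  ["abc micro", "abc &micro;"],
  ["micromicro", "&micro;&micro;"],
  ["&micro;micro", "&micro;&micro;"],
  ["<tag micro />", "<tag micro />"],
  ["&micro;", "&micro;"],
  ["&abcmicro123;", "&abcmicro123;"],
];

// Hypothetical helper: returns the inputs for which replaceFn gives the wrong output.
function failingCases(replaceFn) {
  return cases
    .filter(([input, expected]) => replaceFn(input) !== expected)
    .map(([input]) => input);
}

// A naive global replace, shown only to illustrate what goes wrong.
const naive = (s) => s.replace(/micro/g, "&micro;");

// Lists the inputs the naive version breaks (everything that involves a tag or an entity).
console.log(failingCases(naive));
```

Any of the approaches in the answers can be passed to `failingCases` the same way; an empty array means all seven cases pass.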

I tried the following, but it fails on the last &micro; case. Can someone point out what I am missing? Thanks in advance!

What I tried:

Regex

((?:\G|\n)(?:.*?&.*?micro.*?;[\s\S]*?|.*?<.*?micro.*?>[\s\S]*?|.)*?)micro

Substitution

$1&micro;

You can try something like this:

(?:<.*?>|&\w++;)(*SKIP)(*F)|micro

Replacement string:

&micro;
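The (*SKIP)(*F) verbs are a PCRE feature and are not available in JavaScript's RegExp. As a rough sketch of the same idea for JavaScript, the tags and entities can be matched first in an alternation and handed back unchanged from a replacement callback. The function name `replaceMicro` is an assumption made for this example, and `\w+` stands in for the possessive `\w++`, which JavaScript does not support.

```javascript
// Alternation order matters: tags and entities are tried first, so they
// consume their text before the bare "micro" branch can see it.
const pattern = /<.*?>|&\w+;|micro/g;

// Hypothetical helper: replaces "micro" only outside tags and entities.
function replaceMicro(html) {
  return html.replace(pattern, (match) =>
    // Tags and entities are returned as-is; only a bare "micro" is rewritten.
    match === "micro" ? "&micro;" : match
  );
}

console.log(replaceMicro("&micro;micro"));  // &micro;&micro;
console.log(replaceMicro("<tag micro />")); // <tag micro />
console.log(replaceMicro("micromicro"));    // &micro;&micro;
```

The callback plays the role of "skip and fail": a protected span is still matched, which moves the scan position past it, but the replacement leaves it untouched.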

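Use the SKIP-FAIL technique, but match micro as a whole word. Note that \b leaves micromicro unchanged, so drop the word boundaries if that case must also be replaced: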

(?:<[^<>]*>|&\w+;)(*SKIP)(*F)|\bmicro\b


Explanation

--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    <                        '<'
--------------------------------------------------------------------------------
    [^<>]*                   any character except: '<', '>' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    &                        '&'
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    ;                        ';'
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (*SKIP)(*F)              Skip the match and go on matching from current location
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  micro                    'micro'
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
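To see what the two \b assertions change, the short sketch below compares a whole-word pattern with an unanchored one. It deliberately leaves out the tag and entity protection so that only the word-boundary effect is visible; the variable names are made up for this illustration.

```javascript
// Whole-word match: "micro" must have a non-word character (or the string edge) on both sides.
const wholeWord = /\bmicro\b/g;
// Unanchored match: "micro" is found anywhere, even inside a longer run of letters.
const anywhere = /micro/g;

const spaced = "abc micro".replace(wholeWord, "&micro;");
const doubledWhole = "micromicro".replace(wholeWord, "&micro;");
const doubledAny = "micromicro".replace(anywhere, "&micro;");

console.log(spaced);       // abc &micro;
console.log(doubledWhole); // micromicro (no boundary between the two words)
console.log(doubledAny);   // &micro;&micro;
```

So the whole-word variant suits text where micro should only be replaced as a standalone word, while the task's micromicro case needs the unanchored form.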

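 var strings = [
   "micro",
   "abc micro",
   "micromicro",
   "&micro;micro",
   "<tag micro />",
   "&micro;",
   "&abcmicro123;"
 ];
 // Negative lookbehind: reject a match that follows an unclosed "<..." or "&..."
 // (lookbehind assertions require an ES2018-capable engine).
 var re = /(?<!(<[^>]*|&[^;]*))(micro)/g;
 strings.forEach(function (str) {
   // $2 is the captured word, so the replacement becomes "&micro;".
   var result = str.replace(re, '&$2;');
   console.log(str + ' -> ' + result);
 });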

Console log output:

micro -> &micro;
abc micro -> abc &micro;
micromicro -> &micro;&micro;
&micro;micro -> &micro;&micro;
<tag micro /> -> <tag micro />
&micro; -> &micro;
&abcmicro123; -> &abcmicro123;

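Explanation:

  • (?<!...) - a negative lookbehind that excludes micro inside a tag or an entity
  • (<[^>]*|&[^;]*) - inside the negative lookbehind, matches a <... or &... that has not been closed yet
  • (micro) - captures your word (add more as needed, e.g. (micro|brewery))
  • '&$2;' - the replacement turns the captured word into an entity &...;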
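The suggestion to add more words can be sketched as a small factory that builds the lookbehind pattern from a list. This is an illustration under assumptions: `makeEntityReplacer` and `replaceWords` are hypothetical names, and brewery is only the placeholder used above, not a real HTML entity name.

```javascript
// Hypothetical helper: builds the lookbehind regex for a list of words.
function makeEntityReplacer(words) {
  // Escape regex metacharacters so each word is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // Same shape as the single-word pattern: group 1 is the lookbehind's
  // inner group, group 2 is the captured word.
  const re = new RegExp(`(?<!(<[^>]*|&[^;]*))(${escaped.join("|")})`, "g");
  return (text) => text.replace(re, "&$2;");
}

const replaceWords = makeEntityReplacer(["micro", "brewery"]);

console.log(replaceWords("micro brewery <tag brewery /> &micro;"));
// &micro; &brewery; <tag brewery /> &micro;
```

One limitation of the lookbehind approach is worth keeping in mind: a stray & or < in ordinary text (for example "AT&T micro") also blocks the replacement, because the lookbehind cannot tell it apart from the start of an entity or tag.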

