JavaScript regular expression: how to not match substrings between < and >

Question

I'm using this regular expression:

var regex = /\<.*?.\>/g

to match against this string:

var str = 'This <is> a string to <use> to test the <regular> expression'

using a simple match:

str.match(regex)

and, as expected, I get:

["<is>", "<use>", "<regular>"]

(But without the backslashes, sorry for any potential confusion.)

How can I get the reverse result? I.e., what regular expression do I need that does not return the items contained between < and >?

I tried /(^\\<.*?\\>)/g and various other similar combos, including square brackets and so on. I've got loads of cool results, just nothing that is quite what I want.

Where I'm going with this: basically I want to search and replace occurrences of substrings, but I want to exclude some of the search space, probably using < and >. I don't really want a destructive method, as I don't want to break apart strings, change them, and worry about reconstructing them.

Of course I could do this 'manually' by searching through the string, but I figured regular expressions should be able to handle this rather well. Alas, my knowledge is not where it needs to be!

Answer 1

Here's a way to do custom replacement of everything outside of the tags, and to strip the tags from the tagged parts: http://jsfiddle.net/tcATT/

var str = 'This <is> a string to <use> to test the <regular> expression';
// The regular expression matches everything, but each val is either a
// tagged value (<is> <regular>), or the text you actually want to replace
// you need to decide that in the replacer function
console.log(str.replace( /[^<>]+|<.*?>/g, function(val){
    if(val.charAt(0) == '<' && val.charAt(val.length - 1) == '>') {
      // Just strip the < and > from the ends
      return val.slice(1,-1);
    } else {
      // Do whatever you want with val here, I'm upcasing for simplicity
      return val.toUpperCase(); 
    }
} ));
// outputs: "THIS is A STRING TO use TO TEST THE regular EXPRESSION"

To generalize it, you could use:

function replaceOutsideTags(str, replacer) {
    return str.replace( /[^<>]+|<.*?>/g, function(val){
        if(val.charAt(0) == '<' && val.charAt(val.length - 1) == '>') {
          // Just strip the < and > from the ends
          return val.slice(1,-1);
        } else {
          // Let the caller decide how to replace the parts that need replacing
          return replacer(val); 
        }
    })
}
// And call it like
console.log(
    replaceOutsideTags( str, function(val){
        return val.toUpperCase();
    })
);
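
The question asks for a non-destructive replacement, so here is a minimal sketch of a variant that leaves the tagged parts untouched, brackets included, instead of stripping them. The name replaceOutsideTagsKeepingTags is hypothetical, not something from the answer above, and the sketch assumes the brackets are balanced.

```javascript
// Hypothetical variant of replaceOutsideTags: tagged parts ("<...>") are
// returned unchanged, and only the text outside them goes to the replacer.
function replaceOutsideTagsKeepingTags(str, replacer) {
    return str.replace(/[^<>]+|<.*?>/g, function (val) {
        var isTagged = val.charAt(0) === '<' && val.charAt(val.length - 1) === '>';
        return isTagged ? val : replacer(val);
    });
}

var sample = 'This <is> a string to <use> to test the <regular> expression';
var result = replaceOutsideTagsKeepingTags(sample, function (val) {
    return val.replace(/test/g, 'FOO');
});
console.log(result);
// "This <is> a string to <use> to FOO the <regular> expression"
```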

Answer 2

If I understand correctly, you want to apply some custom processing to a string, except for the parts that are protected (enclosed in < and >)? If this is the case, you could do it like this:

// The function that processes unprotected parts
function process(s) {
    // an example could be transforming whole part to uppercase:
    return s.toUpperCase();
}

// The function that splits string into chunks and applies processing
// to unprotected parts
function applyProcessing (s) {
    var a = s.split(/<|>/),
        out = '';

    for (var i=0; i<a.length; i++)
        out += i%2
                ? a[i]
                : process(a[i]);

    return out;
}

// now we just call the applyProcessing()
var str1 = 'This <is> a string to <use> to test the <regular> expression';
console.log(applyProcessing(str1));
// This outputs:
// "THIS is A STRING TO use TO TEST THE regular EXPRESSION"

// and another string:
var str2 = '<do not process this part!> The <rest> of the a <string>.';
console.log(applyProcessing(str2));
// This outputs:
// "do not process this part! THE rest OF THE A string."

This is basically it. It returns the whole string with the unprotected parts processed.

Please note that the splitting will not work correctly if the angle brackets (< and >) are not balanced.
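
One possible way to soften the balance problem, sketched here as an assumption rather than as part of the answer: split on whole <...> pairs with a capturing group, so that split() keeps each tag in the result at an odd index. The helper name applyProcessingKeepingTags is made up for illustration, and unlike the code above it keeps the brackets in the output.

```javascript
// Hypothetical variant: split on complete "<...>" pairs. Because the pattern
// has a capturing group, split() includes each matched tag in the array, so
// odd indices are always tags and even indices are the text between them.
function applyProcessingKeepingTags(s, process) {
    var parts = s.split(/(<[^<>]*>)/);
    var out = '';
    for (var i = 0; i < parts.length; i++) {
        out += i % 2 ? parts[i] : process(parts[i]);
    }
    return out;
}

var upper = function (s) { return s.toUpperCase(); };

console.log(applyProcessingKeepingTags('This <is> a test', upper));
// "THIS <is> A TEST"

// A stray "<" that never closes is treated as ordinary text:
console.log(applyProcessingKeepingTags('1 < 2 and <keep> this', upper));
// "1 < 2 AND <keep> THIS"
```

With the plain /<|>/ split, the second string would be cut at the stray < and the odd/even bookkeeping would drift, so "keep" would end up being processed.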

There are various places that could be improved, but I'll leave that as an exercise for the reader. ;p

Answer 3

This is a perfect application for passing a regex argument to the core String.split() method:

var results = str.split(/<[^<>]*>/);

Simple!
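
For reference, a small sketch of what that split returns for the string from the question. The variable names are only for illustration. Note that the result is an array of the outside pieces, so the positions of the bracketed parts are lost.

```javascript
var input = 'This <is> a string to <use> to test the <regular> expression';

// Each "<...>" acts as a separator; what remains is the text outside the brackets.
var pieces = input.split(/<[^<>]*>/);
console.log(pieces);
// ["This ", " a string to ", " to test the ", " expression"]

// When the string starts with a bracketed part, the first piece is an empty string.
var leading = '<do> something'.split(/<[^<>]*>/);
console.log(leading);
// ["", " something"]
```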

Answer 4

Using the variables you've already created, try using replace. It's non-destructive, too.

str.replace(regex, '');
--> "This  a string to  to test the  expression"

Answer 5

/\b[^<\W]\w*(?!>)\b/g

This works, test it out:

var str = 'This <is> a string to <use> to test the <regular> expression.';
var regex = /\<.*?.>/g;
console.dir(str.match(regex));
var regex2 = /\b[^<\W]\w*(?!>)\b/g;
console.dir(str.match(regex2));
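
As a rough check of what this pattern returns, here is a sketch with the expected matches written out. The second string is an extra test case that is not in the answer: the pattern only rejects a word that is directly followed by >, so with several words inside one pair of brackets the earlier words still match.

```javascript
var wordsOutside = /\b[^<\W]\w*(?!>)\b/g;

var single = 'This <is> a string to <use> to test the <regular> expression.';
var singleMatches = single.match(wordsOutside);
console.log(singleMatches);
// ["This", "a", "string", "to", "to", "test", "the", "expression"]

// Limitation: only the last word before ">" is excluded.
var spaced = 'a <use me> b';
var spacedMatches = spaced.match(wordsOutside);
console.log(spacedMatches);
// ["a", "use", "b"]
```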

Answer 6

Ah, okay, sorry - I misunderstood your question. This is a difficult problem to solve with pure regular expressions in JavaScript, because JavaScript (at the time of writing) doesn't support lookbehinds, and I would usually use lookaheads and lookbehinds to solve this. A (somewhat contrived) way of doing it would be something like this:

str.replace(/((?:<[^>]+>)?)([^<]*)/g, function (m, sep, s) { return sep + s.replace('test', 'FOO'); })

// --> "This <is> a string to <use> to FOO the <regular> expression"

This also works on strings like "This test <is> a string to <use> to test the <regular> expression", and if you use /test/g instead of 'test' in the replacer function, it will also turn

"This test <is> a string to <use> to test the test <regular> expression"

into

"This FOO <is> a string to <use> to FOO the FOO <regular> expression"

UPDATE

And something like this would also strip the <> characters:

str.replace(/((?:<[^>]+>)?)([^<]*)/g, function (m, sep, s) { return sep.replace(/[<>]/g, '') + s.replace(/test/g, 'FOO'); })

"This test <is> a string to <use> to test the test <regular> expression"
--> "This FOO is a string to use to FOO the FOO regular expression"
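
A lookahead-only alternative, offered as a sketch rather than as what the answer above does: match the word only when it is not followed by a > with no < or > in between. This needs no lookbehind and leaves the bracketed parts untouched. It assumes the brackets are balanced; a stray > in plain text would make the text before it look protected.

```javascript
var input = 'This test <is> a string to <use test> to test the test <regular> expression';

// "test" counts as inside brackets when the next bracket character after it
// is ">", so the negative lookahead skips exactly those occurrences.
var output = input.replace(/test(?![^<>]*>)/g, 'FOO');
console.log(output);
// "This FOO <is> a string to <use test> to FOO the FOO <regular> expression"
```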

Answer 7

Try this regex:

\b\w+\b(?!>)

UPDATE

To support spaces inside the brackets, try this one. It's not a pure regex match, but it works, and it's much simpler than the answer above:

alert('This <is> a string to <use use> to test the <regular> expression'.split(/\s*<.+?>\s*/).join(' '));
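
The same call with console.log instead of alert, as a small sketch showing the result. The variable names are only for illustration.

```javascript
var spacedInput = 'This <is> a string to <use use> to test the <regular> expression';

// The separator swallows the whitespace around each "<...>", and join(' ')
// puts a single space back between the remaining pieces.
var cleaned = spacedInput.split(/\s*<.+?>\s*/).join(' ');
console.log(cleaned);
// "This a string to to test the expression"
```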

7 answers

Answer 1
3 (accepted) 2012-11-07 22:18:13

Answer 2
3 2012-11-07 22:28:37

Answer 3
3 2012-11-08 00:09:52

Answer 4
1 2012-11-07 21:40:25

Answer 5
1 2012-11-07 22:37:16

Answer 6
-1 2012-11-07 21:47:25

Answer 7
-1 2012-11-07 21:53:07
