
Matching all excerpts which start and end with specific words

I have a text which looks like:

some non interesting part
trans-top
body of first excerpt
trans-bottom
next non interesting part
trans-top
body of second excerpt
trans-bottom
non interesting part

I want to extract all excerpts that start with trans-top and end with trans-bottom into an array. I tried this:

match(/(?=trans-top)(.|\s)*/g)

to find strings which start with trans-top, and it works. Now I want to specify the end:

match(/(?=trans-top)(.|\s)*(?=trans-bottom)/g)

and it doesn't. Firebug gives me an error:

regular expression too complex

I tried many other ways, but I can't find a working solution... I'm sure I made some stupid mistake :(

This works pretty well, but it's not all in one regex:

var test = "some non interesting part\ntrans-top\nbody of first excerpt\ntrans-bottom\nnext non interesting part\ntrans-top\nbody of second excerpt\ntrans-bottom\nnon interesting part";

var matches = test.match(/(trans-top)([\s\S]*?)(trans-bottom)/gm);
for(var i=0; i<matches.length; i++) {
    matches[i] = matches[i].replace(/^trans-top|trans-bottom$/gm, '');
}

console.log(matches);

If you don't want the leading and trailing linebreaks, change the loop body to:

matches[i] = matches[i].replace(/^trans-top[\s\S]|[\s\S]trans-bottom$/gm, '');

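That should eat the linebreaks: [\s\S] consumes the single character next to each marker, which is the linebreak in the sample text.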
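One caveat with the loop above: match() returns null when the text contains no excerpt, so reading matches.length would throw. A minimal sketch of the same two-step idea with that case guarded is below. The helper name extractExcerpts is hypothetical and not part of the original answer, and the sketch assumes each marker is followed or preceded by at most one line break.

```javascript
// Hypothetical helper (name chosen for illustration only).
function extractExcerpts(text) {
    // match() returns null when nothing matches, so fall back to an empty array.
    var matches = text.match(/trans-top[\s\S]*?trans-bottom/g) || [];
    return matches.map(function (m) {
        // Strip each marker together with one adjacent line break (\n or \r\n).
        return m.replace(/^trans-top\r?\n?|\r?\n?trans-bottom$/g, '');
    });
}

var sampleText = "some non interesting part\ntrans-top\nbody of first excerpt\ntrans-bottom\nnext non interesting part\ntrans-top\nbody of second excerpt\ntrans-bottom\nnon interesting part";

console.log(extractExcerpts(sampleText));
console.log(extractExcerpts("no markers here"));
```

With the sample text this should log the two excerpt bodies, and an empty array for input without markers.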

This tested function uses a single regex and loops over the matches, collecting the captured contents of each one into an array, which it returns:

function getParts(text) {
    var a = [];
    var re = /trans-top\s*([\S\s]*?)\s*trans-bottom/g;
    var m = re.exec(text);
    while (m != null) {
        a.push(m[1]);
        m = re.exec(text);
    }
    return a;
}

It also strips any leading and trailing whitespace surrounding the contents of each match.
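In environments that support String.prototype.matchAll (ES2020 and later, so not the browsers this question was originally asked about), the same capture-group regex can be used without the explicit exec loop. The following is only a sketch under that assumption, and the function name getPartsWithMatchAll is made up for illustration:

```javascript
// Hypothetical variant of getParts; assumes String.prototype.matchAll is available.
function getPartsWithMatchAll(text) {
    // Same pattern as above; matchAll requires the g flag.
    var re = /trans-top\s*([\S\s]*?)\s*trans-bottom/g;
    // Each match object exposes capture group 1 at index 1.
    return Array.from(text.matchAll(re), function (m) {
        return m[1];
    });
}

var sample = "some non interesting part\ntrans-top\nbody of first excerpt\ntrans-bottom\nnext non interesting part\ntrans-top\nbody of second excerpt\ntrans-bottom\nnon interesting part";

console.log(getPartsWithMatchAll(sample));
// logs: [ 'body of first excerpt', 'body of second excerpt' ]
```

Because the lazy group ([\S\s]*?) stops at the first trans-bottom it can reach, each excerpt is matched separately rather than as one span from the first marker to the last.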
