
Regex, JS: Match text to end of line after specific word without including word in match

I am trying to create a regular expression to parse a document for a Node.js application. The regex I have so far matches everything on a line after a specific word, but I cannot work out how to exclude that word from the match. The word can be followed by a variable number of spaces, so I don't think I can use a lookbehind to exclude it. How can I exclude this word from my match?

https://regex101.com/r/kk7Lxe/2

A regular expression that matches only the links line is:

/^\s*\|links\s*?=\s*(.*)$/m

This captures the value of links in capture group 1, which you can reference as match[1]. In JavaScript, it looks like this:

    const str = `
    {{Song box 2
    |color = black; color:#D7DA5F
    |image = Kokoropv.jpg
    |title = "'''ココロ'''"
    * Romaji: Kokoro
    * Official English: Heart
    |date = March 2, 2008
    |views = {{v|nn|2,738,496}}
    |singers = [[Kagamine Rin]] act1
    |producers = [[Toraboruta-P]] (music, lyrics, illustration)
    |links = {{l|nn|sm2500648}} {{l|mz|266689|defunct}}
    |links = {{l|nn|sm2500648}} {{l|mz|266689|defunct}}
    }}
    `

    const match = str.match(/^\s*\|links\s*?=\s*(.*)$/m)
    const links = match && match[1]
    console.log(links)
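On the lookbehind point raised in the question: JavaScript engines that implement ES2018 lookbehind assertions (current Node.js releases do) accept quantifiers such as `\s*` inside `(?<=...)`, so the variable spacing need not rule it out. The following is a minimal sketch, not a replacement for the capture-group approach; `sample` and `viaLookbehind` are hypothetical names, and the input is a shortened stand-in for the template above.

```javascript
// Hypothetical, shortened input with extra spaces around "=".
const sample = [
  '{{Song box 2',
  '|title = Kokoro',
  '|links   =   {{l|nn|sm2500648}} {{l|mz|266689|defunct}}',
  '}}',
].join('\n');

// (?<=...) requires "|links", optional spaces, "=", optional spaces to sit
// immediately before the match without becoming part of it.
// \S forces the match to start at the first non-space character of the value.
const viaLookbehind = sample.match(/(?<=^\s*\|links\s*=\s*)\S.*$/m);

// The whole match (index 0) is now the value; no capture group is needed.
console.log(viaLookbehind && viaLookbehind[0]);
```

If the code has to run on older engines without lookbehind support, the capture group remains the safer choice.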

Advanced Solution

Personally I'd do a more generic solution that parses this list into an object and lets you easily reference all keys and values as needed:

    const getKeywordValuePairs = str => {
      const pattern = /^\s*\|(.*?)\s*?=\s*(.*)$/gm
      const result = {}
      let match
      while (match = pattern.exec(str)) {
        const [unused, key, value] = match
        result[key] = value
      }
      return result
    }

    const result = getKeywordValuePairs(`
    {{Song box 2
    |color = black; color:#D7DA5F
    |image = Kokoropv.jpg
    |title = "'''ココロ'''"
    * Romaji: Kokoro
    * Official English: Heart
    |date = March 2, 2008
    |views = {{v|nn|2,738,496}}
    |singers = [[Kagamine Rin]] act1
    |producers = [[Toraboruta-P]] (music, lyrics, illustration)
    |links = {{l|nn|sm2500648}} {{l|mz|266689|defunct}}
    |links = {{l|nn|sm2500648}} {{l|mz|266689|defunct}}
    }}
    `)

    console.log(result)
    console.log(result.links)
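One thing to be aware of: the sample contains two `|links` rows, and assigning `result[key] = value` keeps only the last one. If every value matters, a variant along these lines could collect them in arrays instead. This is a sketch under assumptions: `getAllKeywordValues` and `sample` are hypothetical names, the input is shortened, and `String.prototype.matchAll` needs a reasonably recent Node.js.

```javascript
// Hypothetical variant: keeps every value seen for a key instead of the last one.
const getAllKeywordValues = (str) => {
  const pattern = /^\s*\|(.*?)\s*=\s*(.*)$/gm;
  const result = new Map(); // key -> array of values, in document order
  for (const [, key, value] of str.matchAll(pattern)) {
    if (!result.has(key)) result.set(key, []);
    result.get(key).push(value);
  }
  return result;
};

// Hypothetical, shortened input with a repeated key.
const sample = [
  '{{Song box 2',
  '|title = Kokoro',
  '|links = {{l|nn|sm2500648}}',
  '|links = {{l|mz|266689|defunct}}',
  '}}',
].join('\n');

const all = getAllKeywordValues(sample);
console.log(all.get('links')); // both link values, first to last
```

A `Map` is used here rather than a plain object so that unusual keys cannot collide with built-in object properties.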

You could do something like this: match each row you want with String#match, then strip the part you don't need from each matched row with Array#map and String#replace.

I originally tried reg.exec(data), but each call returns only a single match (the first one, unless you call it in a loop), so it does not give you every row at once.

    const data = `{{Song box 2
    |color = black; color:#D7DA5F
    |image = Kokoropv.jpg
    |title = "'''ココロ'''"
    * Romaji: Kokoro
    * Official English: Heart
    |date = March 2, 2008
    |views = {{v|nn|2,738,496}}
    |singers = [[Kagamine Rin]] act1
    |producers = [[Toraboruta-P]] (music, lyrics, illustration)
    |links = {{aaal|nn|sm2500648}} {{l|mz|266689|defunct}}
    |links = {{l|nn|sm2500648}} {{l|mz|266689|defunct}}
    }}`;

    const reg = /\|links\s*=\s*[^\n]+/g

    // With the g flag, match() returns every matching row, or null when
    // there is none (hence the || [] fallback). replace() then strips the
    // "|links =" prefix from each row.
    const res = (data.match(reg) || []).map(row => row.replace(/\|links\s*=\s*/g, ""));
    console.log(res);
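The match-then-replace steps above can also be folded into one pass by adding a capture group and iterating with `String.prototype.matchAll`, which yields every match together with its groups. A minimal sketch, assuming a Node.js version that supports `matchAll`; `text` and `linkValues` are hypothetical names and the input is shortened.

```javascript
// Hypothetical, shortened input with two "|links" rows.
const text = [
  '{{Song box 2',
  '|title = Kokoro',
  '|links = {{l|nn|sm2500648}}',
  '|links   =   {{l|mz|266689|defunct}}',
  '}}',
].join('\n');

// The capture group holds only the value, so no second replace() pass is needed.
// matchAll() requires the g flag and returns an iterator of match arrays.
const linkValues = Array.from(
  text.matchAll(/\|links\s*=\s*([^\n]+)/g),
  (m) => m[1]
);

console.log(linkValues);
```

When nothing matches, `linkValues` is simply an empty array, so no `|| []` fallback is needed.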
