Extract email field with regex

Question

I'm trying to extract the "email" value with this code:

const regex3 = /Email',\r\n      value: '([^']*)',/gm;
var content3 = fs.readFileSync('message.txt')
let m3;

while ((m3 = regex3.exec(content)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m3.index === regex3.lastIndex) {
        regex3.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    m3.forEach((match, groupIndex) => {
        fs.appendFileSync('messagematch.txt', m3[1] + '\n');
    });
}

From this file

 },
MessageEmbedField {
  embed: [Circular *2],
  name: 'Email',
  value: 'user@gmail.com',
  inline: true
},
MessageE

The regex works in Notepad, but it doesn't work in my script. What am I missing?

Answer 1

I suggest changing your regex in a few ways to make it more robust and fault tolerant.

First, include the opening single quote before Email so the pattern does not accidentally match other fields where someone has entered the word "Email" as part of a value.

Second, use \r?\n to match both Windows and Unix-style line endings. I suspect this is a large part of your issue, but I can't be sure.

Third, use \s+ instead of hardcoding a specific number of spaces. This helps avoid problems caused by minor formatting changes.

The final regex would look like this:

const regex = /'Email',\r?\n\s+value: '([^']*)',/gm
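As a quick check of the pattern above, the sketch below runs it against a sample modelled on the question's file, once with Unix and once with Windows line endings. The names `emailFieldRegex`, `extractEmails`, `unixSample`, and `windowsSample` are illustrative and not part of the original code.

```javascript
// Pattern from this answer: quoted 'Email', optional \r, flexible indentation.
const emailFieldRegex = /'Email',\r?\n\s+value: '([^']*)',/gm;

// Hypothetical helper: collect capture group 1 from every match.
function extractEmails(text) {
  return Array.from(text.matchAll(emailFieldRegex), (m) => m[1]);
}

// Sample modelled on the question's file, built with Unix (\n) line endings.
const unixSample = [
  'MessageEmbedField {',
  '  embed: [Circular *2],',
  "  name: 'Email',",
  "  value: 'user@gmail.com',",
  '  inline: true',
  '},',
].join('\n');

// Same content with Windows (\r\n) line endings.
const windowsSample = unixSample.replace(/\n/g, '\r\n');

console.log(extractEmails(unixSample));
console.log(extractEmails(windowsSample));
```

Both calls should yield the same single address, which is the point of making the \r optional.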

Answer 2

what I'm missing?

You use \r\n to match a Windows-style line break, but you can make the \r optional to also match a Unix-style one. See this page about line break characters.
In your code you declare var content3, but you then call regex3.exec(content), which refers to a different variable.
Also, the number of spaces in the pattern in the question differs from the number in the example data.
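The three points above can be sketched together as follows. This is only an illustration under some assumptions: the temp-directory path and the `found` array are made up for the example, and the file contents are a shortened stand-in for the question's message.txt. It also records each match once, whereas the question's forEach writes m3[1] once per element of the match array.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for the question's message.txt, written to a temp directory
// so the sketch runs anywhere (the temp path is an assumption).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'email-demo-'));
const inputPath = path.join(dir, 'message.txt');
fs.writeFileSync(
  inputPath,
  "MessageEmbedField {\r\n  name: 'Email',\r\n  value: 'user@gmail.com',\r\n  inline: true\r\n},\r\n"
);

// Read as text; without an encoding, readFileSync returns a Buffer.
const content3 = fs.readFileSync(inputPath, 'utf8');

// Optional \r and \s+ instead of a fixed number of spaces.
const regex3 = /'Email',\r?\n\s+value: '([^']*)',/g;
const found = [];
let m3;

// Pass the same variable that was declared above (content3, not content).
while ((m3 = regex3.exec(content3)) !== null) {
  found.push(m3[1]); // one entry per match
}

console.log(found);
```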

You could use \s+ instead of hardcoding the number of spaces, but \s can also match a newline.

If you want to match whitespace without newlines, you could use the negated character class [^\S\r\n], which matches any character except a non-whitespace character, a carriage return, or a newline.
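To make the difference concrete, here is a small sketch comparing the two classes on strings with and without a line break. The variable names are illustrative only.

```javascript
// \s includes line breaks, so \s+ can run across several lines.
const anyWhitespace = /^\s+$/;
// [^\S\r\n] is "whitespace, but not \r or \n": spaces, tabs, and similar.
const horizontalOnly = /^[^\S\r\n]+$/;

const spacesAndTab = '  \t ';
const withNewline = '  \n  ';

console.log(anyWhitespace.test(spacesAndTab));  // true
console.log(anyWhitespace.test(withNewline));   // true
console.log(horizontalOnly.test(spacesAndTab)); // true
console.log(horizontalOnly.test(withNewline));  // false
```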

'Email',\r?\n[^\S\r\n]+value:[^\S\r\n]+'([^\s@']+@[^\s@']+)'

'Email', Match literally
\r?\n Match a newline
[^\S\r\n]+ Match 1+ whitespace chars except newlines
value: Match literally
[^\S\r\n]+' Match 1+ whitespace chars except newlines, then '
( Capture group 1
  [^\s@']+@[^\s@']+ Match an email-like format
)' Close group 1 and match '

Regex demo

const regex3 = /'Email',\r?\n[^\S\r\n]+value:[^\S\r\n]+'([^\s@']+@[^\s@']+)'/g;
var content3 = ` },
MessageEmbedField {
  embed: [Circular *2],
  name: 'Email',
  value: 'user@gmail.com',
  inline: true
},
MessageE
`;
let m3;

while ((m3 = regex3.exec(content3)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m3.index === regex3.lastIndex) {
        regex3.lastIndex++;
    }
    console.log(m3[1]);
}
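One consequence of the email-like capture group is that Email fields whose value does not look like an address are skipped. The sketch below illustrates this with a made-up input containing two Email fields; `strictRegex`, `text`, and `emails` are hypothetical names.

```javascript
const strictRegex = /'Email',\r?\n[^\S\r\n]+value:[^\S\r\n]+'([^\s@']+@[^\s@']+)'/g;

// Hypothetical input with two Email fields; only the second holds an address.
const text = [
  "  name: 'Email',",
  "  value: 'not provided',",
  "  name: 'Email',",
  "  value: 'user@gmail.com',",
].join('\n');

// Collect capture group 1 from every match.
const emails = Array.from(text.matchAll(strictRegex), (m) => m[1]);
console.log(emails);
```

Whether skipping such values is desirable depends on the data; a plain [^']* capture would keep them.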

Answer 3

Maybe try your expression in s (single-line) mode:

/Email'\s*,\s*value:\s*'([^'\r\n]*)'/gs

Test

const regex = /Email'\s*,\s*value:\s*'([^'\r\n]*)'/gs;
const str = ` },
MessageEmbedField {
  embed: [Circular *2],
  name: 'Email',
  value: 'user@gmail.com',
  inline: true
},
MessageE
`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. If you'd like, you can also watch at this link how it matches against some sample inputs.
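One detail that may be worth checking when exploring this expression: in JavaScript the s flag only changes what the dot matches, and this pattern contains no dot, so it is the \s* parts that span the line break. The sketch below, with illustrative variable names, compares the pattern with and without the flag.

```javascript
const withDotAll = /Email'\s*,\s*value:\s*'([^'\r\n]*)'/gs;
const withoutDotAll = /Email'\s*,\s*value:\s*'([^'\r\n]*)'/g;

// Shortened sample modelled on the question's data.
const sample = "name: 'Email',\n  value: 'user@gmail.com',\n  inline: true";

const a = Array.from(sample.matchAll(withDotAll), (m) => m[1]);
const b = Array.from(sample.matchAll(withoutDotAll), (m) => m[1]);
console.log(a, b);

// The s flag only affects ".":
console.log(/a.b/.test('a\nb'));  // false
console.log(/a.b/s.test('a\nb')); // true
```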

RegEx Circuit

jex.im visualizes regular expressions.

Answer 4

You can try something like:

var test = ` },
MessageEmbedField {
  embed: [Circular *2],
  name: 'Email',
  value: 'user@gmail.com',
  inline: true
},
Message
`;
// Case-insensitive match of an email-like value in the Email field
var myregexp = /name: 'Email',\s+value: '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)',/img;
var match = myregexp.exec(test);
console.log(match[1]);

The regex above matches only values that look like valid email addresses. If you want to match any value (as the original did), use:

var myregexp = /name: 'Email',\s+value: '([^']*)',/img;
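To compare the two variants side by side, here is a minimal sketch on a made-up input in which one Email field holds an address and another does not. The names `strict`, `lenient`, `input`, and `capture` are illustrative, and the strict pattern writes its final class as [A-Z]{2,}.

```javascript
const strict = /name: 'Email',\s+value: '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)',/gi;
const lenient = /name: 'Email',\s+value: '([^']*)',/gi;

// Hypothetical input: the second value is not an email address.
const input = [
  "name: 'Email',",
  "  value: 'user@gmail.com',",
  "name: 'Email',",
  "  value: 'none given',",
].join('\n');

// Collect capture group 1 for every match of the given pattern.
const capture = (re) => Array.from(input.matchAll(re), (m) => m[1]);

console.log(capture(strict));  // only the address-like value
console.log(capture(lenient)); // both values
```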

Regex Demo & Explanation

solution1
1 2019-11-15 22:19:32

solution2
1 2019-11-16 11:38:58

solution3
0 2019-11-15 21:39:53

solution4
0 2019-11-15 21:42:14
