I'm trying to extract the "email" with this code
const regex3 = /Email',\r\n value: '([^']*)',/gm;
var content3 = fs.readFileSync('message.txt')
let m3;
while ((m3 = regex3.exec(content)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m3.index === regex3.lastIndex) {
        regex3.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m3.forEach((match, groupIndex) => {
        fs.appendFileSync('messagematch.txt', m3[1] + '\n');
    });
}
From this file
},
MessageEmbedField {
embed: [Circular *2],
name: 'Email',
value: 'user@gmail.com',
inline: true
},
MessageE
The regex works in Notepad, but not in my script. What am I missing?
I suggest changing your regex in a few ways to make it more robust and fault tolerant.
First, include the opening single quote before Email, to avoid accidentally matching other fields where someone may have put the word "Email" inside a value.
Second, use \r?\n to capture both Windows and Unix-style line endings. I suspect this may be a large part of your issue, but can't be sure.
Third, use \s+ instead of specifically including a number of spaces. This will help to avoid problems caused by minor formatting changes.
The final regex would look like this:
const regex = /'Email',\r?\n\s+value: '([^']*)',/gm
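To check the point about line endings, the final regex can be run against the same two lines written once with Unix and once with Windows line breaks. This is only a minimal sketch: the helper name extractEmails and both sample strings are made up for illustration and are not part of the original script.

```javascript
// Hypothetical helper: collect every captured email value from a text dump.
function extractEmails(text) {
  const regex = /'Email',\r?\n\s+value: '([^']*)',/g;
  // matchAll requires the g flag and yields one match array per hit.
  return Array.from(text.matchAll(regex), (m) => m[1]);
}

// The same two lines, once with Unix (\n) and once with Windows (\r\n) endings.
const unixSample = "    name: 'Email',\n    value: 'user@gmail.com',\n";
const windowsSample = "    name: 'Email',\r\n    value: 'user@gmail.com',\r\n";

console.log(extractEmails(unixSample));
console.log(extractEmails(windowsSample));
```

Both calls should return an array containing user@gmail.com. With the hardcoded \r\n from the question, only the Windows sample would match.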
what I'm missing?
You use \r\n to match a Windows style line break, but you can make the \r optional to also match a Unix style line break. See this page about line break characters.
You define var content3, but you use it like regex3.exec(content)
You could use \s+ instead of hardcoding the number of spaces, but \s can also match a newline.
If you want to match whitespace without matching a newline, you could use a negated character class [^\S\r\n], which matches any char except a non-whitespace char, a carriage return, or a newline.
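The difference between the two classes can be seen with a few test strings. This is a small sketch; the variable names anyWhitespace and horizontalOnly are chosen here for illustration.

```javascript
// \s matches any whitespace, including line breaks.
const anyWhitespace = /^\s+$/;
// [^\S\r\n] matches whitespace except carriage return and line feed.
const horizontalOnly = /^[^\S\r\n]+$/;

console.log(anyWhitespace.test(" \t "));    // true
console.log(anyWhitespace.test(" \n "));    // true: \s also matches the newline
console.log(horizontalOnly.test(" \t "));   // true
console.log(horizontalOnly.test(" \n "));   // false: the newline is excluded
console.log(horizontalOnly.test(" \r\n ")); // false
```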
'Email',\r?\n[^\S\r\n]+value:[^\S\r\n]+'([^\s@']+@[^\s@']+)'
'Email',              Match literally
\r?\n                 Match a newline
[^\S\r\n]+            Match 1+ whitespace chars except newlines
value:                Match literally
[^\S\r\n]+'           Match 1+ whitespace chars except newlines and '
(                     Capture group 1
[^\s@']+@[^\s@']+     Match an email like format
)'                    Close group 1 and match '
const regex3 = /'Email',\r?\n[^\S\r\n]+value:[^\S\r\n]+'([^\s@']+@[^\s@']+)'/g;
var content3 = `  },
  MessageEmbedField {
    embed: [Circular *2],
    name: 'Email',
    value: 'user@gmail.com',
    inline: true
  },
  MessageE
`;
let m3;
while ((m3 = regex3.exec(content3)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m3.index === regex3.lastIndex) {
        regex3.lastIndex++;
    }
    console.log(m3[1]);
}
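Maybe try your expression in s (single line) mode. Note that the s flag only changes what . matches, and this pattern contains no dot; it matches across lines because \s* also matches line breaks: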
/Email'\s*,\s*value:\s*'([^'\r\n]*)'/gs
const regex = /Email'\s*,\s*value:\s*'([^'\r\n]*)'/gs;
const str = `  },
  MessageEmbedField {
    embed: [Circular *2],
    name: 'Email',
    value: 'user@gmail.com',
    inline: true
  },
  MessageE
`;
let m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
If you wish to simplify, modify, or explore the expression, it is explained in the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.
jex.im visualizes regular expressions.
You can try something like:
var test = `  },
  MessageEmbedField {
    embed: [Circular *2],
    name: 'Email',
    value: 'user@gmail.com',
    inline: true
  },
  Message
`;
// Note the range A-Z in the last class: with [AZ] the ".com" part would never match.
var myregexp = /name: 'Email',\s+value: '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)',/img;
var match = myregexp.exec(test);
console.log(match[1]);
The regex above only matches values that look like valid email addresses. If you want to match anything between the quotes (as in the original), use:
var myregexp = /name: 'Email',\s+value: '([^']*)',/img;
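Putting the pieces together, the whole script might look like the sketch below. It is not the asker's code: the helper name extractEmailsFromFile and the temporary sample file are assumptions made so the example runs on its own. It also reads the file as a UTF-8 string and collects group 1 once per match, whereas the forEach loop in the question appends m3[1] once per array element, so each address would be written twice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a dump file and return every captured email value.
function extractEmailsFromFile(filePath) {
  // Passing 'utf8' makes readFileSync return a string instead of a Buffer.
  const content = fs.readFileSync(filePath, 'utf8');
  const regex = /name: 'Email',\s+value: '([^']*)',/g;
  return Array.from(content.matchAll(regex), (m) => m[1]);
}

// Demo input: a small sample with Windows line endings, written to a temp file.
const samplePath = path.join(os.tmpdir(), 'message-sample.txt');
const sampleLines = [
  "  MessageEmbedField {",
  "    name: 'Email',",
  "    value: 'user@gmail.com',",
  "    inline: true",
  "  },",
];
fs.writeFileSync(samplePath, sampleLines.join('\r\n'));

const emails = extractEmailsFromFile(samplePath);
console.log(emails);
```

To write the results out as in the question, a single fs.writeFileSync('messagematch.txt', emails.join('\n') + '\n') after the call would replace the per-match appendFileSync.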