
Extract names and email using regex in Javascript

I have a string in the format shown below, together with the expected results:

input = "[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]"

output

Fname = "Peter"
SecondAndRemainingNames = "Jane Minesotta"
email = "pet.j.minn@mnu.al.edu"

input = "[Peter  <pet.j.minn@mnu.al.edu>]"

output

Fname = "Peter"
SecondAndRemainingNames = ""
email = "pet.j.minn@mnu.al.edu"

I need to extract these values using regex.

I have tried with

input.match(/\w/gim)

You can use

    const rx = /\[(\S+)(?:\s+(.*?))?\s+<([^<>]+)>]/;
    const strings = [
      '[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]',
      '[Peter <pet.j.minn@mnu.al.edu>]'
    ];
    for (const s of strings) {
      const [_, Fname, SecondAndRemainingNames, email] = s.match(rx);
      console.log([Fname, SecondAndRemainingNames, email]);
    }

See the regex demo.

Details

  • \[ - a [ char
  • (\S+) - Group 1: one or more non-whitespace chars (to stay within [...], you may use [^\s[\]]+ instead)
  • (?:\s+(.*?))? - an optional sequence of one or more whitespace chars followed by Group 2, which captures zero or more chars other than line break chars, as few as possible (replace .*? with [^[\]]*? if you want to stay within [...])
  • \s+ - one or more whitespace chars
  • <([^<>]+)> - a < char, then Group 3: one or more chars other than < and >, then a > char
  • ] - a ] char.
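
The bracket-safe alternatives mentioned in the details can be combined into a single pattern. The following is a minimal sketch, not part of the original answer: `strictRx` and `parseEntry` are hypothetical names, and it assumes that a missing second group should become an empty string, as in the question's expected output.

```javascript
// Stricter variant of the pattern: no group may cross a [ or ] character.
const strictRx = /\[([^\s[\]]+)(?:\s+([^[\]]*?))?\s+<([^<>]+)>]/;

// parseEntry is a hypothetical helper name used only for illustration.
function parseEntry(s) {
  const m = s.match(strictRx);
  if (!m) return null; // the input does not have the expected shape
  // Group 2 is undefined when there are no further names, so default to ''.
  const [, Fname, SecondAndRemainingNames = '', email] = m;
  return { Fname, SecondAndRemainingNames, email };
}

console.log(parseEntry('[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]'));
console.log(parseEntry('[Peter  <pet.j.minn@mnu.al.edu>]'));
```

Returning null for a non-matching string avoids the TypeError that destructuring the result of a failed match would otherwise throw.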

You can use 3 different regexes in order to simplify the problem. Also, you can rely on the structure of the string:

    const input1 = "[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]";
    const input2 = "[Peter <pet.j.minn@mnu.al.edu>]";

    // First name: the word that directly follows the opening "[".
    function getFName(input) {
      const name = input.match(/(?<=\[)\w+/);
      return name ? name[0] : '';
    }

    // Other names: words with whitespace on both sides.
    function getSNames(input) {
      const names = input.match(/(?<!\[)(?<=\s)\w+(?=\s)/g);
      return names ? names.join(' ') : '';
    }

    // Email: everything between "<" and ">]".
    function getEmail(input) {
      const mail = input.match(/(?<=<)(?:\w|\.|@)+(?=>])/);
      return mail ? mail[0] : '';
    }

    const x = { name: getFName(input1), otherNames: getSNames(input1), mail: getEmail(input1) };
    console.log(x);
    const y = { name: getFName(input2), otherNames: getSNames(input2), mail: getEmail(input2) };
    console.log(y);

This should give you what you want:

^\[(\w+)\s(?:((?:\w+\s?)*)\s)?<(.+)>\]$
  1. The first group (\w+) captures the first word (it stops as soon as it finds a space), which in your case is the first name.

  2. The second group (?:((?:\w+\s?)*)\s)? captures everything between the space after the first name and the first occurrence of <, which you want to save in SecondAndRemainingNames. Note: the ? at the end of this group makes the pattern optional, which is what you want, as your 2nd example shows.

  3. Finally, the last group captures everything between < and >, which for you is the email.

I've tested this pattern with both of your sample inputs and it works as expected. :)
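
Since this answer gives only the pattern, here is a rough sketch of how it might be applied in JavaScript. The names `anchoredRx` and `extractParts` are hypothetical and not part of the original answer; the sketch also assumes that a skipped optional group should map to an empty string.

```javascript
// The pattern from this answer, written as a JavaScript regex literal.
const anchoredRx = /^\[(\w+)\s(?:((?:\w+\s?)*)\s)?<(.+)>\]$/;

// extractParts is a hypothetical wrapper name used only for illustration.
function extractParts(input) {
  const m = anchoredRx.exec(input);
  if (m === null) return null; // the whole string must match because of ^ and $
  return {
    Fname: m[1],
    // Group 2 is undefined when the optional part is skipped.
    SecondAndRemainingNames: m[2] || '',
    email: m[3],
  };
}

const samples = [
  '[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]',
  '[Peter <pet.j.minn@mnu.al.edu>]',
];
for (const s of samples) console.log(extractParts(s));
```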

This works fine:这很好用:

var all = input.match(/(^\[\w+)|(\w+ )+|<.+>/gi);
var Fname = ""
var SecondAndRemainingNames = ""
var email = ""
if (all.length == 3) {
    Fname = all[0];
    SecondAndRemainingNames = all[1];
    email = all[2];
} else if (all.length == 2) {
    Fname = all[0];
    email = all[1];
}
Fname = Fname.substring(1);
if (SecondAndRemainingNames != "") {
    SecondAndRemainingNames = SecondAndRemainingNames.trim();
}
email = email.substring(1).slice(0, -1);
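
Note that `match` returns null when nothing matches, so `all.length` above would throw for unexpected input. A minimal sketch of the same approach wrapped in a function with a guard follows; `splitEntry` is a hypothetical name, and returning null for unexpected input is an assumption, not something the original answer specifies.

```javascript
// splitEntry is a hypothetical name; the regex is the one used above.
function splitEntry(input) {
  const all = input.match(/(^\[\w+)|(\w+ )+|<.+>/gi);
  // Expect either [first, rest, email] or [first, email].
  if (all === null || all.length < 2 || all.length > 3) return null;
  const first = all[0];
  const email = all[all.length - 1];
  const rest = all.length === 3 ? all[1] : '';
  return {
    Fname: first.substring(1),            // drop the leading "["
    SecondAndRemainingNames: rest.trim(), // drop the trailing space
    email: email.slice(1, -1),            // drop "<" and ">"
  };
}

console.log(splitEntry('[Peter Jane Minesotta <pet.j.minn@mnu.al.edu>]'));
console.log(splitEntry('[Peter  <pet.j.minn@mnu.al.edu>]'));
```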
