
Using regex in PowerShell to grab an email address

I have written a script that grabs different fields from an HTML file and populates variables with the results. I'm having trouble with the regular expression for grabbing the email address. Here is some sample code:

$txt='<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

$re='.*?'+'([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})'

if ($txt -match $re)
{
    $email1=$matches[1]
    write-host "$email1"
}

I get the following error:

Bad argument to operator '-match': parsing ".*?([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\
.)+[a-zA-Z]{2,7})([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})" - [x-y] range in reverse order..
At line:7 char:16
+ if ($txt -match <<<<  $re)
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : BadOperatorArgument

What am I missing here? Also, is there a better regex for email?

Thanks in advance.

Any regex that works in .NET or C# will also work in PowerShell, and you can find plenty of samples on Stack Overflow and elsewhere online. For example, see "How to Find or Validate an Email Address", section "The Official Standard: RFC 2822":

$txt = '<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'
# Single-quoted so PowerShell leaves $ and the backtick alone; the literal ' is doubled.
$re = '[a-z0-9!#$%&''*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&''*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?'
[regex]::Match($txt, $re, 'IgnoreCase')
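As for the error in the question itself: PowerShell strings do not treat the backslash as an escape character, so `\\w` reaches the regex engine as an escaped backslash followed by a literal `w`. Inside `[\\w-+]`, the `w-+` part is then read as a character range running from `w` down to `+`, which is backwards, hence "[x-y] range in reverse order". Below is a minimal sketch of one possible fix, assuming single backslashes and the hyphen moved to the end of each character class; the leading `.*?` is dropped because `-match` is not anchored anyway.

```powershell
$txt = '<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# Single backslashes: PowerShell passes them to the regex engine unchanged.
# The hyphen sits last in each character class, so it is a literal and not a range.
$re = '([\w+-]+(?:\.[\w+-]+)*@(?:[\w-]+\.)+[a-zA-Z]{2,7})'

if ($txt -match $re) {
    $email1 = $Matches[1]    # first capture group: the whole address
    Write-Host $email1
}
```

With the sample `$txt`, this should print First.Last@company-name.com.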

There is also a second part to this answer: regex is by nature not well suited to parsing XML/HTML. You can find more details in "Using regular expressions to parse HTML: why not?"

For a more robust solution, I recommend that you:

  1. convert the HTML to XHTML
  2. walk over the XML tree
  3. work with the individual nodes one by one, using regex on their text where needed.
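The steps above could be sketched roughly as follows. This is only an illustration: it assumes step 1 has already been done by some external tool, so the input is well-formed XHTML (note the quoted `class` attribute, which the snippet in the question lacks), and the email pattern is a deliberately simple one rather than the full RFC 2822 expression.

```powershell
# Assumption: the HTML has already been converted to well-formed XHTML.
[xml]$doc = '<p class="FillText"><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# A simple illustrative email pattern, not a complete RFC 2822 match.
$re = '[\w.+-]+@(?:[\w-]+\.)+[a-z]{2,}'

# Walk the matching nodes and apply the regex to each node's text only,
# never to the raw markup.
$emails = foreach ($node in $doc.SelectNodes('//p[@class="FillText"]')) {
    if ($node.InnerText -match $re) { $Matches[0] }
}

$emails
```

Selecting nodes with XPath first means the regex only ever sees the text of the element you care about, so attribute values and tag names cannot produce false matches.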

When it comes to email validation, I usually choose the short version of the RFC 2822 pattern:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

You can find more info about email validation here
