
Regex Pattern to Extract Email Data

I'm retrieving raw text (headers and message) from a POP server. I need to capture everything after the headers, which are terminated by a blank line separating them from the user's message.

At the same time, I want to ignore anything from the original message if the email is a reply. In the emails I'm parsing, the quoted original message starts with

------Original Message------

An example email might look like this

Return-Path: ...
...
More Email Metadata: ...

Hello from regex land, I'm glad to hear from you.
------Original Message------
Metadata: ...
...

Hey regex dude, can you help me? Thanks!

Sincerely, Me.

I need to extract "Hello from regex land, I'm glad to hear from you." and any other text/lines prior to the original message.

I'm using this regex right now (C#, in multiline mode) and it seems to work, except that it captures ------Original Message------ if the body is blank. I'd rather just get an empty string in that case.

^\s*$\n(.*)(\n------Original Message------)?

Edit
I haven't downvoted anyone, and if you do downvote, it's usually helpful to include a comment.

Why don't you use DotnetOpenMail? Using a regex for this is the wrong approach; you'd be better off using a dedicated email handler instead.

The reason for this is that you have an extra \n inside the parentheses. If the body is blank, there is no extra newline there. Therefore, try this:

^\s*$\r\n(.*)(^------Original Message------$)?

If you don't want the newline at the end of the body, you can still use string.Trim() on the matched part.

Note: This assumes that the input uses \r\n line terminators (which is required in e-mail headers according to the MIME standard).
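As a rough illustration of this approach in C#, here is a minimal sketch that captures the body up to the marker line and then trims it. It is a variation, not the exact pattern above. The class name ReplyBodyExtractor, the method ExtractBody and the pattern itself are assumptions made for this example. It adds RegexOptions.Singleline and a lazy quantifier so the body may span several lines. It also matches the line terminator as \r?\n, because in .NET the $ anchor only matches before \n, not before \r\n.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplyBodyExtractor
{
    // Assumed pattern (not from the original posts):
    //   ^[ \t]*\r?\n    the first blank line, which ends the headers
    //   (?<body>.*?)    the body, as short as possible (may be empty)
    //   (?:^-+...|\z)   stop at a line starting with the marker, or at end of input
    private static readonly Regex BodyPattern = new Regex(
        @"^[ \t]*\r?\n(?<body>.*?)(?:^------Original Message------|\z)",
        RegexOptions.Multiline | RegexOptions.Singleline);

    // Returns the trimmed body, or an empty string when there is no body.
    public static string ExtractBody(string rawMessage)
    {
        Match match = BodyPattern.Match(rawMessage);
        return match.Success ? match.Groups["body"].Value.Trim() : string.Empty;
    }
}
```

Calling ReplyBodyExtractor.ExtractBody on the example email from the question should return only the "Hello from regex land" line. A message whose body is blank returns an empty string rather than the marker.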

You need to replace (\n------Original Message------) with (?=(\n------Original Message------)). A lookahead does not return the part it matches; it only makes sure that it is there.
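To see what the lookahead changes, here is a small sketch using the question's pattern with the optional group swapped for the lookahead. The class and method names are hypothetical. The sample input is assumed to use bare \n line endings, since the pattern contains a literal \n after $.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadSketch
{
    // The question's pattern, with the trailing optional group replaced by a lookahead.
    private static readonly Regex Pattern = new Regex(
        @"^\s*$\n(.*)(?=(\n------Original Message------))",
        RegexOptions.Multiline);

    // Group 1 is the body line; no match at all is mapped to an empty string.
    public static string FirstBodyLine(string rawMessage)
    {
        Match match = Pattern.Match(rawMessage);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    // The lookahead only asserts the marker, so the overall match never contains it.
    public static bool MatchContainsMarker(string rawMessage)
    {
        return Pattern.Match(rawMessage).Value.Contains("Original Message");
    }
}
```

With a blank body the lookahead cannot succeed, so there is no match and the caller gets an empty string. Because . does not match \n without RegexOptions.Singleline, this pattern only matches when the body is a single line that is directly followed by the marker.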
