
Recommendations on parsing .eml files in C#

I have a directory of .eml files that contain email conversations. Is there a recommended approach in C# for parsing files of this type?

Added August 2017: Check out MimeKit: https://github.com/jstedfast/MimeKit. It supports .NET Standard, so it will run cross-platform.
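
As a rough illustration of the MimeKit route, the sketch below loads every .eml file in a directory and prints a short summary. The class and method names (`EmlDirectoryReader`, `LoadAll`, `PrintSummary`) are made up for this example, and it assumes the MimeKit NuGet package is referenced; treat it as a starting point rather than a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using MimeKit; // assumption: the "MimeKit" NuGet package is installed

public static class EmlDirectoryReader
{
    // Hypothetical helper: loads every *.eml file found directly in a directory.
    // Files that MimeKit cannot parse at all are reported and skipped.
    public static List<MimeMessage> LoadAll(string directory)
    {
        var messages = new List<MimeMessage>();
        foreach (string path in Directory.EnumerateFiles(directory, "*.eml"))
        {
            try
            {
                messages.Add(MimeMessage.Load(path));
            }
            catch (FormatException ex)
            {
                Console.Error.WriteLine($"Skipping {path}: {ex.Message}");
            }
        }
        return messages;
    }

    // Prints the sender, subject and plain-text body of each message.
    public static void PrintSummary(IEnumerable<MimeMessage> messages)
    {
        foreach (MimeMessage message in messages)
        {
            Console.WriteLine($"From:    {message.From}");
            Console.WriteLine($"Subject: {message.Subject}");
            Console.WriteLine(message.TextBody ?? "(no plain-text body)");
            Console.WriteLine();
        }
    }
}
```

Calling `EmlDirectoryReader.PrintSummary(EmlDirectoryReader.LoadAll(folder))` with the path of the folder would walk the whole directory. Skipping unparseable files instead of aborting is a design choice that suits a batch of conversations; a stricter tool might prefer to fail fast.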

Original answer: I posted a sample project to GitHub to illustrate this answer.

The CDO COM DLL is part of Windows/IIS and can be referenced in .NET. It provides accurate parsing and a nice object model. Use it in conjunction with a reference to ADODB.DLL.

public CDO.Message LoadEmlFromFile(String emlFileName)
{
    CDO.Message msg = new CDO.MessageClass();
    ADODB.Stream stream = new ADODB.StreamClass();

    stream.Open(Type.Missing, ADODB.ConnectModeEnum.adModeUnknown, ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, String.Empty, String.Empty);
    stream.LoadFromFile(emlFileName);
    stream.Flush();
    msg.DataSource.OpenObject(stream, "_Stream");
    msg.DataSource.Save();

    stream.Close();
    return msg;
}

Follow this link for a good solution:

The article boils down to 4 steps (the second step below is missing from the article but is needed):

  1. Add a reference to "Microsoft CDO for Windows 2000 Library", which can be found on the 'COM' tab in the Visual Studio 'Add reference' dialog. This will add 2 references, "ADODB" and "CDO", to your project.

  2. Disable embedding of interop types for the 2 references "ADODB" and "CDO". (References -> ADODB -> Properties -> set 'Embed Interop Types' to False, then repeat the same for CDO.)

  3. Add the following method to your code:

     protected CDO.Message ReadMessage(String emlFileName)
     {
         CDO.Message msg = new CDO.MessageClass();
         ADODB.Stream stream = new ADODB.StreamClass();

         stream.Open(Type.Missing, ADODB.ConnectModeEnum.adModeUnknown, ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, String.Empty, String.Empty);
         stream.LoadFromFile(emlFileName);
         stream.Flush();
         msg.DataSource.OpenObject(stream, "_Stream");
         msg.DataSource.Save();
         return msg;
     }
  4. Call this method with the full path of your .eml file. The CDO.Message object it returns will have all the parsed information you need, including To, From, Subject, and Body.

LumiSoft includes a MIME parser.

Sasa includes a MIME parser as well.

Getting a decent MIME parser is probably the way to go. You could try a free MIME parser (such as this one from CodeProject), but comments from the code's author like this one

I worked on this at about the same time that I worked on a wrapper class for MSG files. Big difference in difficulty. Where the EML wrapper class maybe took a day to read the spec and get right, the MSG wrapper class took a week.

made me curious about the code quality. I'm sure that you can hack together a MIME parser that parses 95% of email correctly in a few days or hours. I'm also sure that getting the remaining 5% right will take months. Consider handling S/MIME (encrypted and signed email), Unicode, malformed emails produced by misbehaving mail clients and servers, several encoding schemes, internationalization issues, making sure that intentionally malformed emails will not crash your app, etc.

If the emails you need to parse come from a single source, a quick and dirty parser may be enough. If you need to parse emails from the wild, a better solution may be needed.
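
To make the "quick and dirty" end of that spectrum concrete, here is a minimal sketch that reads only the header block of an .eml file using nothing but the base class library. The class name `QuickEmlHeaders` is invented for this example. It deliberately ignores most of what makes real mail hard: it does not decode RFC 2047 encoded words, does not look at MIME parts, and keeps only the last value of a repeated field.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class QuickEmlHeaders
{
    // Reads the header block (everything up to the first empty line) and
    // returns field name -> value. Folded continuation lines are joined.
    // A repeated field (for example "Received") keeps only its last value.
    public static Dictionary<string, string> Parse(TextReader reader)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string name = null;
        var value = new StringBuilder();
        string line;

        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            if ((line[0] == ' ' || line[0] == '\t') && name != null)
            {
                // Continuation of the previous field (header folding).
                value.Append(' ').Append(line.Trim());
                continue;
            }

            if (name != null)
                headers[name] = value.ToString();   // store the finished field

            int colon = line.IndexOf(':');
            if (colon <= 0)
            {
                name = null;                         // not a header line; ignore it
                continue;
            }

            name = line.Substring(0, colon).Trim();
            value.Clear().Append(line.Substring(colon + 1).Trim());
        }

        if (name != null)
            headers[name] = value.ToString();        // store the last field

        return headers;
    }
}
```

With a file on disk, `QuickEmlHeaders.Parse(File.OpenText(path))` returns the fields, so `headers["Subject"]` gives the raw subject line. Anything beyond that (bodies, attachments, character sets) is where a real MIME library earns its keep.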

I would recommend our Rebex Secure Mail component, but I'm sure that you will get decent results with components from other vendors as well.

Make sure that the parser of your choice works correctly on the infamous "Mime Torture Sample message" prepared by Mark Crispin (co-author of MIME and IMAP RFCs). The test message is displayed in the MIME Explorer sample and can be downloaded in the installation package.

The following code shows how to read and parse an EML file:

using Rebex.Mail;

MailMessage message = new MailMessage();
message.Load("file.eml");

What you probably need is an email/MIME parser. Parsing all the header fields is not very hard, but separating out the various MIME types, such as images, attachments, and the various text and HTML parts, can get very complex.
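
To show what "separating out" the parts looks like when a library does the heavy lifting, here is a sketch that saves the attachments of one .eml file using MimeKit (mentioned in an earlier answer, and not necessarily the tool this answer has in mind). The class and method names (`EmlParts`, `SaveAttachments`) and the fallback file names are invented for this example, and the MimeKit NuGet package is assumed to be referenced.

```csharp
using System;
using System.IO;
using MimeKit; // assumption: the "MimeKit" NuGet package is installed

public static class EmlParts
{
    // Hypothetical helper: saves every attachment of an .eml file into
    // outputDirectory and returns the number of files written.
    public static int SaveAttachments(string emlPath, string outputDirectory)
    {
        MimeMessage message = MimeMessage.Load(emlPath);
        Directory.CreateDirectory(outputDirectory);
        int saved = 0;

        foreach (MimeEntity attachment in message.Attachments)
        {
            if (attachment is MimePart part)
            {
                // Strip any directory component from the sender-supplied name.
                string name = Path.GetFileName(part.FileName ?? $"attachment-{saved}.bin");
                using (FileStream stream = File.Create(Path.Combine(outputDirectory, name)))
                    part.Content.DecodeTo(stream);   // undoes base64 / quoted-printable
                saved++;
            }
            else if (attachment is MessagePart embedded)
            {
                // An attached email (message/rfc822) is written out as its own .eml file.
                embedded.Message.WriteTo(Path.Combine(outputDirectory, $"attached-{saved}.eml"));
                saved++;
            }
        }

        return saved;
    }
}
```

The text and HTML bodies are available separately through `message.TextBody` and `message.HtmlBody`, either of which may be null when the message has no part of that kind.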

We use a third-party tool, but there are many C# tools/libraries out there. Search Google for a free C# email MIME parser. For example, I found these:

http://www.codeproject.com/Articles/11882/Advanced-MIME-Parser-Creator-Editor
http://www.lumisoft.ee/lswww/download/downloads/Net/info.txt

I just started using the MIME part of Papercut for this. It seems decent and simple at first sight.

    public void ProcessRawContents(string raw)
    {
        // NB: empty lines may be relevant for interpretation and for content !!
        var lRawLines = raw.Split(new []{"\r\n"}, StringSplitOptions.None);
        var lMailReader = new MimeReader(lRawLines);
        var lMimeEntity = lMailReader.CreateMimeEntity();
        MailMessageEx Email = lMimeEntity.ToMailMessageEx();
        // ...
    }

(MailMessageEx is, of course, derived from MailMessage.)

Aspose.Email for .NET

Aspose.Email for .NET is a collection of components for working with emails from within your .NET applications. It makes it easy to work with a number of email message formats and message storage files (PST/OST), along with message sending and receiving capabilities.

Aspose.Email makes it easy to create, read and manipulate a number of message formats such as MSG, EML, EMLX and MHT files without the need to install Microsoft Outlook. You can not only change the message contents, but also manipulate (add, extract and remove) attachments from a message object. You can customize message headers by adding or removing recipients, or by changing the subject or other properties. It also gives you complete control over an email message by providing access to its MAPI properties.

C# Outlook MSG file reader without the need for Outlook

MSGReader is a C# .NET 4.0 library to read Outlook MSG and EML (MIME 1.0) files. Almost all common objects in Outlook are supported.

Try:

  • febootimail
  • SmtpExpress
  • LinkWS Newsletter Turbo
  • emlBridge - importing eml files into Outlook and virtually any other email client
  • Newsletter 2.1 Turbo
  • ThunderStor (emlResender)
  • Ruby (using eml2mbox). See jimbob method.
  • Evolution - create a new message and attach the eml file

Write a program:

Workarounds:

  • $ cat mail.eml | mail -s -c (but headers won't be parsed, and neither will attachments)
  • Drop them into your Gmail (Firefox will save them as attachments)
