What is the best way to get text from an .eml file?

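I am trying to read the subject and message body from several .eml files on my local drive. I am currently using Apache Commons Email, but sometimes it gets stuck in a loop without reporting any error. Here is my code, which should get the text from an .eml file and save it to a .txt file: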

            MimeMessage mimeMessage = MimeMessageUtils.createMimeMessage(null, file);
            // Parse once and reuse the result; each parse() call walks the whole message again
            MimeMessageParser parser = new MimeMessageParser(mimeMessage).parse();

            // Prefer the plain-text body; otherwise strip the markup from the HTML body
            String text = null;
            if (parser.hasPlainContent()) {
                text = parser.getPlainContent();
            } else if (parser.hasHtmlContent()) {
                text = Jsoup.parse(parser.getHtmlContent()).text();
            }

            if (text != null) {
                try (FileWriter writer = new FileWriter(txtName)) {
                    writeHeaders(writer, parser);
                    writer.write(text);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

Here is the writeHeaders method as well:

    private void writeHeaders(FileWriter writer, MimeMessageParser parser) throws Exception {
        writer.write("From:" + parser.getFrom() + "\n");
        writer.write("To:" + parser.getTo() + "\n");
        writer.write("Subject:" + parser.getSubject() + "\n");
        writer.write("Message:" + "\n" + "\n");
    }
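
One thing to watch in a helper like this: FileWriter uses the platform default charset on Java versions before 18, so non-ASCII subjects or bodies may be written incorrectly. Below is a minimal sketch of the same header block written through any Writer, with the file opened explicitly as UTF-8. The class name HeaderWriter and both method signatures are hypothetical, and the sketch takes plain strings rather than a MimeMessageParser so that it depends only on the JDK.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class HeaderWriter {

    // Writes the same four header lines as the helper above, but to any Writer.
    // (Hypothetical signature: the values are passed in as plain strings.)
    static void writeHeaders(Writer writer, String from, List<String> to, String subject)
            throws IOException {
        writer.write("From: " + from + "\n");
        // Join the recipients instead of relying on List.toString()
        writer.write("To: " + String.join(", ", to) + "\n");
        writer.write("Subject: " + subject + "\n");
        writer.write("Message:\n\n");
    }

    // Opens the target file as UTF-8 regardless of the platform default charset.
    static void writeToFile(Path target, String from, List<String> to,
                            String subject, String body) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writeHeaders(writer, from, to, subject);
            writer.write(body);
        }
    }
}
```

Because writeHeaders accepts a Writer, it can be exercised with a StringWriter without touching the file system.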

And this is the code for getting the attachments:

            // The parser has already been parsed above, so it is reused instead of calling parse() again
            if (parser.hasAttachments()) {
                //Getting and saving attachments from eml
                List<DataSource> attachments = parser.getAttachmentList();
                for (DataSource attachment : attachments) {
                    if (attachment.getName() != null && !attachment.getName().isEmpty()) {
                        try (InputStream is = attachment.getInputStream()) {
                            File save = new File(saveDir + File.separator + attachment.getName());
                            // try-with-resources closes the output stream even if the copy fails
                            try (FileOutputStream fos = new FileOutputStream(save)) {
                                byte[] buf = new byte[4096];
                                int bytesRead;
                                while ((bytesRead = is.read(buf)) != -1) {
                                    fos.write(buf, 0, bytesRead);
                                }
                            }
                            // A nested .eml attachment is parsed recursively
                            if (save.getName().endsWith("eml")) {
                                parseEml(save, count);
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
            }
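
Attachment names come from whoever sent the message, so a name containing path separators could end up outside saveDir when it is concatenated into a file path. The following is a minimal sketch, using only the JDK, of one way to save a stream under just the last segment of its name. The class AttachmentSaver and its save method are hypothetical names, and Files.copy stands in for the manual read/write loop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AttachmentSaver {

    // Copies the stream into saveDir under the last path segment of rawName,
    // so a name such as "../../evil.eml" cannot escape the target directory.
    static Path save(InputStream in, Path saveDir, String rawName) throws IOException {
        // Treat both '/' and '\' as separators, whatever the local platform is
        String normalized = rawName.replace('\\', '/');
        String fileName = normalized.substring(normalized.lastIndexOf('/') + 1);
        if (fileName.isEmpty() || fileName.equals(".") || fileName.equals("..")) {
            throw new IOException("Unusable attachment name: " + rawName);
        }
        Files.createDirectories(saveDir);
        Path target = saveDir.resolve(fileName);
        // Files.copy does the buffered copy; the caller remains responsible for closing 'in'
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

In the loop above, this could be called as AttachmentSaver.save(is, Paths.get(saveDir), attachment.getName()) inside the existing try-with-resources, assuming saveDir is a String holding a directory path.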

So, is there perhaps an easier way to get the text and attachments?

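Yes, there is a much easier way. Simple Java Mail (GitHub) can read .eml files and makes their content very easy to access. If you run into something like the looping bug there as well (unlikely), I'd be happy to help you out (I actively maintain Simple Java Mail):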

Email email = EmailConverter.emlToEmail(emlFile);

email.getFromRecipient();
email.getSubject();
email.getPlainText();
email.getHTMLText();
email.getAttachments();
email.getEmbeddedImages();
email.getHeaders();
// etc. etc.

S/MIME-encrypted emails are also supported (provided you have the certificate needed to decrypt them).

