
Processing Raw Email Data Using Java

I have a DB that stores raw email contents. My requirement is to fetch individual mails from the DB and process that data to get the basic details of each email (such as FROM, TO, SUBJECT, etc.) and also to save all the attachments to the file system using Core Java. Currently I am able to fetch the raw email data from the DB as a String, but I am not able to process that data.

How can I process this raw email data (String data type) using Java?

Edit: At the DB level the data is stored as NCLOB. After it is fetched from the DB, it is held in a Java String.

A sample of the email data:

Return-Path: <support.bpm@mydomain>
Delivered-To: faxhealthuat@mydomain.com
Received: from naplmailer2.com (unknown [172.25.3.5])
    by mail3.mydomain.com (Postfix) with ESMTP id 46E6572049B
    for <faxhealthuat@mydomain.com>; Tue, 23 Feb 2016 15:16:43 +0530 (IST)
DKIM-Signature: v=1; a=rsa-sha256; d=mydomain; s=sms2; c=relaxed/simple;
    q=dns/txt; i=@mydomain; t=1456220806; x=1458812806;
    h=From:Sender:Reply-To:Subject:Date:Message-ID:To:Cc:MIME-Version:Content-Type:
    Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date:Resent-From:
    Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References:List-Id:
    List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
    bh=K7Tc1XHEFN5ey8WU6/HXHF9XYDMLCiIsVdU7DloptqI=;
    b=CEnhtyGSQi+08wghYzKjW61JpO/IqOCgjopdCaesEfRgdeu86BWTQ9ZV0G7mCkDz
    XChXBhzNsj+uST6yiu7ivYsCBqKvBAnyaoUvLSUw5rWAuCNlg1gdP1ilEzFnZZBB
    6U25CK64N81I5cKCdltgmUe5B97XueIV8M8LjhyemxM=;
X-AuditID: 7370fb5c-f79a16d000001484-b0-56cc2a86383c
Received: from CHNMURROOTCAS2.murugappa.com ( [172.25.1.14])
    by naplmailer2.com (Symantec Messaging Gateway) with SMTP id 8B.42.05252.68A2CC65; Tue, 23 Feb 2016 15:16:46 +0530 (IST)
Received: from CHNMURROOTMBX2.murugappa.com ([fe80::a141:6b81:60c9:125c]) by
 CHNMURROOTCAS2.murugappa.com ([fe80::fc6b:b33c:6d4f:fadd%12]) with mapi id
 14.03.0210.002; Tue, 23 Feb 2016 15:16:40 +0530
From: Support-BPM-CholaMS <support.bpm@mydomain>
To: "faxhealthuat@mydomain.com" <faxhealthuat@mydomain.com>
Subject: Test From Mail
Thread-Topic: Test From Mail
Thread-Index: AdFuHx8uv6VR8hDtQvKILSCahVrrMg==
Date: Tue, 23 Feb 2016 09:46:39 +0000
Message-ID: <B8C5C607CDD50E4D84DACA129D4CFD64C7299C49@CHNMURROOTMBX2.murugappa.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.111.10.60]
Content-Type: multipart/alternative;
    boundary="_000_B8C5C607CDD50E4D84DACA129D4CFD64C7299C49CHNMURROOTMBX2m_"
MIME-Version: 1.0
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFprMKsWRmVeSWpSXmKPExsWyRpKRT7dN60yYwe2HihYvDps7MHqs73jD
    GsAY1cBok5iXl1+SWJKqkJJanGyr5JJZnJyTmJmbWqSQll+k4JyRn5Oo4BuspJCZYqtkqqRQ
    kJOYnJqbmldiq5RYUJCal6Jkx6WAAWyAyjLzFFLzkvNTMvPSbZU8g/11LSxMLXUNlexcPIOd
    fRw9fV2DFPz8E7ayZjx+spe54LdqxeLPS9kbGBcodzFyckgImEicOvSNFcIWk7hwbz1bFyMX
    h5DAdkaJdcd3QjmnGSU+z17PCFLFJmArseJgM5gtIuAocezPNxYQW1hAXGLdxFesEHEZieWH
    l0DZehLnzl5lA7FZBFQljhzoZQaxeQWCJW7seAZWwwi0+fupNUwgNjPQnFtP5jNBXCQgsWTP
    eWYIW1Ti5eN/UJcqSLR+PwUU5wCqz5fY8cEYYqSgxMmZT1gmMArNQjJpFkLVLCRVECU6Egt2
    f2KDsLUlli18zQxjnznwmAlZfAEj+ypG/rzEgpzcxMyc1CIjveT83E2MwJgvLvgds4Px00+n
    Q4wCHIxKPLzLG06HCbEmlhVX5h5ilOBgVhLhdeA7EybEm5JYWZValB9fVJqTWnyI0QcYIhOZ
    pUST84HpKK8k3tDI3MzQzMTY0NDc2BKHsJI4b6v84TAhgXRgaspOTS1ILYIZx8TBKdXAWDgr
    40nv+6kRyxcq/0qx//f+zokw3qrXR/M3XLflqeaaHnpi6YXDN39mzZhiMLv6DceSuWerT1xS
    SrXbcnaX/LOcj/pu9XFreqSf3lJ9lfYpY/3x2BW/+wofCb7749Fzfv3j/emHsy6/eO+X4LGs
    /4fGYpbrB0733TjNmyKzQWnjBP93PfbzFnEqsRRnJBpqMRcVJwIArc+Y8CYDAAA=

--_000_B8C5C607CDD50E4D84DACA129D4CFD64C7299C49CHNMURROOTMBX2m_
Content-Type: text/plain; charset="us-ascii"
content-transfer-encoding: quoted-printable

Testing for from mail fetch

--_000_B8C5C607CDD50E4D84DACA129D4CFD64C7299C49CHNMURROOTMBX2m_
Content-Type: text/html; charset="us-ascii"
content-transfer-encoding: quoted-printable
--_000_B8C5C607CDD50E4D84DACA129D4CFD64C7299C49CHNMURROOTMBX2m_--

Assuming the string you are fetching still contains its newline delimiters, you can split it into lines and read each header as a name/value pair:

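import java.util.HashMap;
import java.util.Map;

String rawEmail = "YOUR EMAIL CONTENTS";
String[] lines = rawEmail.split("\\r?\\n");
Map<String, String> attributes = new HashMap<>();
for (String line : lines)
{
    if (line.isEmpty())
    {
        break; // the first empty line ends the header section
    }
    // Split at the first colon only: values such as dates contain colons too.
    // Indented continuation lines are skipped here.
    int colon = line.indexOf(':');
    if (colon > 0 && !Character.isWhitespace(line.charAt(0)))
    {
        String value = line.substring(colon + 1).trim();
        attributes.put(line.substring(0, colon).trim(), value.isEmpty() ? null : value);
    }
}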

Further processing for nested attributes (for example, the boundary parameter inside Content-Type) would be done the same way.
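To illustrate what processing a nested attribute might look like, here is a minimal sketch that pulls the parameters out of a structured header value such as Content-Type. The class and method names (HeaderParams, parse) are made up for this example, and it is deliberately simplistic: it assumes no ';' appears inside a quoted string and does not cover the full MIME quoting rules.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only, not part of any library.
final class HeaderParams {

    // Parses a value such as: multipart/alternative; boundary="abc"
    // The main value is stored under the empty key "", parameters under
    // their lower-cased names.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] pieces = headerValue.split(";");
        result.put("", pieces[0].trim());
        for (int i = 1; i < pieces.length; i++) {
            int eq = pieces[i].indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair, ignore it
            }
            String name = pieces[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = pieces[i].substring(eq + 1).trim();
            // Strip one pair of surrounding double quotes, if present
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            result.put(name, value);
        }
        return result;
    }
}
```

With the sample email, HeaderParams.parse(contentTypeValue).get("boundary") would give the delimiter string that separates the message parts.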

If you want to parse an email message, you need to know the format of an email message. It was originally defined in RFC 822, which was obsoleted by RFC 2822, which in turn was obsoleted by RFC 5322. Read those documents first, then decide which parts of the format you need to handle.

At the highest level, a message is composed of lines. Those lines should be terminated with \r\n (CRLF), but you should not rely on that, since you are getting your messages from a DB without knowing whether any pre-processing has occurred. First comes a header section (containing header lines), optionally followed by a body that is separated from the header by an empty line.
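As a rough sketch of that top-level structure, the helper below separates the header section from the body at the first empty line while tolerating either \r\n or bare \n line endings. MessageSplitter and splitHeaderAndBody are hypothetical names chosen for this example.

```java
// Hypothetical helper for illustration only.
final class MessageSplitter {

    // Returns a two-element array: [headerBlock, body].
    // The body is "" when the message has no body.
    static String[] splitHeaderAndBody(String raw) {
        // Normalise CRLF and lone CR to LF so the rest sees one convention
        String text = raw.replace("\r\n", "\n").replace('\r', '\n');
        if (text.startsWith("\n")) {
            // The message starts with an empty line: there is no header at all
            return new String[] { "", text.substring(1) };
        }
        int blank = text.indexOf("\n\n"); // the first empty line
        if (blank < 0) {
            return new String[] { text, "" };
        }
        return new String[] { text.substring(0, blank), text.substring(blank + 2) };
    }
}
```

Normalising the line endings up front is a design choice for simplicity; it means the returned body no longer has its original line terminators, which may matter if the exact bytes are needed later.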

Header lines are of the form HEADER_NAME:HEADER_VALUE, where the header name must not begin with a space. In the header section, any line beginning with a space or tab is a continuation line and must be concatenated to the value of the previous line.
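The header rules above could be sketched roughly as follows. HeaderUnfolder is a hypothetical name, and the sketch simplifies a few things: continuation lines are trimmed and joined with a single space, repeated headers (such as Received) are joined with a newline, and header names are kept exactly as written even though they are case-insensitive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class HeaderUnfolder {

    // Parses a header block into name -> value, joining continuation lines.
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        String currentName = null;
        for (String line : headerBlock.split("\\r?\\n")) {
            if (line.isEmpty()) {
                break; // an empty line ends the header section
            }
            char first = line.charAt(0);
            if (first == ' ' || first == '\t') {
                // Continuation line: append it to the previous header's value
                if (currentName != null) {
                    headers.merge(currentName, line.trim(),
                            (old, more) -> old.isEmpty() ? more : old + " " + more);
                }
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a valid header line, skip it
            }
            currentName = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            // A repeated header name keeps every occurrence, one per line
            headers.merge(currentName, value, (old, more) -> old + "\n" + more);
        }
        return headers;
    }
}
```

Splitting at the first colon only matters here because values such as the Date header contain colons of their own.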

For more details, refer to RFC 5322.

Well, after doing some research based on your answers & comments, I got what I needed. Thank you all for your efforts.

Just sharing the same here. The Java method below fetches the raw email data from the database, finds all the attachments contained in it, saves them to the file system, and finally returns either a success or a failure message.

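// Requires the JavaMail library (javax.mail; jakarta.mail in newer releases) on the classpath.
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.Part;
import javax.mail.Session;
import javax.mail.internet.MimeBodyPart;
import javax.mail.internet.MimeMessage;

public static String saveAttachments(String EMAIL_ID)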
{
    try
    {
        String saveDirectory = "C:\\Email\\Attachements\\";

        //Get email record from DB
        EMAIL newEMAILObj = EMAIL.getEmailDetailsForEmailId(EMAIL_ID);

        //Get email raw data into a String variable
        String emailRawData = newEMAILObj.getCONTENT();

        Session newSession = Session.getDefaultInstance(new Properties());
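        // Encode with an explicit charset rather than the platform default;
        // UTF-8 is an assumption here, use whatever encoding matches the stored data
        InputStream inputStreamObj = new ByteArrayInputStream(emailRawData.getBytes(java.nio.charset.StandardCharsets.UTF_8));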
        MimeMessage mimeMessageObj = new MimeMessage(newSession, inputStreamObj);
        String contentType = mimeMessageObj.getContentType();

        if (contentType.contains("multipart")) //Content may contain attachments
        {
            Multipart multiPart = (Multipart) mimeMessageObj.getContent();
            int numberOfParts = multiPart.getCount();
            for (int partCount = 0; partCount < numberOfParts; partCount++)
            {
                MimeBodyPart part = (MimeBodyPart) multiPart.getBodyPart(partCount);
                //This part is an attachment (getFileName() can return null, so unnamed parts are skipped)
                if (Part.ATTACHMENT.equalsIgnoreCase(part.getDisposition()) && part.getFileName() != null)
                {
                    File file = new File(saveDirectory+part.getFileName());
                    part.saveFile(file);
                }
            }
        }
    }
    catch (MessagingException ex) 
    {
        return "FAILED: "+ex.getLocalizedMessage();
    }
    catch (IOException ex)
    {
        return "FAILED: "+ex.getLocalizedMessage();
    } 
    return "SUCCESS";
}
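
One thing to watch in the method above is that the file name comes from the email itself, so it may be missing or may contain path segments such as ..\ that point outside the save directory. Below is a minimal sketch of a helper that guards against this using only java.nio.file. AttachmentPaths, resolveSafely and the fallbackName parameter are hypothetical names introduced for this example, not part of JavaMail.

```java
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class AttachmentPaths {

    // Builds a target path inside saveDirectory for an attachment name
    // supplied by the sender. fallbackName is used when no usable name exists.
    static Path resolveSafely(Path saveDirectory, String fileName, String fallbackName) {
        String name = (fileName == null || fileName.trim().isEmpty())
                ? fallbackName
                : fileName.trim();
        // Keep only the last path segment, whichever separator the sender used
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(slash + 1);
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            name = fallbackName;
        }
        Path base = saveDirectory.toAbsolutePath().normalize();
        Path target = base.resolve(name).normalize();
        // Final check: the result must still be inside the save directory
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Unsafe attachment name: " + fileName);
        }
        return target;
    }
}
```

In the loop above, the result could then be passed on with something like part.saveFile(AttachmentPaths.resolveSafely(Paths.get(saveDirectory), part.getFileName(), "attachment-" + partCount).toFile()).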


 