
Python email: parsing a non-multipart message

I have a script that parses raw emails. It works fine for multipart emails, but how do I parse non-multipart emails?

import email

mail = email.message_from_string(raw_message)
if mail.is_multipart():
    data = extract(mail)
else:
    payload = mail.get_payload(decode=True)

Raw Email:

Return-Path: <>
X-Original-To: bounces@mydomain.com
Delivered-To: bounces@mydomain.com
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12])
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11
    for <bounces@mydomain.com>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT)
Received: from localhost by inmumg01.tcs.com;
  15 Mar 2016 14:09:38 +0530
Message-Id: <5aaa80$2543de@inmumg01.tcs.com>
Date: 15 Mar 2016 14:09:38 +0530
To: bounces@mydomain.com
From: "Mail Delivery System" <mail.notification@tcs.com>
Subject: Undeliverable Message

The following message to <vipul4.j@tcs.com> was undeliverable.
The reason for the problem:
5.1.0 - Unknown address error 550-'vipul4.j@tcs.com... No such user'

The IP address of the MTA to which the message could not be sent:
172.17.9.35

---------- A copy of the message begins below this line ----------
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200";
   d="scan'208,217";a="72486315"
X-Amp-Result: Clean
X-Amp-File-Uploaded: False
Received: from smtp.mydomain.com ([139.59.240.124])
  by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530
Received: from 128.199.202.14 (unknown [128.199.202.14])
    (Authenticated sender: mailsender)
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11
    for <vipul4.j@tcs.com>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ
     GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin
     bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf
     Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7
     8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc=
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT)
From: Kitemailer Newsletter <info@kitemailer.com>
To: vipul4.j@tcs.com
Message-ID: <15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>
Subject: KiteMailer | New Features this Week
MIME-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_Part_44_1398250960.1458031171306"
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>
Feedback-ID: 19:96:1520615:MyDomain

In the else branch I want to extract information from the message, but if I try payload['to'] it raises an error: TypeError: string indices must be integers, not str
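For context, the error arises because `get_payload()` on a non-multipart message returns the body text (a string, or bytes with `decode=True` on Python 3) rather than a mapping, while the headers stay on the `Message` object itself. A minimal sketch, assuming a shortened, made-up sample in place of the full bounce above:

```python
import email

# Shortened stand-in for the bounce above (hypothetical sample, not the full message).
sample = (
    "To: bounces@mydomain.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message to <vipul4.j@tcs.com> was undeliverable.\n"
)

mail = email.message_from_string(sample)

# Headers are looked up on the Message object; the lookup is case-insensitive.
to_header = mail["to"]
subject = mail["Subject"]

# For a non-multipart message the payload is just the body text.
is_multipart = mail.is_multipart()
body = mail.get_payload()

print(to_header)
print(subject)
print(type(body).__name__)
```

Note that these are the headers of the bounce itself, not of the original message copied into its body.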

OK, assuming there is no way to do this with the email library (I don't know of one), you can convert the message body into a dictionary and look up the elements you need.

This is your raw message:

raw_message='''Return-Path: <>
X-Original-To: bounces@mydomain.com
Delivered-To: bounces@mydomain.com
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12])
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11
    for <bounces@mydomain.com>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT)
Received: from localhost by inmumg01.tcs.com;
  15 Mar 2016 14:09:38 +0530
Message-Id: <5aaa80$2543de@inmumg01.tcs.com>
Date: 15 Mar 2016 14:09:38 +0530
To: bounces@mydomain.com
From: "Mail Delivery System" <mail.notification@tcs.com>
Subject: Undeliverable Message

The following message to <vipul4.j@tcs.com> was undeliverable.
The reason for the problem:
5.1.0 - Unknown address error 550-'vipul4.j@tcs.com... No such user'

The IP address of the MTA to which the message could not be sent:
172.17.9.35

---------- A copy of the message begins below this line ----------
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200";
   d="scan'208,217";a="72486315"
X-Amp-Result: Clean
X-Amp-File-Uploaded: False
Received: from smtp.mydomain.com ([139.59.240.124])
  by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530
Received: from 128.199.202.14 (unknown [128.199.202.14])
    (Authenticated sender: mailsender)
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11
    for <vipul4.j@tcs.com>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ
     GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin
     bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf
     Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7
     8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc=
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT)
From: Kitemailer Newsletter <info@kitemailer.com>
To: vipul4.j@tcs.com
Message-ID: <15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>
Subject: KiteMailer | New Features this Week
MIME-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_Part_44_1398250960.1458031171306"
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>
Feedback-ID: 19:96:1520615:MyDomain'''

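I'm using your code to get the payload:

# in case it is not multipart
import email

mail = email.message_from_string(raw_message)
# get_payload() returns a str here; decode=True would return bytes on Python 3 and break split("\n")
payload = mail.get_payload()

# keep only lines that look like "Name: value", where the name contains no spaces
mail_dico = {
    elt.split(":", 1)[0].strip(): elt.split(":", 1)[1].strip()
    for elt in payload.split("\n")
    if ":" in elt and " " not in elt.split(":", 1)[0].strip()
}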

Here is the resulting dictionary:

{'Content-Type': 'multipart/mixed;',
 'DKIM-Signature': 'v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;',
 'Date': 'Tue, 15 Mar 2016 04:39:31 -0400 (EDT)',
 'Feedback-ID': '19:96:1520615:MyDomain',
 'From': 'Kitemailer Newsletter <info@kitemailer.com>',
 'List-Unsubscribe': '<http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>',
 'MIME-Version': '1.0',
 'Message-ID': '<15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>',
 'Received': 'from 128.199.202.14 (unknown [128.199.202.14])',
 'Subject': 'KiteMailer | New Features this Week',
 'To': 'vipul4.j@tcs.com',
 'X-Amp-File-Uploaded': 'False',
 'X-Amp-Result': 'Clean',
 'X-IPAS-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE',
 'X-IronPort-AV': 'E=Sophos;i="5.24,338,1454956200";',
 'X-IronPort-Anti-Spam-Filtered': 'true',
 'X-IronPort-Anti-Spam-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE',
 'h=Date': 'From:To:Subject:List-Unsubscribe;'}

Now you can access the elements:

print(mail_dico["To"])
>> vipul4.j@tcs.com

print(mail_dico["Subject"])
>> KiteMailer | New Features this Week

This is probably not the best way to do this, but I hope it helps.
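One possible alternative, offered as a sketch rather than a definitive solution: the dictionary above keeps only the last value of repeated headers such as Received and drops folded continuation lines, whereas the `email` parser can be run a second time on the copy embedded in the bounce body. The example assumes the bounce always contains the separator line shown in the raw message (other mail servers may word it differently). The `embedded_headers` helper and the shortened `sample` message are hypothetical names introduced for illustration.

```python
import email

# Separator line taken from the bounce above; other MTAs may use different wording.
SEPARATOR = "---------- A copy of the message begins below this line ----------"


def embedded_headers(raw_message):
    """Return the message copied into a bounce body as a Message, or None."""
    bounce = email.message_from_string(raw_message)
    if bounce.is_multipart():
        return None  # multipart bounces need different handling
    body = bounce.get_payload()
    _, found, copy = body.partition(SEPARATOR)
    if not found:
        return None
    # Re-parse the copied text so its headers become normal Message headers.
    return email.message_from_string(copy.lstrip("\r\n"))


# Shortened, made-up version of the bounce above.
sample = (
    "To: bounces@mydomain.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message to <vipul4.j@tcs.com> was undeliverable.\n"
    "\n"
    + SEPARATOR + "\n"
    "Received: from smtp.mydomain.com\n"
    "  by inmumg01.tcs.com; 15 Mar 2016 14:09:37 +0530\n"
    "To: vipul4.j@tcs.com\n"
    "Subject: KiteMailer | New Features this Week\n"
    "Feedback-ID: 19:96:1520615:MyDomain\n"
)

original = embedded_headers(sample)
print(original["To"])
print(original["Subject"])
print(original.get_all("Received"))
```

With this approach `original["To"]` refers to the recipient of the original message, while the `To` header of the bounce itself is still available from the outer `Message` object. Folded headers keep their continuation lines, and `get_all()` returns every occurrence of a repeated header.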

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
