
Parsing an email message body

I'm using the Gmail API to parse the body of my Gmail messages. It works except when the body is HTML. Does anyone know how I can extract just the text of the email? If not, how can I skip emails that contain HTML?

Eventually I want to use this on personal/professional emails, which likely won't contain any HTML.

import base64
import email

def message_converter(message_id):
    # `service` is the authorized Gmail API client, built elsewhere.
    message = service.users().messages().get(userId='me', id=message_id, format='raw').execute()
    # The raw message arrives URL-safe base64 encoded; decode it to a string.
    msg_str = str(base64.urlsafe_b64decode(message['raw'].encode('ASCII')), 'UTF-8')
    mime_msg = email.message_from_string(msg_str)
    if mime_msg.is_multipart():
        for payload in mime_msg.get_payload():
            # if payload.is_multipart(): ...
            print(payload.get_payload())
    else:
        print(mime_msg.get_payload())

html2text does a pretty good job: it converts HTML into plain ASCII text.

You may still need to do some additional parsing/formatting afterwards, however.
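To tie this back to the question's code, here is a minimal sketch of preferring the plain-text part and only falling back to stripped HTML. It uses only the standard library so it runs as-is; `html_to_text` is a crude stand-in written for this example (not html2text itself), and `extract_text` is a hypothetical helper name. With html2text installed, `html2text.html2text(html)` could take the place of `html_to_text`.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("p", "br", "div", "li", "tr"):
            # Keep block-level elements from running into each other.
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse all whitespace runs into single spaces.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    """Crude HTML-to-text conversion (stand-in for html2text)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def extract_text(raw_bytes):
    """Return the text/plain body if present, else the HTML body as text."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    content = body.get_content()
    if body.get_content_type() == "text/html":
        return html_to_text(content)
    return content
```

In the question's function, `raw_bytes` would be the result of `base64.urlsafe_b64decode(message['raw'])`. To skip HTML-only emails instead of converting them, pass `preferencelist=("plain",)` and treat a `None` result as "ignore this message".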

I don't know if this will help, but the Gmail API has the same structure across languages. In C#, the message text can be in one of three places, depending on the mail server:

msg.Payload.Parts[1].Body.Data;  // here you can find the message text without HTML tags

msg.Payload.Parts[0].Body.Data; // here you can find the message text with HTML tags

msg.Payload.Body.Data; // and here you don't have a choice: the text comes with the HTML tags
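Because the position of each part varies from message to message, it may be safer to select a part by its MIME type than by its index. Below is a rough Python sketch of that idea for a message fetched with `format='full'`. The helper names `find_part_data` and `decode_body` are made up for this example, and decoding as UTF-8 is an assumption; a real message may declare a different charset in its headers.

```python
import base64


def find_part_data(payload, mime_type="text/plain"):
    """Depth-first search of a Gmail API payload dict.

    Returns the base64url body data of the first part whose mimeType
    matches, or None if no such part exists.
    """
    if payload.get("mimeType") == mime_type:
        data = payload.get("body", {}).get("data")
        if data:
            return data
    for part in payload.get("parts", []):
        found = find_part_data(part, mime_type)
        if found:
            return found
    return None


def decode_body(data):
    """Decode URL-safe base64 body data (assumes UTF-8 text)."""
    # Restore any '=' padding that may be missing before decoding.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
```

With the question's client, this would be used roughly as `data = find_part_data(message['payload'])`, followed by `decode_body(data)` when `data` is not `None`. A `None` result means the message has no plain-text part, which is one way to detect HTML-only emails.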

This answer may help with what you are trying to do. I understand that you want to extract certain pieces of text from the body of the email. You can use regular expressions for that. I made a video explaining how to get data out of a Gmail message body, though it uses Google Apps Script (JavaScript):

https://youtu.be/nI1OH3pAz6s?t=8

You can download the code from this GitHub link:

https://gist.github.com/MoayadAbuRmilah/5835369fdebbecf980029f7339e4d769
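The same regular-expression idea carries over to Python, the language used in the question. The sketch below is not taken from the video or the gist; the sample body, the field labels and the `extract_fields` helper are all invented for illustration, so the patterns would need adapting to the real emails.

```python
import re

# Hypothetical body text; the field labels are made up for this example.
body = """Hello,
Order number: 48213
Total: $19.99
Thanks!"""

# MULTILINE lets ^ and $ match at the start and end of each line.
ORDER_RE = re.compile(r"^Order number:\s*(\d+)\s*$", re.MULTILINE)
TOTAL_RE = re.compile(r"^Total:\s*\$(\d+\.\d{2})\s*$", re.MULTILINE)


def extract_fields(text):
    """Return the captured fields, or None for any field not found."""
    order = ORDER_RE.search(text)
    total = TOTAL_RE.search(text)
    return {
        "order": order.group(1) if order else None,
        "total": total.group(1) if total else None,
    }
```

This works best on the plain-text body; running regular expressions directly over HTML markup tends to be fragile.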
