# main.py
import email
from email.iterators import _structure
import sys
msg = email.message_from_string(sys.stdin.read())
_structure(msg)
./main.py <<EOF
From: Nathaniel Borenstein <[email protected]>
To: Ned Freed <[email protected]>
Subject: Formatted text mail
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=boundary42

--boundary42
Content-Type: text/plain; charset=us-ascii

...plain text version of message goes here....
--boundary42
Content-Type: text/richtext

.... richtext version of same message goes here ...
--boundary42
Content-Type: text/x-whatever

.... fanciest formatted version of same message goes here
...
--boundary42--
EOF
Output:
multipart/alternative
text/plain
text/richtext
text/x-whatever
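I can use the email module to get the structure of a multipart email like the one above. How do I extract the text/plain part of the email? (In this particular example, it should be "...plain text version of message goes here....".)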
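Call msg.get_payload() to get the message's payload (for a multipart message this is a list of sub-messages), then iterate over the parts until you find the text/plain part: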
# main.py
import email
import sys
msg = email.message_from_string(sys.stdin.read())
# For a multipart message, get_payload() returns a list of sub-messages.
for part in msg.get_payload():
    if part.get_content_type() == 'text/plain':
        print(part.get_payload())
Given your sample input, the code above produces the output:
...plain text version of message goes here....
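The loop above only looks at the top-level parts and prints the payload as stored. As a minimal sketch (not part of the original answer), the same idea can be extended with msg.walk(), which also visits nested multipart containers, and with get_payload(decode=True), which undoes base64 or quoted-printable transfer encodings. The SAMPLE string, the placeholder addresses in it, and the plain_text_parts helper are hypothetical names introduced only for illustration.

```python
import email

# Hypothetical sample message, modeled on the input in the question.
SAMPLE = """\
From: sender@example.com
To: recipient@example.com
Subject: Formatted text mail
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=boundary42

--boundary42
Content-Type: text/plain; charset=us-ascii

...plain text version of message goes here....
--boundary42
Content-Type: text/richtext

.... richtext version of same message goes here ...
--boundary42--
"""


def plain_text_parts(msg):
    """Return the decoded text of every text/plain part found in msg."""
    texts = []
    # walk() yields msg itself and then every subpart, depth first.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # decode=True reverses the Content-Transfer-Encoding and returns bytes.
        raw = part.get_payload(decode=True)
        # Assumption: fall back to us-ascii when no charset is declared.
        charset = part.get_content_charset() or "us-ascii"
        texts.append(raw.decode(charset, errors="replace"))
    return texts


msg = email.message_from_string(SAMPLE)
for text in plain_text_parts(msg):
    print(text)
```

Collecting the parts in a list rather than printing inside the loop makes it easy to pick only the first match, or to detect that the message has no text/plain part at all.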
You can also use email.iterators.typed_subpart_iterator, like this:
# main.py
import email
import email.iterators
import sys
msg = email.message_from_string(sys.stdin.read())
# Yield only the subparts whose MIME type is text/plain.
for part in email.iterators.typed_subpart_iterator(msg, maintype='text', subtype="plain"):
    print(part.get_payload())
This produces the same output as the previous example.
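On Python 3.6 or later, another option is the newer EmailMessage API. The following is a rough sketch, assuming the message is parsed with policy=email.policy.default so that get_body() and get_content() are available; the SAMPLE string and its placeholder addresses are hypothetical and only mirror the input from the question.

```python
import email
from email import policy

# Hypothetical sample message, modeled on the input in the question.
SAMPLE = """\
From: sender@example.com
To: recipient@example.com
Subject: Formatted text mail
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=boundary42

--boundary42
Content-Type: text/plain; charset=us-ascii

...plain text version of message goes here....
--boundary42
Content-Type: text/richtext

.... richtext version of same message goes here ...
--boundary42--
"""

# policy.default makes the parser return EmailMessage objects.
msg = email.message_from_string(SAMPLE, policy=policy.default)

# Ask only for a text/plain body; get_body() returns None if there is none.
body = msg.get_body(preferencelist=("plain",))
plain_text = body.get_content() if body is not None else None

if plain_text is not None:
    print(plain_text)
```

Here get_content() returns a str that has already been decoded according to the part's transfer encoding and charset, so no manual decoding step is needed.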