Formatting the subject of an email

I want to retrieve the value of an email's "Subject" header.

import imaplib
import os
import email

email_user = 'xxxxxxx@xxxxxxx'
email_pass = 'xxxxxxxx'

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login(email_user, email_pass)
M.select('INBOX')

typ, message_numbers = M.search(None, 'ALL')

num = b'2420'
typ, data = M.fetch(num, '(RFC822)')

raw_email = data[0][1].decode('utf-8')
email_message = email.message_from_string(raw_email)

print(email_message['Subject'])

The value I get is:

=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=
=?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?=

But I want it decoded, like this:

[NAS_LEBARS] Active Backup for Business - La tâche de sauvegarde DBS - SIDEXIS sur NAS_LEBARS est terminée

Thanks.

python python-3.x email imaplib
1 Answer

This does almost exactly what you need:

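  • Take each encoded line
  • Extract the part that needs decoding
  • Convert it as follows:
    • replace each '_' with a space
    • replace each '=XX' with the byte that has that hexadecimal code
    • leave all other characters as they are
  • Decode the whole result as a UTF-8 byte array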

    import re
    
    subject = [
        '=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=',
        '=?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?='
    ]
    
    
    def convert_content(content):
        iter_content = iter(content)
        try:
            while True:
                ch = next(iter_content)
                if ch == '_':
                    yield b' '
                elif ch == '=':
                    yield bytearray.fromhex(next(iter_content)+next(iter_content))
                else:
                    yield ch.encode('utf-8')
        except StopIteration:
            pass
    
    
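    def process(data):
        for line in data:
            # Capture the payload between '=?UTF-8?Q?' and the closing '?='
            m = re.match(r'=\?(?:utf|UTF)-8\?(?:q|Q)\?(.*)\?=', line)
            if m is None:
                # Not a UTF-8 Q-encoded word: pass the line through unchanged
                yield line
                continue
            yield b''.join(convert_content(m.group(1))).decode('utf-8')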
    
    
    print(''.join(process(subject)))
    

    Output:

    [NAS_LEBARS] Active Backup for Business - La tâche de sauvegarde DBS - SIDEXIS sur NAS_LEBARS est terminée
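
As an alternative to decoding by hand, the standard library can do the same job. This is a minimal sketch and not part of the original answer: `decode_subject`, `raw_subject` and `raw_message` are hypothetical names chosen for illustration, and the sample message is built from the subject shown in the question. It uses `email.header.decode_header` with `email.header.make_header`, and also shows that parsing the message with `policy=email.policy.default` should return the header already decoded.

```python
from email import message_from_bytes, policy
from email.header import decode_header, make_header

# The folded, RFC 2047 encoded Subject value from the question
raw_subject = (
    "=?UTF-8?Q?=5BNAS=5FLEBARS=5D_Active_Backup_for_Business_=2D_La_t=C3=A2che_?=\r\n"
    " =?UTF-8?Q?de_sauvegarde_DBS_=2D_SIDEXIS_sur_NAS=5FLEBARS_est_termin=C3=A9e?="
)


def decode_subject(value):
    """Decode an RFC 2047 encoded header value into a plain str (hypothetical helper)."""
    # decode_header returns (bytes_or_str, charset) pairs; make_header joins them
    return str(make_header(decode_header(value)))


decoded = decode_subject(raw_subject)
print(decoded)

# Assumed sample message for illustration: the same header wrapped in a minimal email
raw_message = ("Subject: " + raw_subject + "\r\n\r\nbody\r\n").encode("ascii")

# With the modern policy, header lookups return decoded text directly
msg = message_from_bytes(raw_message, policy=policy.default)
subject_from_policy = str(msg["Subject"])
print(subject_from_policy)
```

In the question's script, this would mean either calling `decode_subject(email_message['Subject'])`, or building the message with `email.message_from_bytes(data[0][1], policy=email.policy.default)` so that no manual decoding step is needed. Unlike the regular-expression approach, the library functions also handle Base64 ("B") encoded words and charsets other than UTF-8.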
    