Bug report
Bug description:
In certain cases, with EmailMessage objects, encoded headers can fold with double line endings, causing breakage of flattened message objects, making messages display improperly in email clients and rendering attachments inaccessible.
Python 3.11.11 does not exhibit this behaviour, but numerous other Python versions (e.g. 3.11.9, 3.12.x, 3.10.x and 3.9.x) are affected by the bug.
The malformation occurs when all of the following hold:
- the header is long
- the header's value is presented as an RFC 2047 encoded word (UTF-8, base64)
- when decoded, the header's payload ends with a newline (\n)
An example of this:
Subject: =?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?=
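To confirm the third condition for this example, the base64 payload can be decoded by hand. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original report.

```python
import base64

# Payload copied from the Subject header above: the text between
# "=?utf-8?B?" and the closing "?=".
payload = "Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg=="

decoded = base64.b64decode(payload).decode("utf-8")

# repr() makes the trailing newline visible.
print(repr(decoded))
print("ends with newline:", decoded.endswith("\n"))
```

The final base64 group `Cg==` is the encoding of the single byte 0x0A, which is the trailing newline that triggers the problem.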
Here is a test script which reproduces the problem on many/most recent Python versions:
#!/usr/bin/env python3
"""
Reproduces a bug with flattening email headers whose encoded payloads end in a newline
"""
import sys
from email import policy
from email.parser import Parser
sampleRaw = """Date: Fri, 04 Apr 2025 10:03:35 +0200\r
From: sender@foo.com\r
Subject: =?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?=\r
MIME-Version: 1.0\r
Content-Type: multipart/mixed;\r
 boundary="235711131719"\r
To: recipient@bar.com\r
\r
This is a multi-part message in MIME format.\r
--235711131719\r
Content-Type: text/plain; charset=UTF-8; format=flowed\r
Content-Transfer-Encoding: 7bit\r
\r
This is readable part of the body\r
--235711131719--\r
\r
"""
messageObj = Parser(policy=policy.default).parsestr(sampleRaw)
print(sys.version)
print("--------")
print(messageObj.as_string())
print("--------")
Python 3.11.11 flattens the message correctly. Here is an excerpt in/around the Subject: header:
From: sender@foo.com
Subject: Voicebox Firma: Test-Nachricht um 10:03:35 04.04.25 - AGFEO ES 522 IT
 up - =?utf-8?q?staipsnet=0A?=
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.
As can be seen here, the header has been re-encoded as quoted-printable (which is fine) and the embedded newline is still present in the encoded payload, but the folded header ends with only one line ending. Great.
But the other Python versions I've tested produce:
From: sender@foo.com
Subject: Voicebox Firma: Test-Nachricht um 10:03:35 04.04.25 - AGFEO ES 522 IT
 up - staipsnet

MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.
--235711131719
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

This is readable part of the body
Note the blank line after the Subject: header.
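One way to detect this breakage programmatically is a round-trip check: parse a message, flatten it, parse the flattened text again, and see which headers went missing. The sketch below is illustrative only; `lost_headers` is a hypothetical helper name, not something from the email package.

```python
from email import policy
from email.parser import Parser


def lost_headers(raw: str) -> list[str]:
    """Return header names that disappear after a flatten/re-parse round trip.

    Hypothetical helper for illustration only.
    """
    parser = Parser(policy=policy.default)
    original = parser.parsestr(raw)
    # Flatten, then parse the flattened text as a fresh message.
    reparsed = parser.parsestr(original.as_string())
    # Any header swallowed into the body is absent from the re-parsed message.
    return [name for name in original.keys() if name not in reparsed]


if __name__ == "__main__":
    benign = (
        "From: sender@foo.com\r\n"
        "Subject: hello\r\n"
        "To: recipient@bar.com\r\n"
        "\r\n"
        "body\r\n"
    )
    print(lost_headers(benign))
```

For a well-formed message the list should be empty. On an affected Python version, passing sampleRaw from the test script would be expected to report the headers that follow Subject, since they end up in the body.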
When messages flattened with this breakage are delivered, MTAs will often add Content-Type: text/plain, because the headers after Subject have been lost into the body.
The recipient of the above message will see this formatted as plain text in their viewing pane:
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.
--235711131719
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

This is readable part of the body
If there were any attachments, their encoded representations will appear after that as gibberish strings, and the mail client won't indicate an attachment is present.
This has potentially serious security implications. If an email delivery chain has Python stdlib-based mail processing at or near the end, a carefully structured malicious header can:
- evade sanitisation
- nullify and/or replace headers added upstream in the chain
- inject malicious extra headers and MIME-encoded context which may cause subsequent handling steps, and/or the final MUA, to be hijacked for exploit attempts.
My workaround has been to subclass a policy and implement a _fold() method that gets the parent class's folding, calls .rstrip() on it, then adds "\r\n" to the end.
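The workaround could look roughly like the sketch below. The class and function names are hypothetical. `_fold()` is a private hook of `EmailPolicy`, so the signature used here is an assumption based on current CPython sources and may change. The sketch appends `self.linesep` rather than a hard-coded "\r\n" so that it follows whatever line separator the policy is configured with.

```python
from email.parser import Parser
from email.policy import EmailPolicy


class SingleLineEndingPolicy(EmailPolicy):
    """Hypothetical policy that forces one line ending per folded header."""

    def _fold(self, name, value, refold_binary=False):
        # Assumption: _fold() is private API; this signature matches
        # current CPython sources but is not documented.
        folded = super()._fold(name, value, refold_binary=refold_binary)
        # Strip trailing whitespace (including a doubled line ending),
        # then terminate the header with exactly one separator.
        return folded.rstrip() + self.linesep


def flatten_safely(raw: str) -> str:
    """Parse raw message text and flatten it with the workaround policy."""
    msg = Parser(policy=EmailPolicy()).parsestr(raw)
    return msg.as_string(policy=SingleLineEndingPolicy())
```

Note that `.rstrip()` also removes any other trailing whitespace from the folded header, which should be harmless for header values but is worth knowing when comparing output byte for byte.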
CPython versions tested on:
3.12
Operating systems tested on:
Linux