EmailMessage objects break when folding malformed header #132105

Open
@davidmcnabnz

Description


Bug report

Bug description:

In certain cases, flattening an EmailMessage object folds an encoded header with a double line ending. This breaks the flattened message: it displays improperly in email clients and its attachments become inaccessible.

Python 3.11.11 does not exhibit this behaviour, but numerous other Python versions, e.g. 3.11.9, 3.12.x, 3.10.x and 3.9.x, are affected by the bug.

The specific malformation is where:

  • a header is long
  • the header's value is presented as an RFC 2047 encoded word (UTF-8, base64)
  • when decoded, the header's payload ends with a newline \n

An example of this:

Subject: =?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?=
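To confirm the third condition for this example, the encoded word can be decoded by hand. This is a minimal sketch using only the standard library; the variable names are illustrative and are not part of the reproduction script.

```python
import base64

# The Subject value from the example above: one RFC 2047 encoded word.
encoded_word = "=?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?="

# An encoded word has the form =?charset?encoding?payload?=
_, charset, encoding, payload, _ = encoded_word.split("?")
assert encoding.upper() == "B"  # "B" means base64

decoded = base64.b64decode(payload).decode(charset)
print(repr(decoded))
print("ends with newline:", decoded.endswith("\n"))
```

The trailing `Cg==` in the payload is the base64 encoding of the single byte `\n`.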

Here is a test script which reproduces the problem on many/most recent Python versions:

#!/usr/bin/env python3
"""
Reproduces a bug with flattening email headers whose encoded payloads end in a newline
"""
import sys
from email import policy
from email.parser import Parser

sampleRaw = """Date: Fri, 04 Apr 2025 10:03:35 +0200\r
From: sender@foo.com\r
Subject: =?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?=\r
MIME-Version: 1.0\r
Content-Type: multipart/mixed;\r
 boundary="235711131719"\r
To: recipient@bar.com\r
\r
This is a multi-part message in MIME format.\r
--235711131719\r
Content-Type: text/plain; charset=UTF-8; format=flowed\r
Content-Transfer-Encoding: 7bit\r
\r
This is readable part of the body\r
--235711131719--\r
\r
"""

messageObj = Parser(policy=policy.default).parsestr(sampleRaw)

print(sys.version)
print("--------")
print(messageObj.as_string())
print("--------")

Python 3.11.11 flattens the message correctly. Here is an excerpt in/around the Subject: header:

From: sender@foo.com
Subject: Voicebox Firma: Test-Nachricht um 10:03:35 04.04.25 - AGFEO ES 522 IT
 up - =?utf-8?q?staipsnet=0A?=
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.

As can be seen here, the header has been re-wrapped to quoted-printable (which is fine), the embedded newline is present in the encoded payload, but when the header is folded out, it has only the one line ending. Great.

But the other Python versions I've tested produce:

From: sender@foo.com
Subject: Voicebox Firma: Test-Nachricht um 10:03:35 04.04.25 - AGFEO ES 522 IT
 up - staipsnet

MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.
--235711131719
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

This is readable part of the body

Note the blank line after Subject:.
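The breakage can also be detected programmatically rather than by eye. The following is a rough sketch, not part of the original script: `lost_headers` is a hypothetical helper that re-parses the flattened text and reports which of the original headers no longer parse as headers. The demo uses hand-written strings so that it behaves the same on any Python version.

```python
from email import policy
from email.parser import Parser


def lost_headers(original, flattened):
    """Return names of headers in `original` that are no longer parsed
    as headers when the `flattened` text is parsed again."""
    reparsed = Parser(policy=policy.default).parsestr(flattened, headersonly=True)
    return [name for name in original.keys() if name not in reparsed]


# Hand-written demo: the second string has a stray blank line after Subject.
intact = "From: sender@foo.com\nSubject: test\nTo: recipient@bar.com\n\nbody\n"
broken = "From: sender@foo.com\nSubject: test\n\nTo: recipient@bar.com\n\nbody\n"

original = Parser(policy=policy.default).parsestr(intact)
print(lost_headers(original, intact))   # []
print(lost_headers(original, broken))   # ['To']
```

With the reproduction script above, `lost_headers(messageObj, messageObj.as_string())` should return an empty list on an unaffected Python and a non-empty list on an affected one.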

When messages flattened with this breakage are delivered, MTAs will often add Content-Type: text/plain, because the headers after Subject have been lost into the body.

The recipient of the above message will see this formatted as plain text in their viewing pane:

MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="235711131719"
To: recipient@bar.com

This is a multi-part message in MIME format.
--235711131719
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

This is readable part of the body

If there were any attachments, their encoded representations will appear after that as gibberish strings, and the mail client won't indicate an attachment is present.

This has potentially serious security implications. If an email delivery chain has Python stdlib-based mail processing at or near the end, a carefully structured malicious header can:

  • evade sanitisation
  • nullify and/or replace headers added upstream in the chain
  • inject malicious extra headers and MIME-encoded content, which may cause subsequent handling steps and/or the final MUA to be hijacked for exploit attempts.
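As a defensive measure, a processing step could flag such headers before flattening. The sketch below is one possible check, not an established API: `has_embedded_line_break` is a hypothetical helper, and it assumes it is given a single, already unfolded raw header value.

```python
from email.header import decode_header


def has_embedded_line_break(raw_value):
    """Return True if the decoded header value contains CR or LF.

    Assumes `raw_value` is one unfolded header value (no folding
    line breaks), otherwise plain folded headers would be flagged too.
    """
    for chunk, charset in decode_header(raw_value):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset label: fall back to a lossless decode.
                chunk = chunk.decode("latin-1")
        if "\r" in chunk or "\n" in chunk:
            return True
    return False


print(has_embedded_line_break("=?utf-8?B?dGVzdAo=?="))  # payload decodes to "test\n"
print(has_embedded_line_break("=?utf-8?B?dGVzdA==?="))  # payload decodes to "test"
```

What to do with a flagged header (reject the message, strip the line break, or re-encode it) is a policy decision this sketch leaves open.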

My workaround has been to subclass a policy and override its _fold() method: call the parent class's folding, .rstrip() the result, then append "\r\n".
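A minimal sketch of that workaround might look like the following. The class name `SingleLineEndingPolicy` is made up for illustration, `_fold()` is a private method of `EmailPolicy` and so may change between releases, and this version appends the policy's own `linesep` instead of a hard-coded "\r\n" so that header and body line endings stay consistent.

```python
from email.parser import Parser
from email.policy import EmailPolicy


class SingleLineEndingPolicy(EmailPolicy):
    """Hypothetical policy: every folded header ends in one line ending."""

    def _fold(self, name, value, refold_binary=False):
        # Let the parent class do the actual folding.
        folded = super()._fold(name, value, refold_binary=refold_binary)
        # Drop any trailing whitespace and line endings the folder
        # produced, then append exactly one line ending.
        return folded.rstrip() + self.linesep


raw = (
    "From: sender@foo.com\r\n"
    "Subject: =?utf-8?B?Vm9pY2Vib3ggRmlybWE6IFRlc3QtTmFjaHJpY2h0IHVtIDEwOjAzOjM1IDA0LjA0LjI1IC0gQUdGRU8gRVMgNTIyIElUIHVwIC0gc3RhaXBzbmV0Cg==?=\r\n"
    "To: recipient@bar.com\r\n"
    "\r\n"
    "This is the readable part of the body\r\n"
)

msg = Parser(policy=SingleLineEndingPolicy()).parsestr(raw)
flattened = msg.as_string()
print(flattened)
```

Because the message is parsed with the custom policy, `as_string()` uses it by default; an existing message can instead be flattened with `as_string(policy=SingleLineEndingPolicy())`.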

CPython versions tested on:

3.12

Operating systems tested on:

Linux


Labels: stdlib, topic-email, type-bug
