Bug report
Bug description:
Hi CPython developers,
I was testing and comparing different email parsers and found a parsing discrepancy that looks like a bug. Consider the following message:
MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
With Python's email get_payload method, the returned content stops at the first "==", which seems to be the default behavior of base64.b64decode.
Meanwhile, peer implementations (e.g. apache.commons.mail (Java), MimeKit (C#), PhpMimeMailParser (PHP)) return the whole content.
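To make the silent truncation easier to see, here is a minimal sketch that contrasts the default lenient mode of base64.b64decode with validate=True. It assumes the behavior reported in this issue; the variable names are only illustrative, and the exact error message may differ between Python versions.

```python
import base64
import binascii

# Two independently padded base64 chunks, concatenated without a separator.
data = "UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA=="

# Lenient (default) mode: no error is raised for the data after the padding.
lenient = base64.b64decode(data)
print("lenient:", lenient)

# Strict mode: data following the padding is rejected instead of being dropped.
try:
    base64.b64decode(data, validate=True)
    strict_rejected = False
except binascii.Error as exc:
    strict_rejected = True
    print("validate=True raised:", exc)
```

Note that validate=True also rejects line breaks, so it cannot be applied directly to a wrapped MIME body without removing the newlines first.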
Below is a runnable example in Python.
import base64
import email

# Parse the MIME message (a blank line separates the headers from the body)
request = """MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
"""
msg = email.message_from_string(request)
print("Part content:", repr(msg.get_payload(decode=True)))
print()

# Decode variations of the same data directly with base64
contents = [
    "UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==",
    "emVkIGZpbGUgY29udGVudA==",
]
for content in contents:
    decoded_bytes = base64.b64decode(content)
    print(repr(content), " ->")
    print(" ", decoded_bytes)
Output:
Part content: b'PK\x03\x04\x14\x00\x00\x00\x08\x00'

'UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==' ->
  b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==' ->
  b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==' ->
  b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==' ->
  b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAA==' ->
  b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'emVkIGZpbGUgY29udGVudA==' ->
  b'zed file content'
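For comparison, here is a rough sketch of how the whole content could be recovered today by decoding each padded segment on its own. The helper decode_concatenated_base64 and the _SEGMENT pattern are hypothetical names introduced for illustration; they are not part of the email or base64 modules, and this is not necessarily how the peer implementations work.

```python
import base64
import re

# One run of base64 alphabet characters followed by at most two '=' pads.
_SEGMENT = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def decode_concatenated_base64(text: str) -> bytes:
    """Hypothetical helper: decode every padded segment and join the results."""
    # Remove line breaks and other whitespace, since MIME bodies wrap base64.
    compact = "".join(text.split())
    # Decode each segment independently so that padding in one segment
    # does not hide the segments that follow it.
    return b"".join(base64.b64decode(seg) for seg in _SEGMENT.findall(compact))


payload = "UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA=="
print(decode_concatenated_base64(payload))
```

With an email message, the undecoded body text can be obtained with msg.get_payload() (without decode=True) and passed to the helper. A segment with incomplete padding, such as the single "=" case above, still raises binascii.Error in this sketch.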
Thank you,
Wei-Cheng
CPython versions tested on:
3.15
Operating systems tested on:
Linux