Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing headers fails if trailing newline is missing #90

Open
liclac opened this issue Nov 5, 2024 · 0 comments
Open

Parsing headers fails if trailing newline is missing #90

liclac opened this issue Nov 5, 2024 · 0 comments

Comments

@liclac
Copy link

liclac commented Nov 5, 2024

I'm using mail_parser to parse an email containing a Delivery Status Notification. They come in a message part that looks something like this:

--=_360a803c773b6a010a411a7b=0d5cefb0-e4ef-5aa9-80ba-ac9ecbbeedf1_=
Content-Type: message/delivery-status

Arrival-Date: Mon, 26 Jul 2021 10:19:42 +0200 (CEST)
Reporting-MTA: dns; example.com

Final-Recipient: rfc822; [email protected]
Status: 2.0.0
Original-Recipient: rfc822; [email protected]
--=_360a803c773b6a010a411a7b=0d5cefb0-e4ef-5aa9-80ba-ac9ecbbeedf1_=--

So the part itself has a single Content-Type: message/delivery-status header, followed by a group of "per-message" headers, followed by a group (or more) of "per-recipient" headers, all of which ends up in the "body" of the part. So I grab the part, run MessageParser on the body, grab the headers, run it again on the body of the new message, and so on.

It all works swimmingly right up until the end: because Delivery Status Notifications generally don't contain a trailing newline, msg.headers() will contain the last header, but with HeaderValue::Empty. Now, this is admittedly a bit of an abuse of the parser, and headers do normally contain a trailing CLRF, but I feel like if the message from the parser's perspective appears to end inside of a header value, the more expected behaviour would be to return what it has as the value.

Here's a test to reproduce this:

     #[test]
     fn test_headers_no_trailing_crlf() {
         use mail_parser::{Header, HeaderName, HeaderValue, MessageParser};
 
         let expected = &[
             Header {
                 name: HeaderName::Other("Foo".into()),
                 value: HeaderValue::Text("foo".into()),
                 offset_field: 0,
                 offset_start: 4,
                 offset_end: 10,
             },
             Header {
                 name: HeaderName::Other("Bar".into()),
                 value: HeaderValue::Text("bar".into()),
                 offset_field: 10,
                 offset_start: 14,
                 offset_end: 20,
             },
         ];
 
         // With trailing newline.
         let msg = MessageParser::new()
             .parse("Foo: foo\r\nBar: bar\r\n")
             .unwrap();
         assert_eq!(expected, msg.headers());
 
         // Without trailing newline.
         let msg = MessageParser::new().parse("Foo: foo\r\nBar: bar").unwrap();
         assert_eq!(expected, msg.headers());
     }

With the pretty_assertions::assert_eq! macro, this produces:

Diff < left / right > :
 [
     Header {
         name: Other(
             "Foo",
         ),
         value: Text(
             "foo",
         ),
         offset_field: 0,
         offset_start: 4,
         offset_end: 10,
     },
     Header {
         name: Other(
             "Bar",
         ),
<        value: Text(
<            "bar",
<        ),
>        value: Empty,
         offset_field: 10,
         offset_start: 14,
<        offset_end: 20,
>        offset_end: 18,
     },
 ]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant