Possibility of extracting actual message body out of conversation #3

Open
successful-fella opened this issue Feb 8, 2022 · 7 comments


@successful-fella

Thank you for this project. Emails are parsed properly.

I tried running it on various types of email. For conversational emails, it returns the whole email body instead of just the actual reply. Is there a way to separate the reply message from the quoted conversation?

I tried to find a pattern, but it looks like Google and Microsoft Office each follow their own convention. I believe other email clients may differ as well.
(Screenshots: Google uses a blank line followed by a date line starting with "On ...", and Microsoft separates the quoted part with border-top styling.)

Your insights on this will be helpful. Thank you!
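
For the Google case described above, a minimal sketch of a plain-text heuristic could look like the following. It assumes the attribution line fits on a single line and ends with "wrote:", which is an assumption about the English Gmail format rather than something this library provides. `split_reply` and `ATTRIBUTION` are hypothetical names.

```python
import re
from typing import Tuple

# Assumed shape of a Gmail-style attribution line, for example:
# "On Tue, Feb 8, 2022 at 10:15 AM Jane Doe <jane@example.com> wrote:"
ATTRIBUTION = re.compile(r"^On\s.+\swrote:\s*$")


def split_reply(text: str) -> Tuple[str, str]:
    """Return (reply, quoted) by cutting at the first attribution line."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if ATTRIBUTION.match(line.strip()):
            reply = "\n".join(lines[:i]).rstrip()
            quoted = "\n".join(lines[i:])
            return reply, quoted
    # No attribution line found: treat the whole body as the reply.
    return text.rstrip(), ""
```

This does not cover Outlook, localized clients, or attribution lines that wrap onto a second line, so it would only be one of several per-vendor rules.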

@CaTzil
Owner

CaTzil commented Feb 9, 2022

Hi, I don't think that is possible, because each vendor implements replies differently.

Can you post some raw email examples here?

@successful-fella
Author

Yes, I have attached examples from Google Mail, Yandex and Outlook.

I don't think there is a common pattern that all three follow.

Google Mail and Yandex make it a bit easier because they wrap the quoted reply in a blockquote, but the blockquote still excludes the context line (the line just before it). Google Mail also has an element with the class gmail_quote that wraps both the quoted reply and the context line.

gmail.txt
yandex.txt
outlook.txt
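
As a rough illustration of the blockquote and gmail_quote observation above, here is a sketch using only Python's standard `html.parser`. It drops everything inside a `blockquote` or inside any element whose class list contains `gmail_quote`. It assumes reasonably well-formed HTML (every non-void tag is closed), and `ReplyExtractor` and `extract_reply` are hypothetical names, not part of this library.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ReplyExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # > 0 while inside a quoted element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1  # nested element inside the quote
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "blockquote" or "gmail_quote" in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def extract_reply(html: str) -> str:
    """Return the text outside quoted elements, with whitespace collapsed."""
    parser = ReplyExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

With a bare blockquote (the Yandex case), the context line before the quote would still be returned, which matches the limitation noted above. Outlook's border-top separator is not handled at all.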

@CaTzil
Owner

CaTzil commented Feb 9, 2022

Cool, thank you.

Do you think that it should be parsed with this lib?

@successful-fella
Author

successful-fella commented Feb 9, 2022

It would be a great addition to have. Of course, I am not expecting you to add it for me. I just wanted to ask whether you had thought about this possibility and whether you understood the pattern. If I figure it out, I will create a PR.

@CaTzil
Owner

CaTzil commented Feb 9, 2022 via email

@successful-fella
Author

successful-fella commented Feb 9, 2022

> Do you think that it should be parsed with this lib?

Okay, so I just realized that a predefined line of text can be added to emails before they are sent from the server (depending on the type of application being built).

I got this idea from a TransferWise email, and I'm pretty sure many other companies do the same.

image
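
One possible reading of this idea, sketched under assumptions: the sending application prepends a known marker line to every outgoing email, and when a reply comes back, everything from the quoted marker onward is discarded. The marker text and the names `MARKER`, `add_marker` and `reply_above_marker` are hypothetical and not taken from TransferWise or from this library.

```python
# Hypothetical marker text; any string unlikely to occur in normal mail works.
MARKER = "-- Please reply above this line --"


def add_marker(body: str) -> str:
    """Prepend the marker to an outgoing plain-text body."""
    return f"{MARKER}\n\n{body}"


def reply_above_marker(incoming: str) -> str:
    """Keep only the lines that appear before the (possibly quoted) marker."""
    kept = []
    for line in incoming.splitlines():
        # Mail clients often prefix quoted lines with "> ", so strip that first.
        if line.lstrip("> ").strip() == MARKER:
            break
        kept.append(line)
    return "\n".join(kept).rstrip()
```

Because the marker sits inside the quoted text, any attribution line the client inserts above the quote is still kept, so this would only help for mail that the application itself sent.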

@CaTzil
Owner

CaTzil commented Feb 10, 2022

I'm not sure that I got your last message, can you elaborate?
