Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SF-3156 Warn user when mismatch in translated and training books #2956

Merged
merged 3 commits into from
Jan 24, 2025

Conversation

RaymondLuong3
Copy link
Collaborator

@RaymondLuong3 RaymondLuong3 commented Jan 17, 2025

The user should see warnings on the training step for two scenarios:

  • The user's training source does not contain all of the translated books (books with more than 10 translated segments). This can indicate the user is using a source other than the one they are actually translating from.
  • The user selected translated books to use for training, but did not select matching books an any of the reference projects.

Warnings will appear for the user, but they can still proceed to generate a draft.

Draft generation training warning source
Draft generation training warning books


This change is Reviewable

@RaymondLuong3 RaymondLuong3 added the will require testing PR should not be merged until testers confirm testing is complete label Jan 17, 2025
Copy link

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.

Project coverage is 82.02%. Comparing base (dc78e8a) to head (74f0258).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...neration-steps/draft-generation-steps.component.ts 92.30% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2956   +/-   ##
=======================================
  Coverage   82.02%   82.02%           
=======================================
  Files         544      544           
  Lines       31673    31685   +12     
  Branches     5127     5147   +20     
=======================================
+ Hits        25979    25989   +10     
+ Misses       4939     4927   -12     
- Partials      755      769   +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kylebuss kylebuss self-assigned this Jan 17, 2025
Copy link
Collaborator

@kylebuss kylebuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @RaymondLuong3)

@kylebuss kylebuss added ready to test and removed will require testing PR should not be merged until testers confirm testing is complete labels Jan 17, 2025
@RaymondLuong3 RaymondLuong3 force-pushed the feature/sf-3156-warn-source-book branch from 4a4ca85 to fda86e3 Compare January 20, 2025 23:02
@josephmyers
Copy link
Collaborator

Shouldn't we prevent them from proceeding to the next step in this case? I can't think of any good reason why they'd actually want this.

@RaymondLuong3
Copy link
Collaborator Author

I could go either way. If we prevented them from proceeding when not all translated books selected have a matching reference book selected to train on, the user might realize they had made a mistake. That would be a positive. So perhaps that is worth doing. For the second scenario when the user has chosen a source without books matching the target project, I think the warning is good enough. If a project decides to train on only the back translation (maybe that will happen) then they would get just the warning rather than forced not to use the BT because its translation is behind the target project.

@kylebuss
Copy link
Collaborator

What if we remove the book from the selected books list if no reference sources have it selected?

@josephmyers
Copy link
Collaborator

@kylebuss, currently only source-target pairs are included for training. So I think we're already doing that, unless you're meaning something else by "selected books list"

@RaymondLuong3, yes, good clarification. I'm only recommending we prevent proceeding for the first scenario.

@kylebuss
Copy link
Collaborator

@josephmyers, I was mainly addressing the second scenario when the target has selected books but not all of those books were selected in the source. Rather than displaying a warning we could remove the missing book from the target list. I hope this clarifies what I'm meaning.

As for the first scenario, there might be an edge case where we want the user to acknowledge before proceeding like we do with the language codes on the first step. One thought that comes to mind is if we down the road allow selected chapters to be include in training rather than the whole book.

I don't know how likely of a scenario it would be for a project to use one source for NT and another for OT (like Hebrew -> English OT and Greek -> English NT), but in that case, if they have translated books in both OT and NT but the source only has OT we would want to allow them to proceed.

@Nateowami
Copy link
Collaborator

@RaymondLuong3 I thought the wording could be improved and asked @bhartmoore who came up with these suggested changes:

  1. Draft generation training warning source:
    current: "The source for this project does not contain all the translated books. Are you sure you are using the correct source?"
    suggested: "The reference text is missing one or more of the translated books in your project. Consider choosing a reference text that contains a match for all the translated books."
    (I'm thinking users might not equate "source for this project" and "correct source" with something they've selected in SF settings, displayed on the current page as "reference books." This aims to make that more explicit so users understand what's being suggested.)
  2. Draft generation training warning books
    current: "You selected translated books but did not select matching reference books for training. Please choose matching reference books."
    suggested: "Some of the training books you selected have no matching reference books selected. Please select matching reference books."
    (Clarifying that the mismatch may only apply to some of the books.)

@RaymondLuong3 RaymondLuong3 added will require testing PR should not be merged until testers confirm testing is complete and removed ready to test labels Jan 24, 2025
@RaymondLuong3 RaymondLuong3 force-pushed the feature/sf-3156-warn-source-book branch from fda86e3 to 74f0258 Compare January 24, 2025 21:29
@RaymondLuong3
Copy link
Collaborator Author

Thanks. I've updated the wording on the warning messages. I opened SF-3173 to add the require user to fix warning task. I am going to merge this as-is since it has already gone through testing and wasn't part of the original issue.

@RaymondLuong3 RaymondLuong3 added testing complete Testing of PR is complete and should no longer hold up merging of the PR and removed will require testing PR should not be merged until testers confirm testing is complete labels Jan 24, 2025
@RaymondLuong3 RaymondLuong3 merged commit 49d1a57 into master Jan 24, 2025
16 of 17 checks passed
@RaymondLuong3 RaymondLuong3 deleted the feature/sf-3156-warn-source-book branch January 24, 2025 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing complete Testing of PR is complete and should no longer hold up merging of the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants