Cm3 integration #727
base: main
Conversation
- make cm3 compliant with main branch
- revert inference mode
- revert inference mode
This has some logic specific to the CM3leon project (i.e., img token conversions); are we sure we want to land this in main? Do we want to possibly pick a branch, merge everything into there, and periodically update it from main?
process_group=distributed_utils.get_data_parallel_group(),
)
model = task.build_model(cfg.model)
if not isinstance(model, FullyShardedDataParallel):
Just to confirm, this is for loading up a consolidated model for training?
Yes.
I added support for changing the MP size at job launch, and for that I need to wrap the model in FullyShardedDataParallel inside build_model. As I don't want to double-wrap it, I needed to add this if.
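For context, a rough sketch of the guard being discussed; the trainer-side variables (task, cfg, distributed_utils) come from the surrounding metaseq code, and the exact wrapping call here is an assumption, not this PR's literal diff:

```python
from fairscale.nn.data_parallel import FullyShardedDataParallel

# build_model may already return an FSDP-wrapped model when MP resizing
# is requested at launch time.
model = task.build_model(cfg.model)

if not isinstance(model, FullyShardedDataParallel):
    # Not wrapped inside build_model, so wrap exactly once here;
    # otherwise leave the existing wrapper alone to avoid double-wrapping.
    model = FullyShardedDataParallel(
        model,
        process_group=distributed_utils.get_data_parallel_group(),
    )
```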
@@ -4,6 +4,7 @@
# LICENSE file in the root directory of this source tree.
These are the changes to the cm3 objectives that I landed in scaling_racm3, correct?
Yes, exactly.
Code LGTM. @suchenzang to give guidance on whether or not to land here.
@@ -246,6 +281,10 @@ def _check_cm3_parameterization(self):

    def _create_cm3_special_tokens(self):
        self.cm3_sentinel_end = "<eoss>"
        self.cm3_break = "<racm3:break>"
        self.dictionary.add_symbol(self.cm3_break)
It's all looking great.
We want to make a change here to recycle the unused embedding indices of cm3_break and the sentinel tokens for the next version. Should I just add a commit on top of this PR, or file a separate PR?
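One possible shape of that recycling, sketched under assumptions: a fairseq-style Dictionary with symbols/indices internals, and a pre-reserved placeholder token whose name (<unused_0>) is hypothetical:

```python
def _create_cm3_special_tokens(self):
    self.cm3_sentinel_end = "<eoss>"
    self.cm3_break = "<racm3:break>"
    # Try to reuse an embedding row the checkpoint already allocated
    # (a pre-reserved placeholder) instead of growing the vocabulary.
    unused = self.dictionary.index("<unused_0>")  # hypothetical placeholder
    if unused != self.dictionary.unk():
        # Rename the placeholder slot in place so cm3_break maps onto
        # an existing, untrained embedding index.
        self.dictionary.symbols[unused] = self.cm3_break
        self.dictionary.indices[self.cm3_break] = unused
        del self.dictionary.indices["<unused_0>"]
    else:
        # No placeholder available: fall back to adding a new symbol,
        # as this PR currently does.
        self.dictionary.add_symbol(self.cm3_break)
```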
@@ -200,24 +209,31 @@ def get_document_boundaries(self, item: torch.Tensor):
    boundaries = boundaries + [item.size(0)]
Is get_document_boundaries() robust to the case where there are no break tokens?
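For reference, a minimal sketch of the edge case being asked about; the break-token index attribute (cm3_break_ind) is an assumption based on the snippet above, not this PR's exact code:

```python
import torch

def get_document_boundaries(self, item: torch.Tensor):
    # Positions of document-break tokens in the sequence.
    breaks = (item == self.cm3_break_ind).nonzero(as_tuple=True)[0].tolist()
    # If no break token is present, this degenerates to [0, item.size(0)],
    # i.e. the whole sequence is treated as a single document.
    return [0] + breaks + [item.size(0)]
```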