Implementing HF Padding-Free and GraniteLM Support #257

Merged 34 commits on Oct 25, 2024

Member

@aldopareja aldopareja commented Oct 7, 2024

Updating the data collator for models with HF padding-free support, adding support for the upcoming Granite HF model class, and updating flags/interface accordingly.

-Mustafa
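
For readers unfamiliar with the term, here is a minimal sketch of what a padding-free collator does: instead of padding every sample to a common length, it concatenates the samples into one flat sequence and restarts the position ids at each sample boundary. This is an illustration under stated assumptions, not the collator added in this PR. The function name `flatten_batch`, the `-100` ignore label, and the choice to mask the first label of each sample are all assumptions made for the example.

```python
from typing import Dict, List

# Assumption: -100 is the label value the loss function ignores.
IGNORE_INDEX = -100


def flatten_batch(samples: List[Dict[str, List[int]]]) -> Dict[str, List[int]]:
    """Concatenate variable-length samples into one flat sequence, with no padding.

    Hypothetical helper for illustration only.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    position_ids: List[int] = []

    for sample in samples:
        ids = sample["input_ids"]
        # Fall back to the input ids when a sample carries no labels.
        labs = sample.get("labels", ids)
        if len(labs) != len(ids):
            raise ValueError("input_ids and labels must have the same length")
        if not ids:
            continue  # nothing to add for an empty sample

        input_ids.extend(ids)
        # Ignore the first label of every sample so that no loss is
        # computed for predicting it from the previous sample's tokens.
        labels.extend([IGNORE_INDEX] + list(labs[1:]))
        # Position ids restart at 0, which marks the sample boundary.
        position_ids.extend(range(len(ids)))

    return {"input_ids": input_ids, "labels": labels, "position_ids": position_ids}
```

The restarting position ids are what allow an attention implementation with padding-free support to recover the boundaries between samples without an explicit padding mask.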

@mergify mergify bot added the ci-failure label Oct 7, 2024
@aldopareja aldopareja force-pushed the ap/padding-free-hf-2 branch 2 times, most recently from 3bcd40f to 705ad43 October 7, 2024 22:02
@mergify mergify bot added ci-failure and removed ci-failure labels Oct 7, 2024
@aldopareja aldopareja force-pushed the ap/padding-free-hf-2 branch from 3bb42af to 9de3409 October 8, 2024 01:32
@mergify mergify bot removed the ci-failure label Oct 8, 2024
@Maxusmusti Maxusmusti changed the title from "Ap/padding free hf 2" to "Implementing HF Padding-Free and GraniteLM Suppport" Oct 8, 2024
@Maxusmusti Maxusmusti changed the title from "Implementing HF Padding-Free and GraniteLM Suppport" to "Implementing HF Padding-Free and GraniteLM Support" Oct 8, 2024
@mergify mergify bot added ci-failure and removed ci-failure labels Oct 8, 2024
Member

@RobotSail RobotSail left a comment

A few comments but it looks good so far. Just need to test it

@mergify mergify bot added ci-failure, dependencies and removed ci-failure labels Oct 8, 2024
Contributor

mergify bot commented Oct 10, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @aldopareja please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Contributor

@JamesKunstle JamesKunstle left a comment

Generally seems like a solid drop-in replacement with some good updates. This first round of review covers only the code; I'll also run everything on AMD hardware.

This was linked to issues Oct 21, 2024
@mergify mergify bot added the ci-failure label Oct 22, 2024
@Maxusmusti Maxusmusti force-pushed the ap/padding-free-hf-2 branch from 94d3adb to a45e82b October 22, 2024 15:01
@mergify mergify bot removed the ci-failure label Oct 22, 2024
Signed-off-by: Mustafa Eyceoz <[email protected]>
@Maxusmusti
Contributor

@JamesKunstle added comments for the things you wanted

@JamesKunstle
Contributor

@Maxusmusti Much appreciated, testing on AMD now

@JamesKunstle
Contributor

Testing passes on AMD, and losses for Dolomite and Llama are at parity.

Contributor

@JamesKunstle JamesKunstle left a comment

Tests pass and loss curves look good between Dolomite and Llama padding-free on AMD. Feel good about merging this.

@mergify mergify bot added the one-approval label Oct 24, 2024
@@ -199,7 +199,7 @@ def print_masked_samples(data, tokenizer, is_pretrain, num_proc):
     def get_masked_and_orig_text(sample):
         labels = sample["labels"]
         input_ids = sample["input_ids"]
-        mask_id = get_sp_token(tokenizer, "<MASK>")[0]
+        mask_id = get_sp_token(tokenizer, "<|MASK|>")[0]
Member

Will this affect existing models? Or is this purely for training-time?

Contributor

Yeah, only relevant during training
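
To illustrate why the mask token matters only for training-time inspection, here is a hedged sketch of how a mask id can be overlaid on a sample before decoding it for display. The helper name `overlay_mask` and the `-100` ignore label are assumptions for this example. The real `get_masked_and_orig_text` in the repository may work differently.

```python
from typing import List

# Assumption: positions labelled -100 are excluded from the loss.
IGNORE_INDEX = -100


def overlay_mask(input_ids: List[int], labels: List[int], mask_id: int) -> List[int]:
    """Replace every token that is excluded from the loss with mask_id.

    Hypothetical helper: the result is meant to be decoded and printed so a
    person can see which parts of a sample the model is trained on.
    """
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    return [
        mask_id if label == IGNORE_INDEX else token
        for token, label in zip(input_ids, labels)
    ]
```

Because the mask id is substituted only into a copy used for printing, changing the token string from `<MASK>` to `<|MASK|>` in a helper like this would not alter what a trained model sees at inference time.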

Member

@RobotSail RobotSail left a comment

LGTM

@RobotSail RobotSail added the hold label Oct 24, 2024
@mergify mergify bot removed the one-approval label Oct 24, 2024
@Maxusmusti Maxusmusti removed the hold label Oct 25, 2024
@Maxusmusti Maxusmusti merged commit 03d1b62 into main Oct 25, 2024
13 checks passed
@nathan-weinberg nathan-weinberg deleted the ap/padding-free-hf-2 branch October 25, 2024 17:03
Labels
dependencies Pull requests that update a dependency file
Development

Successfully merging this pull request may close these issues.

HF Padding-Free Support
Granite HF Model Class support
5 participants