pipeline: Fail explicitly on an empty dataset #127
8c50c7a to b812fa4
I pushed an update, but it only changed the commit message.
b812fa4 to 9009a64
If at any point generate() on a block returns an empty dataset, this
is a failure condition. Go ahead and raise an exception right away in
that case.

This change was originally a subset of this commit: aakankshaduggal@256335e

Co-authored-by: shiv <[email protected]>
Co-authored-by: Aakanksha Duggal <[email protected]>
Co-authored-by: Kai Xu <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
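A minimal sketch of the behavior described above, not the code from this PR. The diff for the pipeline itself is not shown here, so `SketchPipeline`, `EmptyDatasetError`, and the two block classes are hypothetical names, and plain lists stand in for whatever dataset type the real pipeline passes between blocks.

```python
class EmptyDatasetError(Exception):
    """Hypothetical exception type; the PR text does not name the one it raises."""


class PassThroughBlock:
    """Stand-in block that returns its input unchanged."""

    def generate(self, samples):
        return list(samples)


class DropAllBlock:
    """Stand-in block that filters out every sample."""

    def generate(self, samples):
        return []


class SketchPipeline:
    """Runs blocks in order and fails as soon as one yields no samples."""

    def __init__(self, blocks):
        self.blocks = blocks

    def generate(self, dataset):
        for block in self.blocks:
            dataset = block.generate(dataset)
            # Check immediately after each block so the error names the
            # block that emptied the dataset, not a later one.
            if len(dataset) == 0:
                raise EmptyDatasetError(
                    f"block {type(block).__name__} returned an empty dataset"
                )
        return dataset
```

Checking after every block, instead of once at the end, means later blocks never run on empty input and the exception message points at the block where the data disappeared.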
9009a64 to 8f03a1f
@@ -28,6 +28,7 @@ def _noop_generate(self, samples, **gen_kwargs):
 @patch.object(SamplePopulatorBlock, "generate", _noop_generate)
 @patch.object(SelectorBlock, "generate", _noop_generate)
 @patch("instructlab.sdg.llmblock.server_supports_batched", lambda c, m: True)
+@patch.object(Pipeline, "_drop_duplicates", lambda self, dataset, cols: dataset)
This test is getting uglier and uglier 🫣
yeah, can't say I felt proud about adding this line 😆
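For readers unfamiliar with the decorator under discussion, here is a small sketch of how `patch.object` with an explicit replacement works. The `Pipeline` class below is a hypothetical stand-in written for this example, not the one in `instructlab.sdg`; only the decorator shape mirrors the hunk above.

```python
from unittest.mock import patch


class Pipeline:
    """Hypothetical stand-in with a real de-duplication step."""

    def _drop_duplicates(self, dataset, cols):
        seen, unique = set(), []
        for row in dataset:
            key = tuple(row[c] for c in cols)
            if key not in seen:
                seen.add(key)
                unique.append(row)
        return unique

    def generate(self, dataset):
        return self._drop_duplicates(dataset, cols=["question"])


# Because a replacement is passed explicitly, patch.object does not inject
# an extra mock argument into the decorated function. The lambda is stored
# on the class, so it is bound like a normal method and receives `self`.
@patch.object(Pipeline, "_drop_duplicates", lambda self, dataset, cols: dataset)
def run_with_noop_dedup(rows):
    return Pipeline().generate(rows)


rows = [{"question": "a"}, {"question": "a"}]
patched_result = run_with_noop_dedup(rows)   # de-duplication is skipped
unpatched_result = Pipeline().generate(rows)  # original method is restored
```

The patch only lasts for the duration of the decorated call, which is why stacking several of these on one test is safe, if not pretty.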
Looks good to me. I would merge it since the change has passed through Shiv, Aakanksha, Kai, and Russell. It's not clear to me I can merge though?
And the "merge pull request" button is disabled. Not clear to me I can override that?
I have a checkbox to override it. Maybe it’s because I’m a GitHub org owner? Either way I’ll change the repo settings for now.
…_actions/rojopolis/spellcheck-github-actions-0.41.0 Bump rojopolis/spellcheck-github-actions from 0.40.0 to 0.41.0