Fix mismatch in full pipeline outputs (#75)
The full knowledge pipeline had `question` and `response` as output
columns, while the skills pipelines used `question` and `answer`.

`generate_data.py` currently expects `response` instead of `answer`.
Rather than handling both, standardize on `response`, since it is
already the more common name; several prompt filenames include
"response", for example.

Signed-off-by: Russell Bryant <[email protected]>
russellb authored Jul 3, 2024
1 parent bcb7974 commit 6251693
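
For context, a minimal hypothetical sketch of the mismatch this commit
resolves. The column names `question`, `answer`, and `response` come from
the commit message; the sample data and the `read_response` helper are
illustrative only, not the project's real code.

# Hypothetical sketch of the column mismatch; not the project's real code.

# The skills pipelines emitted samples keyed by "answer":
skills_sample = {"question": "What is 2 + 2?", "answer": "4"}

# ...while generate_data.py reads the "response" column, which the full
# knowledge pipeline already produced:
def read_response(sample: dict) -> str:
    return sample["response"]  # raises KeyError for skills_sample above

# Standardizing every pipeline on "response" removes the mismatch:
fixed_sample = {"question": "What is 2 + 2?", "response": "4"}
assert read_response(fixed_sample) == "4"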
Showing 3 changed files with 5 additions and 5 deletions.
@@ -33,7 +33,7 @@ generation: |
 [End of Question]
 [Start of Answer]
-{answer}
+{response}
 [End of Answer]
 Begin your evaluation by providing a short explanation. Be as objective as possible. After providing your explanation, you must rate the answer on a scale of 1 to 3 as mentioned above.
@@ -43,12 +43,12 @@ generation: |
 [End of Question]
 [Start of Answer]
-{answer}
+{response}
 [End of Answer]
 * Return the evaluation between [Start of Evaluation] and [End of Evaluation] tags.
 * Return the score between [Start of Score] and [End of Score] tags.
 start_tags: ["[Start of Evaluation]", "[Start of Score]"]
-end_tags: ["[End of Evaluation]", "[End of Score]"]
+end_tags: ["[End of Evaluation]", "[End of Score]"]
src/instructlab/sdg/default_flows.py (2 additions, 2 deletions)
@@ -336,7 +336,7 @@ def get_flow(self) -> list:
                 "client": self.client,
                 "model_id": self.model_id,
                 "model_prompt": _get_model_prompt(self.model_family),
-                "output_cols": ["answer"],
+                "output_cols": ["response"],
                 "batch_kwargs": {
                     "num_procs": 8,
                     "batched": self.batched,
@@ -471,7 +471,7 @@ def get_flow(self) -> list:
                 "client": self.client,
                 "model_id": self.model_id,
                 "model_prompt": _get_model_prompt(self.model_family),
-                "output_cols": ["answer"],
+                "output_cols": ["response"],
                 "batch_kwargs": {
                     "num_procs": 8,
                     "batched": self.batched,
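
As far as the diff shows, `output_cols` in these block configurations
names the columns each block adds to the generated dataset; switching
both skills flows from `answer` to `response` brings them in line with
the full knowledge pipeline and with the column `generate_data.py` reads.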
