Commit
Split the question and answer before creating training data
The training data needs the generated question and answer as separate fields. The full pipeline already handles this, but the simple pipeline currently generates them as a single text blob. This hack splits them apart.

This is needed because the small model used by default wasn't good enough to follow the strict formatting instructions used in the full pipeline. We could try asking it to generate the question and answer separately, but that would presumably double the number of inference API calls, which is a poor fit for the resource-constrained environments the simple config is aimed at.

Signed-off-by: Russell Bryant <[email protected]>
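For illustration only, here is a minimal sketch of what splitting a combined blob could look like. It is not the code from this commit: the function name `split_question_answer` is hypothetical, and it assumes the generated question ends at the first question mark, which may not match the actual output format.

```python
def split_question_answer(blob: str) -> tuple[str, str]:
    """Split a generated text blob into (question, answer).

    Assumption (not from the commit): the question is everything up to
    and including the first "?", and the answer is whatever follows.
    """
    head, sep, tail = blob.partition("?")
    if not sep:
        # No question mark found: return an empty question and keep
        # the whole blob as the answer so the caller can decide what to do.
        return "", blob.strip()
    return head.strip() + "?", tail.strip()


question, answer = split_question_answer("What is 2+2? It is 4.")
```

A heuristic like this needs no extra inference calls, at the cost of mis-splitting any output where the question contains more than one question mark or none at all.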