Add Alpaca 52k data from the Stanford Alpaca project #32

Open
wants to merge 1 commit into
base: main
17 changes: 17 additions & 0 deletions data/alpaca_52k.yaml
@@ -0,0 +1,17 @@
pretty_name: alpaca_data_52k
license:
- cc-by-4.0
language:
- en
multilinguality:
- monolingual
download_link:
- https://huggingface.co/datasets/joecodecreations/alpaca_data_52k/resolve/main/alpaca_data.jsonl
source:
- https://huggingface.co/datasets/joecodecreations/alpaca_data_52k
task_types:
- instruction-tuning
description:
- "Instruction-following data for dialogue-style instruction tuning; example format: <human>: xxxx\n<bot>: yyyy. The data comes from the stanford_alpaca project and was generated with the techniques from Self-Instruct: Aligning Language Models with Self-Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. https://arxiv.org/abs/2212.10560"
processed_by:
- Joey Sanchez (https://github.com/joecodecreations)
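
For reviewers who want to check the `<human>: ...\n<bot>: ...` layout mentioned in the description, here is a minimal Python sketch of how one record could be rendered in that form. It assumes each JSONL line carries the `instruction`, `input`, and `output` fields used by the original Stanford Alpaca release; the hosted file may be laid out differently, and the helper names `to_dialogue` and `convert_jsonl` are hypothetical, not part of this PR.

```python
import json


def to_dialogue(record: dict) -> str:
    # Assumption: the record has "instruction", optional "input", and "output"
    # keys, as in the original Stanford Alpaca JSON. Adjust if the file differs.
    prompt = record["instruction"].strip()
    extra = record.get("input", "").strip()
    if extra:
        # Append the optional context on its own line after the instruction.
        prompt = f"{prompt}\n{extra}"
    return f"<human>: {prompt}\n<bot>: {record['output'].strip()}"


def convert_jsonl(lines):
    # Yield one dialogue string per non-empty JSONL line.
    for line in lines:
        line = line.strip()
        if line:
            yield to_dialogue(json.loads(line))


if __name__ == "__main__":
    # Illustrative sample line, not taken from the dataset.
    sample = ['{"instruction": "Name a primary color.", "input": "", "output": "Red."}']
    for text in convert_jsonl(sample):
        print(text)
```

To run it against the real file, `convert_jsonl` can be given an open file handle for the downloaded `alpaca_data.jsonl`, once the field names have been confirmed.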