Data Labeling, Curation, and Inference Store
Designed for MLOps & Feedback Loops
🆕 🔥 Play with the Argilla UI in this live demo powered by Hugging Face Spaces (login: argilla, password: 1234)
🆕 🔥 Since 1.2.0, Argilla supports vector search for finding the records most similar to a given one. This feature combines vector (semantic) search with traditional keyword- and filter-based search. Learn more in this deep-dive guide.
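To make the idea of combining a keyword filter with vector similarity concrete, here is a minimal in-memory sketch. It is not Argilla's implementation: the record layout (a dict with text and vector keys) and the function names are hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_records(records, query_vector, keyword=None, top_k=3):
    """Filter records by an optional keyword, then rank the rest by vector similarity."""
    # Step 1: traditional search, here a case-insensitive substring match.
    candidates = [
        r for r in records
        if keyword is None or keyword.lower() in r["text"].lower()
    ]
    # Step 2: semantic search, ranking the surviving records by cosine similarity.
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["vector"], query_vector),
        reverse=True,
    )
    return ranked[:top_k]


# Toy records with made-up 2-dimensional vectors.
records = [
    {"text": "Refund my order", "vector": [0.9, 0.1]},
    {"text": "Order arrived late", "vector": [0.7, 0.3]},
    {"text": "Great product", "vector": [0.1, 0.9]},
]

for record in similar_records(records, [1.0, 0.0], keyword="order", top_k=2):
    print(record["text"])
```

In a real deployment the vectors would come from an embedding model, and filtering and ranking would be handled by the search backend rather than in Python.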
- Programmatic labeling using weak supervision. Built-in label models (Snorkel, FlyingSquid)
- Bulk-labeling and search-driven annotation
- Iterate on training data with any pre-trained model or library
- Efficiently review and refine annotations in the UI and with Python
- Use Argilla built-in metrics and methods for finding label and data errors (e.g., cleanlab)
- Simple integration with active learning workflows
- Close the gap between production data and data collection activities
- Auto-monitoring for major NLP libraries and pipelines (spaCy, Hugging Face, FlairNLP)
- ASGI middleware for HTTP endpoints
- Argilla Metrics to understand data and model issues, like entity consistency for NER models
- Integrated with Kibana for custom dashboards
- Bring different users and roles into the NLP data and model lifecycles
- Organize data collection, review and monitoring into different workspaces
- Manage workspace access for different users
Argilla is composed of a Python Server with Elasticsearch as the database layer, and a Python Client to create and manage datasets.
To get started, run the Docker image with the following command:
docker run -d --name quickstart -p 6900:6900 argilla/argilla-quickstart:latest
This command runs the latest quickstart Docker image with two users, argilla and team. The password for both users is 1234. You can also configure the following environment variables as needed:
- ARGILLA_API_KEY: Argilla provides a Python library to interact with the app (read, write, and update data, log model predictions, etc.). If you don't set this variable, the library and your app use the default API key, argilla.apikey. To secure your app for reading and writing data, we recommend setting this variable. The API key can be any string of your choice; you can use an online generator if you like.
- ARGILLA_PASSWORD: Sets a custom password for logging into the app with the argilla username. The default password is 1234.
- TEAM_API_KEY: Sets the root user's API key. The API key can be any string of your choice; you can use an online generator if you like. The default API key is team.apikey.
- TEAM_PASSWORD: Sets a custom password for logging into the app with the team username. The default password is 1234.
- LOAD_DATASETS: Controls which sample datasets are loaded. The default value is full. The supported values are:
  - single: Load a single dataset for the TextClassification task.
  - full: Load all the sample datasets for NLP tasks (TokenClassification, TextClassification, Text2Text).
  - none: Do not load any datasets.
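One way to pass these variables to the container is Docker's -e flag. The sketch below builds such a command from a dictionary of overrides and checks the names and the LOAD_DATASETS value against the list above. The helper quickstart_command and the example password are hypothetical, not part of Argilla; the defaults in the code simply restate the ones documented here.

```python
import shlex

QUICKSTART_IMAGE = "argilla/argilla-quickstart:latest"

# Defaults as documented in the variable list above.
DEFAULTS = {
    "ARGILLA_API_KEY": "argilla.apikey",
    "ARGILLA_PASSWORD": "1234",
    "TEAM_API_KEY": "team.apikey",
    "TEAM_PASSWORD": "1234",
    "LOAD_DATASETS": "full",
}
LOAD_DATASETS_VALUES = {"single", "full", "none"}


def quickstart_command(overrides=None):
    """Return a `docker run` command line that sets the given variables via -e."""
    overrides = overrides or {}

    # Reject names that are not in the documented list (catches typos early).
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown variables: {sorted(unknown)}")
    if overrides.get("LOAD_DATASETS", "full") not in LOAD_DATASETS_VALUES:
        raise ValueError("LOAD_DATASETS must be one of: single, full, none")

    args = ["docker", "run", "-d", "--name", "quickstart", "-p", "6900:6900"]
    for name, value in overrides.items():
        args += ["-e", f"{name}={value}"]
    args.append(QUICKSTART_IMAGE)

    # shlex.join quotes any value containing spaces or shell metacharacters.
    return shlex.join(args)


# Example: a custom password (placeholder value) and no sample datasets.
print(quickstart_command({"ARGILLA_PASSWORD": "change-me", "LOAD_DATASETS": "none"}))
```

The printed command can be pasted into a shell. Variables left out of the dictionary keep the defaults built into the image.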