Commit
Showing 23 changed files with 102,246 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
__pycache__/ | ||
*.pyc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
data/embedder.pt filter=lfs diff=lfs merge=lfs -text | ||
data/intent_classifier.pt filter=lfs diff=lfs merge=lfs -text |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
__pycache__/ | ||
*.pyc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# If using GPU replace the following line with: | ||
# FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04 | ||
FROM python | ||
|
||
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y | ||
ENV PATH="/root/.cargo/bin:${PATH}" | ||
|
||
RUN python3 -m pip install --upgrade pip | ||
COPY requirements.txt . | ||
RUN python3 -m pip install -r requirements.txt | ||
|
||
RUN mkdir /sova-nlu | ||
WORKDIR /sova-nlu |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
# SOVA NLU | ||
|
||
SOVA NLU is an intent classification solution based on the [BERT](https://arxiv.org/abs/1810.04805) architecture. It is designed as a REST API service, and both the code and the models can be customized for your needs. | ||
|
||
## Installation | ||
|
||
The easiest way to deploy the service is via docker-compose, so you need to install Docker and docker-compose first. Here are brief instructions for Ubuntu: | ||
|
||
#### Docker installation | ||
|
||
* Install Docker: | ||
```bash | ||
sudo apt-get update | ||
sudo apt-get install \ | ||
apt-transport-https \ | ||
ca-certificates \ | ||
curl \ | ||
gnupg-agent \ | ||
software-properties-common | ||
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add - | ||
sudo apt-key fingerprint 0EBFCD88 | ||
sudo add-apt-repository \ | ||
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \ | ||
$(lsb_release -cs) \ | ||
stable" | ||
sudo apt-get update | ||
sudo apt-get install docker-ce docker-ce-cli containerd.io | ||
sudo usermod -aG docker $(whoami) | ||
``` | ||
To run docker commands without sudo, you may need to log out and log back in. | ||
* Install docker-compose: | ||
``` | ||
sudo curl -L "https://github.com/docker/compose/releases/download/1.25.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose | ||
sudo chmod +x /usr/local/bin/docker-compose | ||
``` | ||
|
||
* (Optional) If you plan to use CUDA, run these commands: | ||
``` | ||
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \ | ||
sudo apt-key add - | ||
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) | ||
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \ | ||
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list | ||
sudo apt-get update | ||
sudo apt-get install nvidia-container-runtime | ||
``` | ||
Add the following content to the file **/etc/docker/daemon.json**: | ||
```json | ||
{ | ||
"runtimes": { | ||
"nvidia": { | ||
"path": "nvidia-container-runtime", | ||
"runtimeArgs": [] | ||
} | ||
}, | ||
"default-runtime": "nvidia" | ||
} | ||
``` | ||
Restart the service: | ||
```bash | ||
sudo systemctl restart docker.service | ||
``` | ||
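If `/etc/docker/daemon.json` already holds other settings, pasting the snippet above over it would discard them. A minimal sketch of merging the runtime entry into an existing configuration instead is shown here; the function name `with_nvidia_runtime` and the sample `existing` settings are hypothetical, and the example only builds the merged dictionary without writing any file.

```python
import copy
import json

# Runtime entry taken from the daemon.json snippet in this README.
NVIDIA_RUNTIME = {"path": "nvidia-container-runtime", "runtimeArgs": []}


def with_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a Docker daemon config with the nvidia runtime as default."""
    merged = copy.deepcopy(config)  # leave the caller's dictionary untouched
    runtimes = merged.setdefault("runtimes", {})
    runtimes["nvidia"] = copy.deepcopy(NVIDIA_RUNTIME)
    merged["default-runtime"] = "nvidia"
    return merged


# Hypothetical pre-existing settings, used only for illustration.
existing = {"log-driver": "json-file"}
updated = with_nvidia_runtime(existing)
print(json.dumps(updated, indent=4))
```

The printed JSON can be reviewed and then saved to the daemon configuration file by hand before restarting the service.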
|
||
#### Build and deploy | ||
|
||
* Build docker image | ||
* If you plan to use the CPU only, build the *sova-nlu* image with the following command: | ||
```bash | ||
sudo docker-compose build | ||
``` | ||
* If you plan to use a GPU, modify `Dockerfile` (switch to the CUDA base image, as noted in its comments) and `docker-compose.yml` (uncomment the runtime and environment sections), then build the *sova-nlu* image: | ||
```bash | ||
sudo docker-compose build | ||
``` | ||
* Run the web service in a Docker container: | ||
```bash | ||
sudo docker-compose up -d sova-nlu | ||
``` | ||
## Testing | ||
To test the service, send a POST request: | ||
```bash | ||
curl --request POST 'http://localhost:8000/intent' --header "Content-Type: application/json" --data '{"text": "Включи режиссерскую версию Лиги справедливости"}' | ||
``` | ||
You can also use the web interface by opening http://localhost:8000/docs. | ||
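The same request can be sent from Python. The sketch below uses only the standard library and assumes the route is `/intent`, which is the path registered in the FastAPI app in this commit; the helper names `build_intent_request` and `classify` are hypothetical, and the shape of the JSON response is not documented here, so it is returned as parsed without interpretation.

```python
import json
import urllib.request


def build_intent_request(text, threshold=0.01,
                         base_url="http://localhost:8000", path="/intent"):
    """Build a POST request whose JSON body has the `text` and `threshold` fields."""
    payload = json.dumps(
        {"text": text, "threshold": threshold}, ensure_ascii=False
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text, **kwargs):
    """Send the request and return the parsed JSON reply (service must be running)."""
    with urllib.request.urlopen(build_intent_request(text, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need a running service; sending it does.
request = build_intent_request("Включи режиссерскую версию Лиги справедливости")
print(request.get_method(), request.full_url)
```

Once the container is up, calling `classify("...")` sends the request; pass `path=` or `base_url=` if your deployment exposes a different address.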
## Training | ||
* Use the same Docker image that was built for the service. Customize the hyperparameters in `config.py` and supply your own `data/data.csv` for training. | ||
* To start training, run: | ||
```bash | ||
sudo docker-compose up -d sova-nlu-train | ||
``` | ||
The trained model will be saved to `data/intent_classifier.pt`. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# -*- coding: utf-8 -*- | ||
from pipeline import IntentPipeline | ||
from fastapi import FastAPI | ||
from pydantic import BaseModel | ||
from typing import Union | ||
|
||
|
||
class Item(BaseModel): | ||
    text: str | ||
    threshold: float = 0.01 | ||
|
||
|
||
pipeline = IntentPipeline() | ||
app = FastAPI() | ||
|
||
|
||
@app.post("/intent") | ||
async def intent_classification(item: Item): | ||
    return pipeline(item.text, item.threshold) | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
DATA = 'data/data.csv' | ||
|
||
LEARNING_RATE = 0.1 | ||
MOMENTUM = 0.8 | ||
BATCH_SIZE = 128 | ||
EPOCHS = 1000 | ||
|
||
ONE_HOT_LABELS = { | ||
'Скажи время': 0, 'Заведи будильник': 1, 'Заказ билетов на поезд': 2, | ||
'Выключи свет': 3, 'Включи музыку': 4, 'Включи фильм': 5, | ||
'Включи свет': 6, 'Хаха': 7, 'Ещё': 8, 'Какие новости?': 9, | ||
'Нет': 10, 'Да': 11, 'Как дела? - Плохо': 12, | ||
'Как дела? - Хорошо': 13, 'Спасибо': 14, 'Можно спросить?': 15, | ||
'Как дела?': 16, 'Как меня зовут': 17, 'Как тебя зовут?': 18, | ||
'Какая погода?': 19, 'Пока': 20, 'Привет': 21 | ||
} |
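Despite its name, `ONE_HOT_LABELS` maps each intent name to an integer class index rather than to a one-hot vector. A minimal sketch of how such a mapping is typically used follows: it inverts the mapping to decode a predicted index and builds a one-hot vector from an index. The three-entry `LABELS` subset, the helper names, and the score values are illustrative assumptions; the pipeline code that actually consumes this config is not shown in this part of the diff.

```python
# Illustrative subset of the mapping defined in config.py.
LABELS = {'Скажи время': 0, 'Заведи будильник': 1, 'Заказ билетов на поезд': 2}

# Reverse lookup: class index -> intent name.
INDEX_TO_LABEL = {index: name for name, index in LABELS.items()}


def one_hot(index, num_classes):
    """Return a list of length num_classes with 1.0 at position index."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


def decode(scores):
    """Return the intent name whose class index has the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return INDEX_TO_LABEL[best]


# Made-up scores for the three classes above.
print(decode([0.1, 0.7, 0.2]))
print(one_hot(LABELS['Заказ билетов на поезд'], len(LABELS)))
```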