Initial commit

sovaai · Aug 17, 2022 · 72b1d19
1 parent 7d0b66a
Showing 23 changed files with 102,246 additions and 0 deletions.
diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1,2 @@
+__pycache__/
+*.pyc
diff --git a/.gitattributes b/.gitattributes
@@ -0,0 +1,2 @@
+data/embedder.pt filter=lfs diff=lfs merge=lfs -text
+data/intent_classifier.pt filter=lfs diff=lfs merge=lfs -text
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,2 @@
+__pycache__/
+*.pyc
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,13 @@
+# If using GPU replace the following line with:
+# FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04
+FROM python
+
+RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
+ENV PATH="/root/.cargo/bin:${PATH}"
+
+RUN python3 -m pip install --upgrade pip
+COPY requirements.txt .
+RUN python3 -m pip install -r requirements.txt
+
+RUN mkdir /sova-nlu
+WORKDIR /sova-nlu
diff --git a/README.md b/README.md
@@ -0,0 +1,97 @@
+# SOVA NLU
+
+SOVA NLU is an intent classification solution based on the [BERT](https://arxiv.org/abs/1810.04805) architecture. It is designed as a REST API service, and both the code and the models can be customized for your needs.
+
+## Installation
+
+The easiest way to deploy the service is with docker-compose, so install Docker and docker-compose first. Brief instructions for Ubuntu follow:
+
+#### Docker installation
+
+*	Install Docker:
+```bash
+sudo apt-get update
+sudo apt-get install \
+    apt-transport-https \
+    ca-certificates \
+    curl \
+    gnupg-agent \
+    software-properties-common
+curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
+sudo apt-key fingerprint 0EBFCD88
+sudo add-apt-repository \
+   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
+   $(lsb_release -cs) \
+   stable"
+sudo apt-get update
+sudo apt-get install docker-ce docker-ce-cli containerd.io
+sudo usermod -aG docker $(whoami)
+```
+To run docker commands without sudo, you may need to log out and log back in.
+*   Install docker-compose:
+```bash
+sudo curl -L "https://github.com/docker/compose/releases/download/1.25.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
+sudo chmod +x /usr/local/bin/docker-compose
+```
+
+*   (Optional) If you plan to use CUDA, run these commands:
+```bash
+curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
+  sudo apt-key add -
+distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
+curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
+  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
+sudo apt-get update
+sudo apt-get install nvidia-container-runtime
+```
+Add the following content to the file **/etc/docker/daemon.json**:
+```json
+{
+    "runtimes": {
+        "nvidia": {
+            "path": "nvidia-container-runtime",
+            "runtimeArgs": []
+        }
+    },
+    "default-runtime": "nvidia"
+}
+```
+Restart the service:
+```bash
+sudo systemctl restart docker.service
+``` 
+
+#### Build and deploy
+
+*   Build docker image
+     *   If you're planning on using CPU only: build *sova-nlu* image using the following command:
+     ```bash
+     sudo docker-compose build
+     ```
+     *   If you're planning on using GPU: modify `Dockerfile`, `docker-compose.yml` (uncomment the runtime and environment sections) and build *sova-nlu* image:
+     ```bash
+     sudo docker-compose build
+     ```
+
+*   Run web service in a docker container
+     ```bash
+     sudo docker-compose up -d sova-nlu
+     ```
+
+## Testing
+
+To test the service you can send a POST request:
+```bash
+curl --request POST 'http://localhost:8000/intent' --header "Content-Type: application/json" --data '{"text": "Включи режиссерскую версию Лиги справедливости"}'
+```
+
+You can also use the web interface by opening http://localhost:8000/docs.
+
+## Training
+
+*   Use the same Docker image that was already built for the service. Customize the hyperparameters in `config.py` and use your own `data/data.csv` for training.
+*   To start training simply run:
+     ```bash
+     sudo docker-compose up -d sova-nlu-train
+     ```
+The trained model will be saved to `data/intent_classifier.pt`.
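The README's Testing section sends a POST request with curl. The same request can be sketched in Python using only the standard library. This is a minimal sketch, not part of the commit: the helper names `build_intent_request` and `get_intent` are hypothetical, the `/intent` path is the route registered in `app.py` in this commit, and the host and port are taken from the README's `localhost:8000`.

```python
import json
import urllib.request


def build_intent_request(text, threshold=0.01, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /intent route.

    The body mirrors the `Item` model in app.py: a `text` string and a
    `threshold` float. `base_url` is an assumption based on the README.
    """
    payload = json.dumps({"text": text, "threshold": threshold}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/intent",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_intent(text, threshold=0.01, base_url="http://localhost:8000"):
    """Send the request and return the decoded JSON response.

    Requires the service to be running; the response shape depends on
    IntentPipeline, which is not shown in this part of the diff.
    """
    request = build_intent_request(text, threshold, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container started via `docker-compose up -d sova-nlu`, calling `get_intent("Привет")` would return whatever the pipeline produces for that utterance.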
diff --git a/app.py b/app.py
@@ -0,0 +1,19 @@
+# -*- coding: utf-8 -*-
+from pipeline import IntentPipeline
+from fastapi import FastAPI
+from pydantic import BaseModel
+
+
+class Item(BaseModel):
+    # Request body; `threshold` is forwarded to IntentPipeline unchanged.
+    text: str
+    threshold: float = 0.01
+
+
+pipeline = IntentPipeline()
+app = FastAPI()
+
+
+@app.post("/intent")
+async def intent_classification(item: Item):
+    return pipeline(item.text, item.threshold)
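The endpoint passes `threshold` straight to `IntentPipeline`, whose implementation is not shown in this part of the diff. One plausible reading, offered here only as an assumption, is that the pipeline turns classifier scores into probabilities and drops intents that fall below the threshold. The sketch below illustrates that idea with hypothetical names (`softmax`, `filter_intents`); it is not the commit's code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def filter_intents(labels, logits, threshold=0.01):
    """Return (label, probability) pairs at or above `threshold`, best first.

    Assumed behaviour only: the real IntentPipeline may rank or filter
    differently.
    """
    probabilities = softmax(logits)
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return [(label, prob) for label, prob in ranked if prob >= threshold]
```

Under this reading, the default of `0.01` would keep nearly every intent with a non-negligible score, and a caller could raise it to receive only confident matches.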
diff --git a/config.py b/config.py
@@ -0,0 +1,16 @@
+DATA = 'data/data.csv'
+
+LEARNING_RATE = 0.1
+MOMENTUM = 0.8
+BATCH_SIZE = 128
+EPOCHS = 1000
+
+ONE_HOT_LABELS = {
+        'Скажи время': 0, 'Заведи будильник': 1, 'Заказ билетов на поезд': 2, 
+        'Выключи свет': 3, 'Включи музыку': 4, 'Включи фильм': 5,
+        'Включи свет': 6, 'Хаха': 7, 'Ещё': 8, 'Какие новости?': 9, 
+        'Нет': 10, 'Да': 11, 'Как дела? - Плохо': 12, 
+        'Как дела? - Хорошо': 13, 'Спасибо': 14, 'Можно спросить?': 15, 
+        'Как дела?': 16, 'Как меня зовут': 17, 'Как тебя зовут?': 18, 
+        'Какая погода?': 19, 'Пока': 20, 'Привет': 21
+}
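Despite its name, `ONE_HOT_LABELS` maps each intent label to an integer class index rather than to a one-hot vector. A classifier typically outputs an index, so decoding a prediction needs the inverse mapping. The following is a small sketch under that assumption; it uses a three-entry subset of the dictionary to stay short, and `INDEX_TO_LABEL` and `decode_prediction` are hypothetical names that do not appear in the commit.

```python
# Subset of ONE_HOT_LABELS from config.py, copied here so the sketch is self-contained.
ONE_HOT_LABELS = {
    'Скажи время': 0,
    'Заведи будильник': 1,
    'Привет': 21,
}

# Invert label -> index into index -> label.
INDEX_TO_LABEL = {index: label for label, index in ONE_HOT_LABELS.items()}

# Duplicate indices would silently collapse during inversion, so check for them.
assert len(INDEX_TO_LABEL) == len(ONE_HOT_LABELS), "class indices must be unique"


def decode_prediction(index):
    """Return the intent label for a class index, or None if the index is unknown."""
    return INDEX_TO_LABEL.get(index)
```

When training on a custom `data/data.csv`, the same uniqueness check is a cheap way to catch a label that was accidentally assigned an index already in use.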