Commit

feat: added phi-2 model
limcheekin committed Dec 16, 2023
1 parent 5152221 commit 0eb7a68
Showing 5 changed files with 14 additions and 14 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/dev.yml
@@ -3,7 +3,7 @@ name: Deploy Dev
 on:
   push:
     branches:
-      - zephyr-7b
+      - phi-2
 jobs:
   deploy-dev:
     runs-on: ubuntu-latest
@@ -27,4 +27,4 @@ jobs:
         run: cd ${{ github.ref_name }};modal deploy fastapi_app.py

       - name: Test FastAPI app
-        run: "pwd;cd ${{ github.ref_name }};curl -X POST -H 'Content-Type: application/json' -d @prompt.json ${{ secrets.ZEPHYR_7B_APP_URL }}v1/completions"
+        run: "pwd;cd ${{ github.ref_name }};curl -X POST -H 'Content-Type: application/json' -d @prompt.json ${{ secrets.PHI_2_APP_URL }}v1/completions"
2 changes: 1 addition & 1 deletion zephyr-7b/Dockerfile → phi-2/Dockerfile
@@ -15,4 +15,4 @@ RUN pip install -U pip setuptools wheel && \

 # Download model
 RUN mkdir model && \
-    curl -L https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q6_K.gguf -o model/gguf-model.bin
+    curl -L https://huggingface.co/kroonen/phi-2-GGUF/resolve/main/phi-2_Q8_0.gguf -o model/gguf-model.bin
8 changes: 4 additions & 4 deletions zephyr-7b/fastapi_app.py → phi-2/fastapi_app.py
@@ -1,25 +1,25 @@
 # Modal Lab web app for llama.cpp.
 from modal import Image, Stub, asgi_app

-stub = Stub("zephyr-7b")
+stub = Stub("phi-2")

 image = Image.from_dockerfile(
     "Dockerfile", force_build=True
-).pip_install("pydantic_settings").pip_install("fastapi==0.103.1").run_commands(
+).pip_install("pydantic_settings").pip_install("fastapi==0.105.0").run_commands(
     # Fix: Cannot allocate memory. Try increasing RLIMIT_MLOCK ('ulimit -l' as root).
     'echo "* soft memlock unlimited" >> /etc/security/limits.conf && echo "* hard memlock unlimited" >> /etc/security/limits.conf',
 )


-@stub.function(image=image, cpu=14, memory=8704, keep_warm=1, timeout=600)
+@stub.function(image=image, cpu=4, memory=5632, timeout=600)
 @asgi_app()
 def fastapi_app():
     from llama_cpp.server.app import create_app, Settings
     import os
     print("os.cpu_count()", os.cpu_count())
     app = create_app(
         Settings(
-            n_threads=14,
+            n_threads=4,
             model="/model/gguf-model.bin",
             embedding=False
         )
7 changes: 7 additions & 0 deletions phi-2/prompt.json
@@ -0,0 +1,7 @@
+{
+  "prompt": [
+    "Instruct: You are an AI assistant that follows instruction extremely well. Help as much as you can. Answer the question based on the context below.\nContext: The main benefit of operators is to automate operations. Kubernetes operators are capable to automate the expensive and error likely human operations. Features like autopilot and self-healing are typical scenarios. Another benefit of operators is the reusability of software. Software providers can expose operators in various catalogs to reach new markets and to promote their software. Operators leverage the Kubernetes community, since they are a natural and Kubernetes-native way to extend Kubernetes.\nQuestion: What are the main benefits of Kubernetes Operators?\nOutput:"
+  ],
+  "max_tokens": 128,
+  "stop": []
+}
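
For readers who want to reproduce the workflow's curl smoke test from Python, the following is a minimal sketch, not part of this commit. It assumes the deployed app accepts the same JSON body as prompt.json at `v1/completions`; the helper names (`build_phi2_prompt`, `build_payload`, `post_completion`), the `base_url` argument, and the 60-second default timeout are hypothetical and chosen only for illustration.

```python
import json
import urllib.request


def build_phi2_prompt(instruction: str, context: str, question: str) -> str:
    """Assemble a prompt in the Instruct/Context/Question/Output layout seen in prompt.json."""
    return (
        f"Instruct: {instruction}\n"
        f"Context: {context}\n"
        f"Question: {question}\n"
        "Output:"
    )


def build_payload(prompt: str, max_tokens: int = 128) -> dict:
    """Mirror the fields of prompt.json: a list of prompts, max_tokens, and an empty stop list."""
    return {"prompt": [prompt], "max_tokens": max_tokens, "stop": []}


def post_completion(base_url: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST the payload to <base_url>v1/completions, like the workflow's curl step.

    base_url is a placeholder (assumption). It should end with "/" because the
    workflow appends "v1/completions" directly to the secret's value.
    """
    request = urllib.request.Request(
        base_url + "v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Build the request body locally; no network call is made here.
payload = build_payload(
    build_phi2_prompt(
        instruction="Answer the question based on the context below.",
        context="Kubernetes operators automate operations.",
        question="What are the main benefits of Kubernetes Operators?",
    )
)
print(json.dumps(payload, indent=2))
```

Calling `post_completion(url, payload)` with the real deployment URL would send the request; the URL is kept out of the sketch because the workflow stores it in a repository secret.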
7 changes: 0 additions & 7 deletions zephyr-7b/prompt.json

This file was deleted.
