GitHub - rashadmoarref/spark-demo: A demo project on leveraging spark structured streaming for machine learning inference

Description

This demo provides an end-to-end example of deploying an NLP inference project using Apache Spark Structured Streaming. The setup includes a ready-to-use local deployment for testing and a cloud deployment on Databricks using Drone. Please also see the accompanying Medium blog post. For the purpose of this demo, we use spaCy's open-source Named Entity Recognition model.
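To make the idea concrete, here is a minimal sketch of how a Named Entity Recognition step could be wired into a Structured Streaming job that reads from and writes to Kafka. It is not the code in this repo: the function names (`spacy_extractor`, `entities_as_json`, `build_stream`), the `en_core_web_sm` model name, and the JSON output shape are all assumptions made for illustration. Running `build_stream` also assumes that `pyspark`, `spacy`, and Spark's Kafka connector package are available.

```python
import json
from typing import Callable, List, Tuple

# An "extractor" maps a text to (entity_text, entity_label) pairs.
Extractor = Callable[[str], List[Tuple[str, str]]]


def spacy_extractor(model_name: str = "en_core_web_sm") -> Extractor:
    """Wrap a spaCy pipeline as an Extractor (the model name is an assumption)."""
    import spacy  # imported lazily so the helpers below work without spaCy

    nlp = spacy.load(model_name)
    return lambda text: [(ent.text, ent.label_) for ent in nlp(text).ents]


def entities_as_json(text: str, extract: Extractor) -> str:
    """Serialize the entities found in `text` as a JSON array string."""
    found = [{"text": t, "label": label} for t, label in extract(text or "")]
    return json.dumps(found)


def build_stream(spark, extract, servers, in_topic, out_topic, checkpoint):
    """Start a Kafka -> NER -> Kafka streaming query (hypothetical wiring)."""
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    # Spark serializes this closure and ships it to the executors; a real job
    # would more likely load the model once per executor instead.
    ner = udf(lambda text: entities_as_json(text, extract), StringType())

    source = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", servers)
        .option("subscribe", in_topic)
        .load()
    )
    # Kafka delivers bytes; cast to string before running the model.
    result = source.select(ner(col("value").cast("string")).alias("value"))
    return (
        result.writeStream.format("kafka")
        .option("kafka.bootstrap.servers", servers)
        .option("topic", out_topic)
        .option("checkpointLocation", checkpoint)
        .start()
    )
```

Keeping the model behind a plain callable means the serialization logic in `entities_as_json` can be unit-tested with a stub extractor, without starting Spark or loading spaCy.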

To clone the repo, run git clone https://github.com/rashadmoarref/spark-demo.git

Deploy locally

create a local demo network to be used by the Kafka and app Docker containers

docker network create demo

create kafka service

make kafka

run app

make app

Note: use make app FORCE=true if you need to rebuild the app after changing Dockerfile or requirements.txt

send input to app's input topic

make produce-input FILE=input.json

consume from app's result topic

make consume-result

tear down local deployment

make app-down
make kafka-down
docker network rm demo

Note: If running into memory issues, increase the allocated memory of your Docker Engine to 10GB.
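For readers who want to script the produce step of the walkthrough above instead of using the Makefile target, the following is a rough sketch of what sending a file to a Kafka topic can look like in Python. It makes several assumptions that are not taken from this repo: that the input file holds one JSON document per line, that the `kafka-python` package is installed, and that the topic name and broker address are supplied by the caller. The helper names `read_messages` and `produce_file` are hypothetical.

```python
import json
from typing import Iterator


def read_messages(path: str) -> Iterator[bytes]:
    """Yield one UTF-8 encoded message per non-empty line of a JSON-lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                json.loads(line)  # raises ValueError if the line is not valid JSON
                yield line.encode("utf-8")


def produce_file(path: str, topic: str, servers: str) -> int:
    """Send every message in `path` to `topic` and return how many were sent."""
    from kafka import KafkaProducer  # kafka-python, imported lazily

    producer = KafkaProducer(bootstrap_servers=servers)
    count = 0
    for message in read_messages(path):
        producer.send(topic, value=message)
        count += 1
    producer.flush()  # block until buffered messages are delivered
    producer.close()
    return count
```

Validating each line before sending means malformed input fails on the producer side rather than inside the streaming job.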

Deploy on Databricks using Drone

Drone steps are defined in .drone.yml
deployment on Databricks is handled using the REST API in deploy/databricks/deploy-job.sh
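As an illustration of what a REST-based deployment involves, the sketch below builds (but does not send) a job-creation request for the Databricks Jobs API. It is not a translation of deploy/databricks/deploy-job.sh, whose contents are not shown here. The choice of the `/api/2.1/jobs/create` endpoint, the single-task payload, and the function name `build_create_job_request` are assumptions; the host, token, file path, and cluster settings are supplied by the caller.

```python
import json
import urllib.request


def build_create_job_request(host, token, job_name, python_file, new_cluster):
    """Build a POST request that would create a one-task Databricks job.

    `new_cluster` is passed through as-is (for example a dict with
    spark_version, node_type_id and num_workers chosen by the caller).
    """
    payload = {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",  # hypothetical task name
                "spark_python_task": {"python_file": python_file},
                "new_cluster": new_cluster,
            }
        ],
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. Separating construction from sending makes it possible to inspect the payload in a CI step before anything touches the workspace.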

Set up `pre-commit` if you would like to contribute

The configs are defined in .pre-commit-config.yaml
black and flake8 are tools for enforcing code style

pip install black flake8 pre-commit
pre-commit install
