Docker images for Open Source bigdata/hadoop projects

Flokkr is an umbrella GitHub organization that collects all of my containerization work for Apache bigdata/datascience projects such as Apache Hadoop and Apache Spark.

At a high level, there are two main types of subprojects (git repositories) under this organization: containers and runtime configuration examples.

If you would like to run a single Apache bigdata project, open its repository and use the included docker-compose file. If you need a more sophisticated cluster that combines multiple products and different configurations, look at the runtime repositories and choose the method that suits you best.

Containers

All of the containers are based on one smart base image defined in flokkr/docker-baseimage. It contains all of the configuration loading scripts (driven by environment variables or Consul servers) and other extensions (e.g. btrace instrumentation).

For more information about the available environment variables, check the flokkr/launcher repository.
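
As a rough illustration of environment-variable-based configuration loading, the sketch below turns a set of variables into a Hadoop-style XML file. The `<FILENAME>_<key>` naming convention, the function name `render_hadoop_xml` and the example values are assumptions made for this sketch only; the flokkr/launcher repository remains the reference for the real variable format.

```python
from xml.sax.saxutils import escape


def render_hadoop_xml(env, filename):
    """Render variables named '<FILENAME>_<key>' as Hadoop-style XML.

    The '<FILENAME>_<key>' convention is an assumption of this sketch,
    not a documented contract of the base image.
    """
    prefix = filename.upper() + "_"
    lines = ["<configuration>"]
    for name in sorted(env):
        if not name.startswith(prefix):
            continue  # variable belongs to another file or is unrelated
        key = name[len(prefix):]
        lines.append("  <property>")
        lines.append("    <name>%s</name>" % escape(key))
        lines.append("    <value>%s</value>" % escape(env[name]))
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical environment, as it might be set in a docker-compose file.
    example_env = {
        "CORE-SITE.XML_fs.defaultFS": "hdfs://namenode:9000",
        "PATH": "/usr/bin",
    }
    print(render_hadoop_xml(example_env, "core-site.xml"))
```

Keeping the target file name inside the variable name lets one container receive settings for several configuration files without any extra mapping.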

All the other containers can be found under the flokkr organization with the docker- prefix.

The containers are usually built on travis-ci and pushed to Docker Hub, rather than using Docker Hub automated builds, because of the limitations of Docker Hub (for example, it is hard to generate matrix builds covering all the older versions).
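
A matrix build here means building the same image once per upstream version. A minimal sketch of the idea follows; the image name, the version numbers and the `VERSION` build argument are all hypothetical and are not taken from the actual flokkr build scripts.

```python
def build_commands(image, versions):
    """Return one `docker build` argument list per requested version."""
    commands = []
    for version in versions:
        commands.append([
            "docker", "build",
            "--build-arg", "VERSION=" + version,  # hypothetical build argument
            "-t", "%s:%s" % (image, version),     # one tag per version
            ".",
        ])
    return commands


if __name__ == "__main__":
    # Example values only; a CI job could run each command in turn.
    for command in build_commands("example/hadoop", ["2.7.3", "2.8.0"]):
        print(" ".join(command))
```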

Available images:

Repository	Product
docker-baseimage	Base image with all the configuration loading magic
docker-hadoop	Apache Hadoop components (hdfs/yarn)
docker-spark	Apache Spark components
docker-storm	Apache Storm components
docker-zookeeper	Apache ZooKeeper components
docker-kafka	Apache Kafka components
docker-hbase	Apache HBase components
docker-zeppelin	Apache Zeppelin interface
docker-krb5	Highly insecure Kerberos container with an open REST API for requesting new Kerberos keytab files.

Note: previous versions of the containers (and some that are not yet migrated) can be found under the github.com/elek account.

Runtime examples

Creating a Docker image is easy: it takes just a few lines to download and unpack an Apache project. The tricky part is making the containers work together: service discovery, configuration management, data locality, multi-tenancy, etc.

There are various examples of how the containers can be used, and each of them has a separate repository with the runtime- prefix.

Repository	Details
runtime-compose	docker-compose based pseudo clusters (multiple containers, but on a single host only). Configuration is defined by environment variables. For development and local experiments.
runtime-consul	Real multi-host cluster with Consul (for storing the configuration and the docker-compose definitions) and docker-compose. Small scripts help to maintain the cluster state (restarting components on every configuration change). Full data locality is achieved by using the Docker host network.
runtime-nomad	Real multi-host cluster with Consul (for storing the configuration and the docker-compose definitions) and Nomad (to start the instances). Small scripts help to maintain the cluster state (restarting components on every configuration change). Full data locality is achieved by using the Docker host network.
runtime-swarm	Similar to the previous one, but the container scheduling part is simplified with docker-compose and Swarm. No host network, so no data locality. Configuration management is based on environment variables.
runtime-kubernetes	Kubernetes-managed cluster with configuration supplied through Kubernetes ConfigMaps.
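
The runtime-consul and runtime-nomad rows mention small scripts that restart components when the configuration changes. One way to picture that logic is a fingerprint comparison, sketched below. The function names and the dictionary layout are hypothetical, and the real repositories use their own scripts.

```python
import hashlib


def fingerprint(config):
    """Return a stable SHA-256 digest of a {key: value} configuration."""
    digest = hashlib.sha256()
    for key in sorted(config):  # sort so key order does not matter
        digest.update(key.encode("utf-8"))
        digest.update(b"\0")
        digest.update(config[key].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def components_to_restart(previous, current):
    """List components whose configuration differs from the recorded digest.

    `previous` maps component name to the last seen digest,
    `current` maps component name to its present configuration.
    """
    changed = []
    for component in sorted(current):
        if previous.get(component) != fingerprint(current[component]):
            changed.append(component)
    return changed


if __name__ == "__main__":
    # Hypothetical component configurations.
    configs = {
        "namenode": {"fs.defaultFS": "hdfs://namenode:9000"},
        "datanode": {"dfs.replication": "1"},
    }
    seen = {name: fingerprint(cfg) for name, cfg in configs.items()}
    configs["datanode"]["dfs.replication"] = "3"
    print(components_to_restart(seen, configs))
```

Comparing digests rather than whole configurations keeps the recorded state small, which matters when it is stored alongside the configuration itself.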
