jorgermduarte/real-time-data-architecture-kafka-flink-dw-k8s

Real-time data processing architecture using Apache Kafka, Flink, and Kubernetes. This project demonstrates how to build a scalable and resilient pipeline for streaming data, performing ETL with Flink, and storing the processed data in a Data Warehouse for analysis.

Project Description

This project showcases a local deployment using Kubernetes with Minikube to simulate a microservices-based architecture. The environment is structured to demonstrate how various components, such as API gateways, Kafka clusters, and Flink consumers, can be orchestrated to work together for real-time data processing.

POC Architecture

Key Components:

  1. API Gateway: Acts as an entry point for user requests, balancing the load across multiple backend API instances.
  2. Load Balancer: Distributes traffic evenly across multiple backend API instances to ensure high availability and fault tolerance.
  3. Backend API: A microservice responsible for handling business logic, producing events to Kafka, and interacting with other services.
  4. Redis Cache: Used as an in-memory data store to optimize performance by caching frequently accessed data, reducing load on the backend services and databases.
  5. Kafka Cluster: Handles event-driven communication, producing and consuming messages through topics such as mock-user-topic (see the inspection sketch after this list).
  6. Flink Consumer: Ingests data from Kafka, performing ETL (Extract, Transform, Load) tasks and pushing the processed data to a Data Warehouse.
  7. Node Consumer: A Node.js-based consumer that, like the Flink consumer, ingests data from Kafka, performs ETL, and pushes the processed data to the Data Warehouse.
  8. Data Warehouse (DW): Stores processed data, making it accessible for analysis and reporting.
  9. Monitoring and Logging: Services such as Prometheus and Grafana are used to track system health, performance metrics, and logs.
  10. Data Dictionary: Located in the data-dictionary directory; it can be run locally.
  11. Scripts: The project includes scripts to populate the Oracle DW and to back up and load the Oracle database.
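
The quickest way to watch events flowing through the Kafka cluster is to tail mock-user-topic from inside the cluster. A minimal sketch, assuming the broker runs in a deployment named kafka, listens on localhost:9092, and ships the stock Kafka console scripts:

  # read the first few events published by the backend API
  # (deployment name, broker address, and script location are assumptions)
  kubectl exec -it deploy/kafka -- \
    kafka-console-consumer.sh \
      --bootstrap-server localhost:9092 \
      --topic mock-user-topic \
      --from-beginning --max-messages 10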

Local Environment:

  • Minikube is used to run this Kubernetes setup locally for demonstration purposes. Minikube allows all services to run within a single-node cluster, emulating a real-world microservices architecture. The environment is configured with several deployments, including Kafka, Flink, Oracle DB, Redis, Prometheus, and Grafana, to mimic a distributed system.
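
Once the manifests in the following sections have been applied, a quick check confirms the running cluster matches the architecture above (resource names depend on the manifests and are assumptions here):

  # list the deployments and their pods in the default namespace
  kubectl get deployments
  kubectl get pods -o wide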

Future Expansion:

  • If this were a production-ready setup, the architecture would scale across multiple nodes and clusters: the master node would manage worker nodes, ensuring scalability, high availability, and failover. A more distributed architecture would let the platform handle real-world production loads with increased reliability and flexibility.

Starting Minikube

  minikube start --cpus 5 --memory 9192 --disk-size=50g --driver=docker
  minikube dashboard
  docker context ls
  # optional: point the local Docker CLI at Minikube's Docker daemon
  # eval $(minikube -p minikube docker-env)
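
If you do run docker-env, images built locally land directly in Minikube's Docker daemon, making the minikube image load steps in the next section unnecessary. A minimal sketch:

  # build straight into the cluster's Docker daemon
  eval $(minikube -p minikube docker-env)
  docker build -t node-backend-api:latest ./node-backend-api
  # revert to the host daemon afterwards
  eval $(minikube docker-env -u)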

Publishing everything to Minikube for the first time

  kubectl apply -f .\kubernetes\deployment\zookeper-deployment.yaml
  kubectl apply -f .\kubernetes\deployment\redis-deployment.yaml
  kubectl apply -f .\kubernetes\deployment\kafka-deployment.yaml
  # kubectl apply -f .\kubernetes\jobs\kafka-topics-job.yaml  # not required: the producers/consumers create the topics automatically

  # build & deploy the node-backend-api image
  docker build -t node-backend-api:latest ./node-backend-api
  minikube image load node-backend-api:latest
  kubectl apply -f .\kubernetes\deployment\node-backend-api-deployment.yaml
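
Before moving on, it is worth confirming the API came up, as in this short check (assuming the deployment in the manifest is named node-backend-api):

  # wait for the rollout to finish and peek at the logs
  kubectl rollout status deployment/node-backend-api
  kubectl logs deploy/node-backend-api --tail=20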

  # deploy the Oracle database, then build & run the data warehouse job
  kubectl apply -f .\kubernetes\deployment\oracle-db-deployment.yaml
  docker build -t data-warehouse-app:latest .\datawarehouse\
  minikube image load data-warehouse-app:latest
  kubectl apply -f .\kubernetes\jobs\data-warehouse-job.yaml
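
The warehouse job runs to completion rather than staying up as a service, so verify it through the Jobs API (the job name data-warehouse-job is an assumption based on the manifest path):

  # confirm the job completed and inspect its output
  kubectl get jobs
  kubectl logs job/data-warehouse-job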

  # build & deploy the flink-consumer image (left commented out; the node-consumer below covers the same ETL path)
  # docker build -t flink-consumer:latest ./flink-consumer
  # minikube image load flink-consumer:latest
  # kubectl apply -f .\kubernetes\deployment\flink-consumer-deployment.yaml

  # build & deploy the node-consumer image
  docker build -t node-consumer:latest ./node-consumer
  minikube image load node-consumer:latest
  kubectl apply -f .\kubernetes\deployment\node-consumer-deployment.yaml
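
With producer and consumer both running, the end-to-end flow can be verified from the consumer side (assuming the deployment is named node-consumer):

  # follow the consumer logs to watch events being processed
  kubectl logs deploy/node-consumer -f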

  # deploy cmac
  kubectl apply -f .\kubernetes\deployment\cmac-deployment.yaml

  # build & deploy the gateway
  docker build -t gateway-app:latest .\Gateway\Gateway
  minikube image load gateway-app:latest
  kubectl apply -f .\kubernetes\deployment\gateway-deployment.yaml
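
Once the gateway is up, Minikube can expose it on the host (assuming the service in the manifest is named gateway):

  # print a host-reachable URL for the gateway service
  minikube service gateway --url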

  # optional: monitoring stack (Prometheus & Grafana)
  # kubectl apply -f .\prometheus-deployment.yaml
  # kubectl apply -f .\grafana-deployment.yaml

Useful Minikube commands

  # list pods
  kubectl get pods
  # forwarding a local port to a pod:
  kubectl port-forward <pod-name> 3001:3001
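
A few more commands that tend to be useful with this setup (the grafana service name is an assumption based on the manifests above):

  # forward the Grafana UI to the host
  kubectl port-forward svc/grafana 3000:3000
  # list services with their cluster IPs and ports
  kubectl get svc
  # sort recent events, handy when pods fail to schedule
  kubectl get events --sort-by='.metadata.creationTimestamp'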

Connecting from Power BI

User: jorgermduarte
Password: 123456
Server: localhost:1521/jorgermduarte
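
For localhost:1521 to be reachable from the host, the Oracle listener has to be forwarded out of the cluster first (the service name oracle-db is an assumption based on the manifest):

  # expose Oracle's listener on the host for Power BI
  kubectl port-forward svc/oracle-db 1521:1521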
