[GSoC Proposal] ScienceBox Monitoring, SWAN, CERN (HSF#694)
* [GSoC Proposal] ScienceBox Monitoring, SWAN, CERN

* ScienceBoxMonitor: Fix logo

* ScienceBoxMonitor: move proposal to _gsocproposals/2020
Enrico Bocchi authored Feb 5, 2020
1 parent 648a4a2 commit 90ee878
Showing 4 changed files with 58 additions and 0 deletions.
11 changes: 11 additions & 0 deletions _gsocprojects/2020/project_SWAN.md
@@ -0,0 +1,11 @@
---
project: SWAN
layout: default
logo: SWAN-logo.png
description: |
[SWAN](https://cern.ch/swan) (Service for Web-based ANalysis) is a CERN service that allows users to perform interactive data analysis in the cloud, in a "software as a service" model. It is built upon the widely-used Jupyter notebooks, allowing users to write - and run - their data analysis using only a web browser. By connecting to SWAN, users have immediate access to storage, software and computing resources (like Spark clusters) that CERN provides, and that they need to do their analyses.
summary: |
SWAN is a CERN service that allows users to perform interactive data analysis in the cloud, built upon the widely-used Jupyter notebooks and CERN technologies for storage and software access.
---

{% include gsoc_project.ext %}
45 changes: 45 additions & 0 deletions _gsocproposals/2020/proposal_ScienceBoxMonitor.md
@@ -0,0 +1,45 @@
---
title: Health Checks and Monitoring Dashboard for ScienceBox
project: SWAN
layout: gsoc_proposal
year: 2020
organization: CERN
---

## Description

[ScienceBox](https://sciencebox.web.cern.ch/) is a comprehensive set of services for cloud storage and computing applications suitable for both general-purpose use cases and advanced scientific scenarios.
It provides [EOS](https://eos.web.cern.ch/), the CERN software for massive distributed storage in the cloud, [CERNBox](http://cernbox.web.cern.ch), the cloud storage, synchronization and sharing service for science, and [SWAN](https://swan.web.cern.ch), a fully-fledged interactive data analysis platform accessible from a web browser.
ScienceBox is available in two flavors:
1) A one-click setup for demonstration purposes running on a single machine with docker-compose, and
2) A production-oriented deployment with the ability to scale out according to storage and computing needs based on Kubernetes.


## Task ideas

This project focuses on:
* Implementing health checks for ScienceBox services running in containers. This ranges from basic probing (e.g., check network sockets/running processes, issue HTTP requests) to the development of custom scripts for functional tests verifying the expected behavior of services.
* Managing application-level logs through sidecar containers running log parsing and ingestion agents (e.g., [Apache Flume](https://flume.apache.org/)). Ingested metrics should be collected in a centralized storage backend (e.g., [Elasticsearch](https://www.elastic.co/products/elasticsearch) or [InfluxDB](https://www.influxdata.com/products/influxdb-overview/)).
* Exploring collected metrics interactively (e.g., via [Timber](https://timber.io/) or [Kibana](https://www.elastic.co/products/kibana)) and creating visualization dashboards (e.g., time series and heatmaps in [Grafana](https://grafana.com/)) for ScienceBox administrators.
* Complementing the set of containers provided by ScienceBox with the ones required for log ingestion, storage, and processing/visualization.

Applicants are encouraged to re-use widely adopted technologies as much as possible.
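
As an illustration of the basic probing mentioned in the first task, a minimal sketch in Python (standard library only) might look like the following. The function names, timeouts, and the idea of grouping probes in a dictionary are assumptions made for this example; they are not part of ScienceBox.

```python
import socket
import urllib.error
import urllib.request


def check_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_http(url, timeout=2.0):
    """Return True if a GET request to url is answered with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors (4xx/5xx), network failures, and malformed URLs.
        return False


def run_checks(checks):
    """Run named zero-argument probes and return {name: healthy}.

    A probe that raises is reported as unhealthy instead of aborting the run.
    """
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    return results
```

A wrapper script could call `run_checks` with probes for each service (hosts, ports, and URLs would be deployment-specific) and exit with a non-zero status when any probe fails, which is the convention that both a Docker `HEALTHCHECK` command and a Kubernetes exec probe rely on.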


## Expected results
* A working implementation of health checks and sidecar containers for log ingestion.
* An updated version of ScienceBox with additional containers required to run dedicated services for log inspection and visualization.
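
To give a feel for the log-parsing step a sidecar would perform, here is a small Python sketch that turns text log lines into JSON records ready for ingestion. It is a stand-in for a real agent such as Apache Flume, and the line layout (`<timestamp> <LEVEL> <service>: <message>`) is a hypothetical format chosen for the example, not the actual format of any ScienceBox service.

```python
import json
import re

# Hypothetical layout: "<timestamp> <LEVEL> <service>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w.-]+):\s+(?P<message>.*)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


def to_json_lines(lines):
    """Yield one JSON document per parseable line, skipping the rest."""
    for line in lines:
        record = parse_line(line)
        if record is not None:
            yield json.dumps(record, sort_keys=True)
```

Emitting one JSON document per line keeps the parser independent of the storage backend: the same output could be forwarded to Elasticsearch or converted to InfluxDB points by a separate shipping step.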

## Requirements
* Python, Shell scripting
* Basic understanding of file systems, HTTP protocol, and process monitoring
* Experience with Docker containers and Kubernetes
* Experience with Helm charts is a plus

## Mentors
* [Enrico Bocchi](mailto:[email protected])
* [Diogo Castro](mailto:[email protected])

## Links
* [SWAN](https://swan.web.cern.ch/)
* [ScienceBox](https://sciencebox.web.cern.ch/)
2 changes: 2 additions & 0 deletions gsoc/2020/mentors.md
@@ -14,11 +14,13 @@ layout: plain
* Riccardo Maria Bianchi [[email protected]](mailto:[email protected]) University of Pittsburgh
* Jakob Blomer [[email protected]](mailto:[email protected]) CERN
* Ken Bloom [[email protected]](mailto:[email protected]) University of Nebraska-Lincoln
* Enrico Bocchi [[email protected]](mailto:[email protected]) CERN
* Brian Bockelman [[email protected]](mailto:[email protected]) Morgridge Institute for Research
* Andy Buckley [[email protected]](mailto:[email protected]) University of Glasgow
* Jon Butterworth [[email protected]](mailto:[email protected]) UCL
* Yvan Calas [[email protected]](mailto:[email protected]) CC-IN2P3
* Philippe Canal [[email protected]](mailto:[email protected]) Fermilab
* Diogo Castro [[email protected]](mailto:[email protected]) CERN
* Vasco Chibante Barroso [[email protected]](mailto:[email protected]) CERN
* Louie Corpe [[email protected]](mailto:[email protected]) UCL
* Ulrik Egede [[email protected]](mailto:[email protected]) Monash University
Binary file added images/SWAN-logo.png