website: docs rewrite
Alex Buchanan authored and buchanae committed Nov 15, 2017
1 parent cd6fbd9 commit a24327e
Showing 45 changed files with 1,868 additions and 573 deletions.
10 changes: 5 additions & 5 deletions website/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@ baseURL: https://ohsu-comp-bio.github.io/funnel/
canonifyURLs: true
languageCode: en-us
title: Funnel

publishDir: docs
menu:
main:

- name: Reference
url: https://godoc.org/github.com/ohsu-comp-bio/funnel
weight: 30
- name: Reference
parent: Development
url: https://godoc.org/github.com/ohsu-comp-bio/funnel
weight: 30
51 changes: 0 additions & 51 deletions website/content/_index.md
Original file line number Diff line number Diff line change
@@ -1,62 +1,11 @@
---
Demo:
- Title: Start Funnel
Cmd: $ funnel server run

- Title: Run a task
Desc: Returns a task ID.
Cmd: |
$ funnel run 'md5sum $src' -c ubuntu --in src=~/src.txt
b41pkv2rl6qjf441avd0
- Title: Get the task
Desc: Returns state, logs, and more.
Cmd: $ funnel task get b41pkv2rl6qjf441avd0

- Title: List all the tasks
Cmd: $ funnel task list

- Title: View the terminal dashboard
Cmd: $ funnel dashboard

# - Title: Move to the cloud.
# Desc: |
# Google, Amazon, Microsoft, HPC, and more.
# Cmd: |
# $ gcloud auth login
# $ funnel deploy gce
# $ funnel run 'md5sum' \
# --stdin gs://pub/input.txt \
# --stdout gs://my-bkt/output.txt

- Title: Use a remote server
Cmd: $ funnel run --server http://funnel.example.com ...

- Title: Example tasks
Cmd: |
$ funnel example list
$ funnel example hello-world
- Title: Get help
# Desc: The Funnel CLI is extensive.
Cmd: $ funnel help

# - Title: File a bug.
# Desc: It happens.
# Cmd: $ funnel bug

- Title: Get the code
Cmd: $ go get github.com/ohsu-comp-bio/funnel

# - Title: Hack together a workflow.
# Desc: Bash-fu. Hadouken!
# Cmd: |
# $ funnel run <<TPL
# TPL

# - Title: Use a workflow language.
# Desc: Level up with CWL and WDL.

---

Homepage content is written in layouts/index.html
102 changes: 60 additions & 42 deletions website/content/docs.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,62 +3,80 @@ title: Overview
menu:
main:
identifier: docs
weight: -100
weight: -1000
---

# Overview

Funnel aims to make batch processing tasks easier to manage by providing a simple
toolkit that supports a wide variety of cluster types. Our goal is to enable you
to spend less time worrying about task management and more time processing data.
Funnel makes distributed, batch processing easier by providing a simple task API and a set of
components which can be easily adapted to a variety of platforms.

## Background
### Task

### How Does Funnel Work?
A task defines a unit of work: metadata, input files to download, a sequence of Docker containers + commands to run,
output files to upload, state, and logs. The API allows you to create, get, list, and cancel tasks.
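A minimal task document can be sketched in Python. The field names below follow the GA4GH TES schema; the bucket URL and file paths are hypothetical placeholders, not values from this document:

```python
import json

# A minimal TES-style task: download an input file, run md5sum in an
# Ubuntu container, and upload the captured stdout as an output.
# The URLs and paths here are hypothetical placeholders.
task = {
    "name": "md5sum example",
    "inputs": [
        {"url": "s3://my-bkt/src.txt", "path": "/inputs/src.txt"}
    ],
    "outputs": [
        {"url": "s3://my-bkt/out.txt", "path": "/outputs/out.txt"}
    ],
    "executors": [
        {
            "image": "ubuntu",
            "command": ["md5sum", "/inputs/src.txt"],
            "stdout": "/outputs/out.txt",
        }
    ],
}

print(json.dumps(task, indent=2))
```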

Funnel is a combination of a server and worker processes. First, you define a task.
A task describes input and output files, (Docker) containers and commands, resource
requirements, and some other metadata. You send that task to the Funnel server,
which puts it in the queue until a worker is available. When an appropriate Funnel
worker is available, it downloads the inputs, executes the commands in (Docker)
containers, and uploads the outputs.
Tasks are accessed via the `funnel task` command. There's an HTTP client in the [client package][clientpkg],
and a set of utilities and a gRPC client in the [proto/tes package][tespkg].
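Under the hood, these clients wrap a small set of HTTP routes. A rough sketch of the TES-style route shapes, with a hypothetical server address:

```python
# Rough sketch of the TES-style HTTP routes the clients wrap.
# The server address is a hypothetical placeholder.
base = "http://funnel.example.com/v1/tasks"

def create_url():
    return base                        # POST a task document here

def get_url(task_id):
    return f"{base}/{task_id}"         # GET task state and logs

def list_url():
    return base                        # GET the task list

def cancel_url(task_id):
    return f"{base}/{task_id}:cancel"  # POST to cancel a task

print(get_url("b41pkv2rl6qjf441avd0"))
```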

Funnel also comes with some tools related to managing workers and tasks. There's
a dashboard, a scheduler, an autoscaler, some rudimentary workflow tools, and more.
There's a lot more you can do with the task API. See the [tasks docs](/docs/tasks/) for more.

### Why Does Funnel Exist?
### Server

Here at OHSU Computational Biology, a typical project involves coordinating dozens
of tasks across hundreds of CPUs in a cluster of machines in order to process hundreds
of files. That's standard fare for most computational groups these days, and for some
groups it's "thousands" or "millions" instead of "hundreds".
The server serves the task API, web dashboard, and optionally runs a task scheduler.
It serves both HTTP/JSON and gRPC/Protobuf.

Because we're part of a worldwide scientific community, it's important that we're able
to easily share our work. If we create a variant calling pipeline with 50 steps,
we need people outside OHSU to run that pipeline easily and efficiently.
The server is accessible via the `funnel server` command and the [server package][serverpkg].

There's a long list of projects making great strides in the tools we use to tackle
this type of work, but they have a common problem. Every group of users has grown
a different set of tools for managing and interacting with their cluster. Some use
HTCondor and NFS. Some use Open Grid Engine and Lustre. Some prefer cloud providers,
but which one? Google? Amazon? Each cluster comes with a different interface to learn
(and a new set of problems to debug too).
### Storage

Tool authors usually end up writing (and hopefully maintaining) a set of
compute and storage plugins for each type of cluster. Many authors don't have
time for that, and their tools end up being limited to their environment.
Some tools were never meant to be shared, instead they were originally just
a prototype or a set of helper scripts for working with AWS instances.
Storage provides access to file systems such as S3, Google Storage, and local filesystems.
Tasks define locations where files should be downloaded from and uploaded to. Workers handle
the downloading/uploading.
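The mapping between storage locations and container paths can be sketched like this. Bucket names are hypothetical, and which URL schemes are usable depends on the storage backends configured:

```python
# Sketch: each task input/output pairs a storage URL with a path
# inside the container. Bucket names here are hypothetical.
inputs = [
    {"url": "gs://pub/input.txt", "path": "/inputs/input.txt"},
]
outputs = [
    {"url": "s3://my-bkt/output.txt", "path": "/outputs/output.txt"},
]

def scheme(url):
    # The URL scheme selects the storage client (gs, s3, file, ...).
    return url.split("://", 1)[0]

for f in inputs + outputs:
    print(scheme(f["url"]), "->", f["path"])
```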

The [GA4GH Task Execution Schemas][tes] (TES) group aims to ease these problems by
designing a simple API for data processing tasks that can be easily layered on top of,
or plugged into, most existing clusters. Funnel started as the first
implementation of the TES API.
See the [storage docs](/docs/storage/) for more information on configuring storage backends.
The storage clients are available in the [storage package][storagepkg].

Funnel aims to ease these problems. Our goal is to enable easy management of tasks
and tools that need to work across many types of clusters.
### Worker

A worker is responsible for executing a task. There is one worker per task. A worker:

- downloads the inputs
- runs the sequence of executors (usually via Docker)
- uploads the outputs

Along the way, the worker writes logs to event streams and databases:

- start/end time
- state changes (initializing, running, error, etc)
- executor start/end times
- executor exit codes
- executor stdout/err logs
- a list of output files uploaded, with sizes
- system logs, such as host name, docker command, system error messages, etc.
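The worker lifecycle above can be sketched as a simple loop. All names here are illustrative, not Funnel's actual internals:

```python
# Illustrative sketch of a worker's lifecycle: download inputs, run
# each executor in order, upload outputs, and log events along the
# way. Function and event names are hypothetical, not Funnel's.
def run_task(task, log):
    log("state", "initializing")
    for f in task["inputs"]:
        log("download", f["url"])
    log("state", "running")
    for i, ex in enumerate(task["executors"]):
        log("executor-start", i)
        # ... run ex["command"] in the ex["image"] container ...
        log("executor-exit", 0)
    for f in task["outputs"]:
        log("upload", f["url"])
    log("state", "complete")

events = []
run_task(
    {"inputs": [], "outputs": [],
     "executors": [{"image": "ubuntu", "command": ["true"]}]},
    lambda kind, value: events.append((kind, value)),
)
print(events)
```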

The worker is accessible via the `funnel worker` command and the [worker package][workerpkg].

### Node Scheduler

A node is a service that stays online and manages a pool of task workers. A Funnel cluster
runs a node on each VM. Nodes communicate with a Funnel scheduler, which assigns tasks
to nodes based on available resources. Nodes start a worker for each assigned task.

Nodes aren't always required. In some cases it makes sense to rely on an existing,
external system for scheduling tasks and managing cluster resources, such as AWS Batch
or HPC systems like HTCondor, Slurm, Grid Engine, etc. Funnel integrates with
these services without using nodes or the Funnel scheduler.

See [Deploying a cluster](/docs/compute/deployment/) for more information about running a cluster of nodes.

The node is accessible via the `funnel node` command and the [scheduler package][schedpkg].

[galaxy]: https://galaxyproject.org/
[cwl]: http://commonwl.org/
[wdl]: https://software.broadinstitute.org/wdl/
[tes]: https://github.com/ga4gh/task-execution-schemas
[serverpkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/server
[workerpkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/worker
[schedpkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/compute/scheduler
[clientpkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/client
[tespkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/proto/tes
[storagepkg]: https://github.com/ohsu-comp-bio/funnel/tree/master/storage
8 changes: 8 additions & 0 deletions website/content/docs/compute.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
---
title: Compute
menu:
main:
weight: -5
---

# Compute
Original file line number Diff line number Diff line change
@@ -1,19 +1,19 @@
---
title: AWS Deployment

title: AWS Batch
menu:
main:
parent: guides
parent: Compute
weight: 20
---

# Amazon Web Services

# Amazon Batch

This guide covers deploying a Funnel server that leverages [DynamoDB][0] for storage
and [Batch][1] for task execution. You'll need to set up several resources
using either the Funnel CLI or through the provided Amazon web console.

## Create Required AWS Batch Resources
### Create Required AWS Batch Resources

For Funnel to execute tasks on Batch, you must define a Compute Environment,
Job Queue and Job Definition. Additionally, you must define an IAM role for your
Expand Down Expand Up @@ -132,6 +132,10 @@ Worker:
Secret: ""
```
### Known issues
Disk size and host volume management require extra setup. The `Task.Resources.DiskGb` field does not have any effect. See [issue 317](https://github.com/ohsu-comp-bio/funnel/issues/317).

[0]: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html
[1]: http://docs.aws.amazon.com/batch/latest/userguide/what-is-batch.html
[2]: http://docs.aws.amazon.com/batch/latest/userguide/Batch_GetStarted.html#first-run-step-2
Expand Down
94 changes: 94 additions & 0 deletions website/content/docs/compute/deployment.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
---
title: Deploying a cluster
menu:
main:
parent: Compute
weight: -50
---

# Deploying a cluster

This guide describes the basics of starting a cluster of Funnel nodes.
This guide is a work in progress.

A node is a service
which runs on each machine in a cluster. The node connects to the Funnel server and reports
available resources. The Funnel scheduler process assigns tasks to nodes. When a task is
assigned, a node will start a worker process. There is one worker process per task.

Nodes aren't always required. In some cases it makes sense to rely on an existing,
external system for scheduling tasks and managing cluster resources, such as AWS Batch,
HTCondor, Slurm, Grid Engine, etc. Funnel provides integration with
these services without using nodes or the scheduler.

### Usage

Nodes are available via the `funnel node` command. To start a node, run
```
funnel node run --config node.config.yml
```

To activate the Funnel scheduler, use the `manual` backend in the config.

The available scheduler and node config:
```
# Activate the Funnel scheduler.
Backend: manual
Scheduler:
# How often to run a scheduler iteration.
# In nanoseconds.
ScheduleRate: 1000000000 # 1 second
# How many tasks to schedule in one iteration.
ScheduleChunk: 10
# How long to wait between updates before marking a node dead.
# In nanoseconds.
NodePingTimeout: 60000000000 # 1 minute
# How long to wait for a node to start, before marking the node dead.
# In nanoseconds.
NodeInitTimeout: 300000000000 # 5 minutes
# Node config.
Node:
# If empty, a node ID will be automatically generated using the hostname.
ID: ""
# Files created during processing will be written in this directory.
WorkDir: ./funnel-work-dir
# If the node has been idle for longer than the timeout, it will shut down.
# -1 means there is no timeout. 0 means timeout immediately after the first task.
Timeout: -1
# A Node will automatically try to detect what resources are available to it.
# Defining Resources in the Node configuration overrides this behavior.
Resources:
# CPUs available.
# Cpus: 0
# RAM available, in GB.
# RamGb: 0.0
# Disk space available, in GB.
# DiskGb: 0.0
# For low-level tuning.
# How often to sync with the Funnel server.
# In nanoseconds.
UpdateRate: 5000000000 # 5 seconds
# RPC timeout for update/sync call.
# In nanoseconds.
UpdateTimeout: 1000000000 # 1 second
Logger:
# Logging levels: debug, info, error
Level: info
# Write logs to this path. If empty, logs are written to stderr.
OutputFile: ""
```
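Since the duration fields are plain nanosecond integers, it can help to compute them rather than count zeros. A small helper, not part of Funnel:

```python
# Helper for computing the nanosecond duration values used in the
# config above. Not part of Funnel; it just avoids counting zeros.
def ns(seconds):
    return int(seconds * 1_000_000_000)

print(ns(1))    # ScheduleRate: 1 second -> 1000000000
print(ns(60))   # NodePingTimeout: 1 minute -> 60000000000
print(ns(300))  # NodeInitTimeout: 5 minutes -> 300000000000
```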

### Known issues

The config uses nanoseconds for duration values. See [issue #342](https://github.com/ohsu-comp-bio/funnel/issues/342).
Original file line number Diff line number Diff line change
@@ -1,15 +1,13 @@
---
title: Open Grid Engine

title: Grid Engine
menu:
main:
parent: guides
parent: Compute
weight: 20
---
# Grid Engine

# Open Grid Engine

Funnel can be configured to submit workers to [Open Grid Engine][ge] by making calls
Funnel can be configured to submit workers to [Grid Engine][ge] by making calls
to `qsub`.

The Funnel server process needs to run on the same machine as the Grid Engine master.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,12 +1,10 @@
---
title: HTCondor

menu:
main:
parent: guides
parent: Compute
weight: 20
---

# HTCondor

Funnel can be configured to submit workers to [HTCondor][htcondor] by making
Expand Down
Original file line number Diff line number Diff line change
@@ -1,12 +1,10 @@
---
title: PBS/Torque

menu:
main:
parent: guides
parent: Compute
weight: 20
---

# PBS/Torque

Funnel can be configured to submit workers to [PBS/Torque][pbs] by making calls
Expand Down