Maestro v 1.1.9dev1 Compatibility (#388)
Maestro up to date compatibility 
Also unpacked maestro DAG to just use what we need, which should help reduce task message size and perhaps allow us to use other serializers in the future.
bgunnar5 authored Dec 15, 2022
1 parent c2d2c2b commit 27e51e7
Showing 15 changed files with 752 additions and 46 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/push-pr_workflow.yml
@@ -75,7 +75,7 @@ jobs:

strategy:
matrix:
python-version: ['3.6', '3.7', '3.8', '3.9', '3.10']
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v2
@@ -158,7 +158,7 @@ jobs:

strategy:
matrix:
python-version: ['3.6', '3.7', '3.8', '3.9', '3.10']
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v2
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [unreleased]
### Added
- Added support for Python 3.11
- Update docker docs for new rabbitmq and redis server versions
- Added lgtm.com Badge for README.md
- More fixes for lgtm checks.
@@ -18,8 +19,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Capability for non-user block in yaml
- .readthedocs.yaml and requirements.txt files for docs
- Small modifications to the Tutorial, Getting Started, Command Line, and Contributing pages in the docs
- Compatibility with the newest version of Maestro (v. 1.1.9dev1)
- JSON schema validation for Merlin spec files
- New tests related to JSON schema validation
- Instructions in the "Contributing" page of the docs on how to add new blocks/fields to the spec file
- Brief explanation of the $(LAUNCHER) variable in the "Variables" page of the docs

### Changed
- Removed support for Python 3.6
- Rename lgtm.yml to .lgtm.yml
- New shortcuts in specification file (sample_vector, sample_names, spec_original_template, spec_executed_run, spec_archived_copy)
- Changed "default" user password to be randomly generated by default.
@@ -29,6 +36,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added ssl to the broker and results backend server checks when "merlin info" is called
- Removed theme_override.css from docs/_static/ since it is no longer needed with the updated version of sphinx
- Updated docs/Makefile to include a pip install for requirements and a clean command
- Changed what is stored in a Merlin DAG
- We no longer store the entire Maestro ExecutionGraph object
- We now only store the adjacency table and values obtained from the ExecutionGraph object
- Modified how spec files are verified
- Updated requirements to require maestrowf 1.1.9dev1 or later

### Fixed
- Fixed return values from scripts with main() to fix testing errors.
@@ -38,6 +50,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Including .temp template files in MANIFEST
- Styling in the footer for docs
- Horizontal scroll overlap in the variables page of the docs
- Reordered small part of Workflow Specification page in the docs in order to put "samples" back in the merlin block

## [1.8.5]
### Added
84 changes: 84 additions & 0 deletions docs/source/merlin_developer.rst
@@ -90,3 +90,87 @@ Merlin has style checkers configured. They can be run from the Makefile:
.. code-block:: bash

    $ make check-style

Adding New Features to YAML Spec File
+++++++++++++++++++++++++++++++++++++

In order to conform to Maestro's verification format introduced in Maestro v1.1.7,
we now use `json schema <https://www.json-schema.org>`_ validation to verify our spec
file.

If you are adding a new feature to Merlin that requires a new block within the yaml spec
file or a new property within a block, then you are going to need to update the
merlinspec.json file located in the merlin/spec/ directory. You also may want to add
additional verifications within the specification.py file located in the same directory.

.. note::

    If you add custom verifications beyond the pattern checking that the json schema
    performs, then you should also add tests for these verifications in the test_specification.py
    file located in the merlin/tests/unit/spec/ directory. Follow the steps for adding new
    tests in the docstring of the TestCustomVerification class.

Adding a New Property
*********************

To add a new property to a block in the yaml file, you need to create a
template for that property and place it in the correct block in merlinspec.json. For
example, say I wanted to add a new property called ``example`` that's an integer within
the ``description`` block. I would modify the ``description`` block in the merlinspec.json file to look
like this:

.. code-block:: json

    "DESCRIPTION": {
      "type": "object",
      "properties": {
        "name": {"type": "string", "minLength": 1},
        "description": {"type": "string", "minLength": 1},
        "example": {"type": "integer", "minimum": 1}
      },
      "required": ["name", "description"]
    }

If you need help with json schema formatting, check out the `step-by-step getting
started guide <https://json-schema.org/learn/getting-started-step-by-step.html>`_.

That's all that is required to add a new property. If you want to add your own custom
verifications, make sure to create unit tests for them (see the note above for more info).
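
To see how a property such as ``example`` behaves, the schema fragment can be exercised directly with the Python ``jsonschema`` package. This is a minimal sketch rather than Merlin's own verification path: it assumes ``jsonschema`` is installed, treats the value of the ``DESCRIPTION`` entry as a standalone schema, and uses made-up block contents. The ``schema_errors`` helper is hypothetical.

```python
from jsonschema import Draft7Validator

# The value of the "DESCRIPTION" entry, used here as a standalone schema
# (an assumption made for illustration).
description_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "description": {"type": "string", "minLength": 1},
        "example": {"type": "integer", "minimum": 1},
    },
    "required": ["name", "description"],
}

validator = Draft7Validator(description_schema)


def schema_errors(block):
    """Return the validation error messages for a description block."""
    return sorted(error.message for error in validator.iter_errors(block))


# Hypothetical description blocks, as they would look after YAML parsing.
good_block = {"name": "demo", "description": "a demo study", "example": 3}
bad_block = {"name": "demo", "description": "a demo study", "example": 0}

print(schema_errors(good_block))  # no errors expected: 3 satisfies "minimum": 1
print(schema_errors(bad_block))   # one error expected: 0 is below the minimum
```

Because ``example`` is not listed under ``required``, a block that omits it still validates; only a present but invalid value is rejected.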

Adding a New Block
******************

Adding a new block is slightly more complicated than adding a new property. You will not
only have to update the merlinspec.json schema file but also add calls to verify that
block within specification.py.

To add a block to the json schema, you will need to define the template for that entire
block. For example, if I wanted to create a block called ``country`` with two
properties labeled ``name`` and ``population`` that are both required, it would look like so:

.. code-block:: json

    "COUNTRY": {
      "type": "object",
      "properties": {
        "name": {"type": "string", "minLength": 1},
        "population": {
          "anyOf": [
            {"type": "string", "minLength": 1},
            {"type": "integer", "minimum": 1}
          ]
        }
      },
      "required": ["name", "population"]
    }

Here, ``name`` can only be a string but ``population`` can be either a string or an integer.
For help with json schema formatting, check out the `step-by-step getting started guide
<https://json-schema.org/learn/getting-started/step-by-step.html>`_.
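
When writing a new block by hand, it is easy to list a name under ``required`` that was never declared under ``properties``. The sketch below is a small, standard-library-only sanity check for that mistake, applied to a ``COUNTRY`` schema matching the description above. JSON Schema itself permits such a mismatch, so this is a convenience check and not part of Merlin; the ``undeclared_required`` helper is hypothetical.

```python
import json

# A "COUNTRY" schema with both properties required, wrapped in braces so
# that it parses as a complete JSON document.
schema_text = """
{
  "COUNTRY": {
    "type": "object",
    "properties": {
      "name": {"type": "string", "minLength": 1},
      "population": {
        "anyOf": [
          {"type": "string", "minLength": 1},
          {"type": "integer", "minimum": 1}
        ]
      }
    },
    "required": ["name", "population"]
  }
}
"""


def undeclared_required(block_schema):
    """Return required names that have no entry under "properties"."""
    declared = block_schema.get("properties", {})
    return [name for name in block_schema.get("required", []) if name not in declared]


schema = json.loads(schema_text)
for block_name, block_schema in schema.items():
    missing = undeclared_required(block_schema)
    status = "ok" if not missing else f"undeclared required names: {missing}"
    print(f"{block_name}: {status}")
```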

The next step is to enable this block in the schema validation process. To do this we need to:

#. Create a new method called verify_<your_block_name>() within the MerlinSpec class
#. Call the YAMLSpecification.validate_schema() method provided to us via Maestro in your new method
#. Add a call to verify_<your_block_name>() inside the verify() method

If you add your own custom verifications on top of this, please add unit tests for them.
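
The three steps above can be sketched as follows. This is an illustration of the wiring only, under stated assumptions: ``YAMLSpecificationStub`` is a hypothetical stand-in for Maestro's ``YAMLSpecification`` so the example runs without Maestro installed, the ``(parent_key, instance, schema)`` argument order is assumed and should be checked against the installed Maestro version, and ``MerlinSpecSketch`` is not the real ``MerlinSpec`` class.

```python
class YAMLSpecificationStub:
    """Hypothetical stand-in for Maestro's YAMLSpecification.

    The real class performs JSON schema validation; this stub only records
    each call. The (parent_key, instance, schema) argument order is an
    assumption.
    """

    calls = []

    @staticmethod
    def validate_schema(parent_key, instance, schema):
        YAMLSpecificationStub.calls.append((parent_key, instance, schema))


class MerlinSpecSketch:
    """Illustrative subset of a spec class; not the real MerlinSpec."""

    def __init__(self, country, schema):
        self.country = country  # parsed contents of the new "country" block
        self.schema = schema    # parsed contents of merlinspec.json

    # Step 1: a verify_<your_block_name>() method for the new block.
    def verify_country(self, block_schema):
        # Step 2: delegate the schema check to the Maestro-provided method.
        YAMLSpecificationStub.validate_schema("country", self.country, block_schema)

    def verify(self):
        # Step 3: call the new method from verify(), passing the block's schema.
        self.verify_country(self.schema["COUNTRY"])


spec = MerlinSpecSketch(
    country={"name": "Exampleland", "population": 1000},  # made-up values
    schema={"COUNTRY": {"type": "object"}},  # abbreviated schema
)
spec.verify()
print(YAMLSpecificationStub.calls)
```

Keeping one ``verify_<block>()`` method per block means ``verify()`` stays a flat list of calls, and any custom verification for a block has an obvious place to live.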
33 changes: 17 additions & 16 deletions docs/source/merlin_specification.rst
@@ -277,6 +277,23 @@ see :doc:`./merlin_variables`.
batch:
type: local
machines: [host3]
###################################################
# Sample definitions
#
# samples file can be one of
# .npy (numpy binary)
# .csv (comma delimited: '#' = comment line)
# .tab (tab/space delimited: '#' = comment line)
###################################################
samples:
column_labels: [VAR1, VAR2]
file: $(SPECROOT)/samples.npy
generate:
cmd: |
python $(SPECROOT)/make_samples.py -dims 2 -n 10 -outfile=$(INPUT_PATH)/samples.npy "[(1.3, 1.3, 'linear'), (3.3, 3.3, 'linear')]"
level_max_dirs: 25
####################################
# User Block (Optional)
####################################
@@ -327,19 +344,3 @@ see :doc:`./merlin_variables`.
print "OMG is this in python2? Change is bad."
print "Variable X2 is $(X2)"
shell: /usr/bin/env python2
###################################################
# Sample definitions
#
# samples file can be one of
# .npy (numpy binary)
# .csv (comma delimited: '#' = comment line)
# .tab (tab/space delimited: '#' = comment line)
###################################################
samples:
column_labels: [VAR1, VAR2]
file: $(SPECROOT)/samples.npy
generate:
cmd: |
python $(SPECROOT)/make_samples.py -dims 2 -n 10 -outfile=$(INPUT_PATH)/samples.npy "[(1.3, 1.3, 'linear'), (3.3, 3.3, 'linear')]"
level_max_dirs: 25
33 changes: 30 additions & 3 deletions docs/source/merlin_variables.rst
@@ -113,9 +113,10 @@ Reserved variables
.. code-block:: bash
for path in $(MERLIN_PATHS_ALL)
do
ls $path
done
do
ls $path
done
-
::

@@ -159,6 +160,32 @@ Reserved variables

$(MERLIN_INFO)/*.expanded.yaml

The ``LAUNCHER`` Variable
+++++++++++++++++++++++++

``$(LAUNCHER)`` is a special case of a reserved variable since its value *can* be changed.
It serves as an abstraction for launching a job with parallel schedulers like :ref:`slurm<slurm>`,
:ref:`lsf<lsf>`, and :ref:`flux<flux>`, and it can be used within a step command. For example,
say we start with this run cmd inside our step:

.. code:: yaml

    run:
        cmd: srun -N 1 -n 3 python script.py

We can modify this to use the ``$(LAUNCHER)`` variable like so:

.. code:: yaml

    batch:
        type: slurm

    run:
        cmd: $(LAUNCHER) python script.py
        nodes: 1
        procs: 3

In other words, the ``$(LAUNCHER)`` variable would become ``srun -N 1 -n 3``.
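
The substitution can be pictured with a few lines of Python. This is a sketch of the idea only, not Merlin's implementation: ``expand_launcher`` is a hypothetical helper, and it handles just the slurm case described above instead of guessing at the flags other schedulers would need.

```python
def expand_launcher(cmd, batch_type, nodes, procs):
    """Replace $(LAUNCHER) in cmd with a scheduler launch prefix.

    Only the slurm case from the example above is handled; other
    scheduler types are left out rather than guessed.
    """
    if batch_type != "slurm":
        raise NotImplementedError(f"no launcher sketch for batch type {batch_type!r}")
    launcher = f"srun -N {nodes} -n {procs}"
    return cmd.replace("$(LAUNCHER)", launcher)


expanded = expand_launcher("$(LAUNCHER) python script.py", "slurm", nodes=1, procs=3)
print(expanded)  # srun -N 1 -n 3 python script.py
```

Writing the step with ``$(LAUNCHER)`` keeps the node and process counts in the ``nodes`` and ``procs`` fields, so the same command can be reused when the batch type changes.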

User variables
-------------------
2 changes: 1 addition & 1 deletion merlin/spec/defaults.py
@@ -32,7 +32,7 @@

BATCH = {"batch": {"type": "local", "dry_run": False, "shell": "/bin/bash"}}

ENV = {"env": {"variables": {}, "sources": {}, "labels": {}, "dependencies": {}}}
ENV = {"env": {"variables": {}, "sources": [], "labels": {}, "dependencies": {}}}

STUDY_STEP_RUN = {"task_queue": "merlin", "shell": "/bin/bash", "max_retries": 30}
