This repository contains the NIST validation and scoring code components for the DSE and D3M evaluations. The DSE evaluation can be found at dse.nist.gov.
Running the tests requires Python 3.6.
Requires a local copy of the directory for the problem/dataset, containing:
- the dataset schema at path_to_score_root/dataset_TEST/datasetDoc.json
- the problem schema at path_to_score_root/problem_TEST/problemDoc.json
- the test learningData.csv at path_to_score_root/dataset_TEST/tables/learningData.csv

Download the seed datasets at dse.nist.gov. Each problem/dataset has a SCORE folder that contains this structure.
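As a quick sanity check (not part of the package; a minimal sketch assuming the layout above), you can verify that a directory has the required files:

import os

def looks_like_score_dir(path_to_score_root):
    """Sketch: check that the files required above exist under the SCORE root."""
    required = [
        os.path.join('dataset_TEST', 'datasetDoc.json'),
        os.path.join('problem_TEST', 'problemDoc.json'),
        os.path.join('dataset_TEST', 'tables', 'learningData.csv'),
    ]
    return all(os.path.isfile(os.path.join(path_to_score_root, f)) for f in required)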
This package works with Python 3.6+ and requires the d3m core package.
To install the latest released version:
$ pip install git+https://github.com/usnistgov/dval.git@master
To install a particular release of the package, e.g., v2018.4.28:
$ pip install git+https://github.com/usnistgov/[email protected]
To install the latest development (unreleased) version:
$ pip install git+https://github.com/usnistgov/dval.git@develop
dval valid_pipelines pipeline_log_file [pipeline_log_file ...]
Parameters:
pipeline_log_file
: path to the pipeline log file to validate
For example:
dval valid_pipelines mylog1.json mylog2.json
In shells like bash, you can also do: dval valid_pipelines *.json
dval valid_predictions -d score_dir predictions_file [predictions_file ...]
Parameters:
score_dir
: path to the directory described in the Requirements section. Use the SCORE directory of the seed datasets.

predictions_file
: path to the predictions file to validate
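For example, using the 185_baseball seed's SCORE directory (paths are illustrative, matching the test data used later in this README):

dval valid_predictions -d test/data/185_baseball_SCORE test/data/185_baseball_SCORE/mitll_predictions.csv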
dval score -d score_dir [-g ground_truth_file] [--validation | --no-validation] predictions_file [predictions_file ...]
Parameters:
score_dir
: path to the directory described in the Requirements section. Use the SCORE directory of the seed datasets.

ground_truth_file
: path to the ground truth file. If absent, defaults to score_dir/targets.csv

predictions_file
: path to the predictions file to score

--validation | --no-validation
: validation is on by default; turn it off with --no-validation
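For example (paths are illustrative, matching the test data used later in this README):

dval score -d test/data/185_baseball_SCORE -g test/data/185_baseball_SCORE/targets.csv test/data/185_baseball_SCORE/mitll_predictions.csv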
dval valid_generated_problems problems_directory

Parameters:
problems_directory
: path to the directory containing the generated problems.

For example:
dval valid_generated_problems ./test/generated_problems/correct_submission/
Build the Docker image from the Dockerfile:
git checkout v2018.4.20 # getting a specific version of the code
docker build -t dval .
Usage is the same as the CLI, run through a Docker container, but the host data directory must be mounted into the container. For example, to validate a predictions.csv file:
docker run -v /hostpath/to/data:/tmp/data dval valid_predictions -d /tmp/data/SCORE /tmp/data/predictions.csv
path_to_score_root = 'test/data/185_baseball_SCORE'
groundtruth_path = 'test/data/185_baseball_SCORE/targets.csv'
result_file_path = 'test/data/185_baseball_SCORE/mitll_predictions.csv'
Option 1: Using the Predictions class
>>> from dval.predictions import Predictions
>>> p = Predictions(result_file_path, path_to_score_root)
>>> p.is_valid()
True
>>> scores = p.score(groundtruth_path)
>>> scores
[Score(target='Hall_Of_Fame', metric='f1', scorevalue=0.691369766848)]
>>> scores[0].scorevalue
0.691369766848
where the Score object is a named tuple defined as follows:
import collections
Score = collections.namedtuple('Score', ['target', 'metric', 'scorevalue'])
If a problem schema describes multiple targets and/or multiple metrics, the .score() function will return a list of Score objects, one for each (target, metric) combination.
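For instance, to print every score in that case (a minimal sketch using the Score fields defined above):

for s in scores:
    print(s.target, s.metric, s.scorevalue)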
Option 2: Using the wrapper functions
>>> from dval.predictions import is_predictions_file_valid, score_predictions_file
>>> is_predictions_file_valid(result_file_path, path_to_score_root)
True
>>> scores = score_predictions_file(result_file_path, path_to_score_root, groundtruth_path)
>>> scores
[Score(target='Hall_Of_Fame', metric='f1', scorevalue=0.691369766848)]
>>> scores[0].scorevalue
0.691369766848
The checks that the validation code performs on the predictions file include:
- checking that the file exists and is readable
- checking the header (it must be indexName, targetName1, [targetName2, ...])
- checking the target types (against the dataset schema data field types)
- checking the length of the index
- comparing the index with the expected index
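For illustration only (this is not the package's internal code), the header check amounts to something like:

import csv

def header_is_valid(predictions_path, expected_header):
    # expected_header is [indexName, targetName1, ...] taken from the schemas
    with open(predictions_path, newline='') as f:
        header = next(csv.reader(f))
    return header == expected_header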
>>> from dval.pipeline_logs_validator import Pipeline
>>> Pipeline('path/to/my.json').is_valid()
True
The checks that the validation code performs on the pipeline log files include:
- checking that the file exists and is readable
- checking that the file is correct JSON
- checking for all required fields
- checking that primitives is a JSON list with no duplicates
- checking that pipeline_rank is an integer
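As a rough sketch of the last two checks (again, not the package's actual implementation; it assumes the primitives entries are hashable values such as strings):

import json

def check_pipeline_log(path):
    with open(path) as f:
        log = json.load(f)  # raises if the file is not correct JSON
    primitives = log['primitives']
    assert isinstance(primitives, list), 'primitives must be a JSON list'
    assert len(primitives) == len(set(primitives)), 'primitives must have no duplicates'
    assert isinstance(log['pipeline_rank'], int), 'pipeline_rank must be an integer'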
We have a test suite built on the pytest package, with code coverage measured by coverage; both can be installed with pip.

To run all tests: pytest
The following commands run all of the unit tests and output code coverage into htmlcov/index.html:
coverage run --branch --source=./dval -m pytest -s test/ -v
coverage report -m
coverage html
Docs of the latest version of the master branch are available here (inside NIST only): https://d3m_g.ipages.nist.gov/dval
Docs are built using Sphinx and autodoc with the following commands, run from the repository root:
sphinx-apidoc -o docs/api dval
sphinx-build -b html docs/ html_docs
The generated web docs can then be opened at html_docs/index.html.
License
The license is documented in the LICENSE file and on the NIST website.
Versions and releases:
See:
- the repository tags for all releases (on both the GitLab and GitHub hosts)
- the CHANGELOG file for a history of the releases
- the version field in setup.cfg
Contact:
Please send any issues, questions, or comments to [email protected]