Releases: EducationalTestingService/rsmtool

RSMTool 8.1.1

03 Jun 22:42
a0db298

This is a bugfix release with some minor improvements.

  • Continuous integration build for RSMTool migrated from Travis CI to GitLab CI.

  • Minor bug fixed in parse_json_with_comments to handle URLs correctly.

  • Minor updates to warnings and documentation.
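The URL fix above touches a common pitfall: a naive comment stripper treats the // in http://... as the start of a comment. The following is a minimal sketch of one way to avoid that, not RSMTool's actual implementation. The helper name strip_json_comments and the sample values are hypothetical.

```python
import json
import re

# Match either a complete JSON string literal (group 1, kept as-is)
# or a // line comment or /* block comment */ (both dropped).
_TOKEN = re.compile(
    r'("(?:\\.|[^"\\])*")'   # group 1: string literal, including escapes
    r'|//[^\n]*'             # line comment up to the end of the line
    r'|/\*.*?\*/',           # block comment, possibly spanning lines
    re.DOTALL,
)


def strip_json_comments(text):
    """Remove comments while leaving string contents (such as URLs) intact."""
    return _TOKEN.sub(lambda m: m.group(1) or "", text)


raw = '''
{
    // a line comment
    "description": "see http://example.com/docs", /* not part of the value */
    "model": "LinearRegression"
}
'''

config = json.loads(strip_json_comments(raw))
print(config["description"])
```

Because string literals are matched first, any // inside a quoted value is consumed as part of the string and never reaches the comment branches.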

RSMTool 8.1.0

03 Mar 16:37
e77b5cf

This is a minor but backwards-incompatible release which includes changes necessary to make RSMTool compatible with SKLL v2.5.

What's new

  • RSMTool is now compatible with SKLL 2.5!

💥 Breaking Changes 💥

  • Python 3.6 is no longer officially supported since the latest versions of pandas and numpy have dropped support for it. RSMTool officially supports Python 3.7, 3.8, and 3.9.

  • RSMTool no longer supports .xls files. For users who use Excel to prepare their data, we continue to support .xlsx files.

  • Models trained with earlier versions of RSMTool can no longer be used to generate predictions. If you use rsmpredict or compute_and_save_predictions to generate predictions based on existing models, you will need to re-train the models.

RSMTool 8.0.2

30 Sep 17:25
29b575d

This is a bugfix release with some minor improvements.

  • The version of nbconvert used by RSMTool is now pinned to <6.0 due to a change in v6.0 and above that broke RSMTool report generation. We will remove the pin in a future release when the upstream issue is fixed.

  • RSMTool reports no longer display a pie chart for the model coefficients if any of the coefficients are negative.

  • Minor updates for compatibility with external packages.

  • Minor updates to warnings and documentation.

RSMTool 8.0.1

07 Aug 21:45
bdfeba9

This is a bugfix release with some minor improvements.

  • Update the code for compatibility with pandas 1.1.0.

  • prmse_true no longer raises an error if there are no double-scored responses. Instead the function displays a warning and returns None.

  • Command line tools rsmtool, rsmeval, rsmpredict, rsmcompare and rsmsummarize no longer raise an error if a user does not provide any command line arguments. Instead the tools display the help message.

  • Minor updates to documentation.

  • Improvements to the testing and coverage measurement process.

RSMTool 8.0

11 May 14:20
0dcc479

This is a major new release. It includes a lot of new functionality and multiple changes to the API.

⚡️ RSMTool 8.0 is backwards incompatible with previous versions ⚡️

💡 New features 💡

Dependencies

  • RSMTool is now compatible with SKLL v2.1.

  • All dependencies other than skll are now unpinned.

  • RSMTool now supports Python versions 3.6, 3.7 and 3.8.

Interactive generation of configuration files

  • Configuration files for rsmtool, rsmeval, rsmpredict, rsmcompare and rsmsummarize can now be generated automatically, either interactively or non-interactively. This exciting new functionality makes it easier to keep track of the many configuration options available in RSMTool and greatly simplifies the process of setting up the experiment. Watch the video demonstrating the new interactive generation or read the documentation.

Passing hyperparameters to SKLL models

  • It is now possible to pass custom hyperparameter values to skll learners used through RSMTool. This is done using a new configuration field skll_fixed_parameters. The parameters are also displayed in the report.
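As an illustration, a configuration using this field might look like the sketch below. Only the field names come from RSMTool; the experiment name, file paths, learner choice, and parameter value are placeholders, and the assumption that skll_fixed_parameters maps parameter names to values should be checked against the documentation.

```python
import json

# All values are placeholders chosen for illustration.
config = {
    "experiment_id": "svr_fixed_c",   # hypothetical experiment name
    "train_file": "train.csv",        # hypothetical path
    "test_file": "test.csv",          # hypothetical path
    "model": "SVR",                   # a SKLL learner
    # Assumed format: hyperparameter names mapped to the values to fix.
    "skll_fixed_parameters": {"C": 0.01},
}

config_text = json.dumps(config, indent=4)
print(config_text)
```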

Generalized version of PRMSE

  • The formula for PRMSE has been updated to a more general version derived by Matthew S. Johnson that allows computation of PRMSE for any number of raters. For two raters, the formula returns the same result as the formula used in previous versions of the tool.

  • The API now provides a new function prmse_true() which accepts scikit-learn style parameters and returns the PRMSE value.

  • It is now possible to supply error variance of human raters necessary to compute PRMSE. This can be useful when the experiments require computing this parameter on data other than the evaluation set. This can be done via the rater_error_variance field in the configuration file or by passing the variance as a parameter to prmse_true().
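To make the role of the rater error variance concrete, here is a minimal sketch of the familiar two-rater special case in which every response has two human scores. It is not RSMTool's prmse_true() and does not implement the general formula; the function name prmse_two_raters and the sample data are hypothetical.

```python
from statistics import mean, variance


def prmse_two_raters(h1, h2, system, rater_error_variance=None):
    """Two-rater PRMSE sketch; assumes every response has both human scores."""
    n = len(system)
    if rater_error_variance is None:
        # For two scores a and b the sample variance is (a - b)^2 / 2;
        # averaging over responses estimates the rater error variance.
        rater_error_variance = sum((a - b) ** 2 for a, b in zip(h1, h2)) / (2 * n)

    # Average human score for each response.
    h_mean = [(a + b) / 2 for a, b in zip(h1, h2)]

    # The mean of two ratings carries half of the single-rater error variance,
    # so that amount is removed from both the variance and the MSE.
    var_true = variance(h_mean) - rater_error_variance / 2
    mse_true = mean((s - h) ** 2 for s, h in zip(system, h_mean)) - rater_error_variance / 2

    return 1 - mse_true / var_true


h1 = [1, 2, 3, 4, 5]
h2 = [1, 3, 3, 4, 4]
system = [1.2, 2.4, 3.1, 3.9, 4.6]

# Error variance estimated from the same double-scored data ...
print(prmse_two_raters(h1, h2, system))
# ... or supplied from elsewhere, e.g. estimated on a different data set.
print(prmse_two_raters(h1, h2, system, rater_error_variance=0.15))
```

Passing the variance explicitly mirrors the use case described above, where the parameter has to be computed on data other than the evaluation set.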

Changes to RSMTool reports

  • The report now always displays the headers for the "Consistency" and "True score evaluations" sections. If no second score is available, the report will indicate this. If you do not want these section headers to appear in your report, use the general_sections field to exclude these sections. TIP: If you use automatic configuration generation, your configuration file will contain the full list of available sections, which you can edit to exclude unnecessary sections.

💥 Incompatible Changes 💥

File formats

  • rsmcompare and rsmsummarize no longer support experiments that were generated with earlier versions of RSMTool. You will need to re-run the experiments that you want to compare or summarize.

  • rsmtool no longer supports old-style configuration files (not used since v5.5 or earlier).

  • rsmtool no longer supports feature files in .json format (not used since v5.5 or earlier).

  • The intermediate file containing true score evaluations (true_score_eval) no longer contains the variance of human scores. This information can still be obtained from consistency files.

API Changes

  • The Configuration and ConfigurationParser objects in the configuration_parser module have been fully refactored. A new Configuration object can now be instantiated using a dictionary whose keys have the same names as the fields in the configuration file. Validation and normalization are now done as part of initialization. See this PR for more detail.

  • Configuration objects no longer have a filepath attribute. Use the configdir attribute to indicate what any relative paths in the dictionary are relative to.

  • Functions in the erstwhile rsmtool.utils module have been moved to new locations. This includes several functions for computing evaluation metrics (agreement, difference_of_standardized_means, partial_correlations, quadratic_weighted_kappa, and standardized_mean_difference). See the API documentation for the new location of these functions.

  • The API for computing PRMSE has changed. See the API documentation for new functions.

🛠 Bugfixes & Improvements 🛠

  • v7.1.0 did not allow run_* functions to accept pathlib.Path objects for paths to configuration files. This is now allowed.

  • Error messages and warnings produced by RSMTool are now more meaningful and consistent.

  • Multiple changes to improve code readability and consistency.

RSMTool 7.1

24 Feb 22:57
0eb9a96

This is a minor release which includes changes necessary to make RSMTool compatible with SKLL 2.0.

What's new

  • RSMTool is now compatible with SKLL 2.0.

  • The implementation of scipy.stats.pearsonr used in RSMTool to compute Pearson's correlation coefficient has changed. The new implementation is equivalent to the old one in the majority of cases but tends to produce slightly different values for very small N. See #343 for further detail.

  • If you use the Dash app on macOS, you can now download the complete RSMTool documentation for offline use. Go to Dash preferences, click on "Downloads", then "User Contributed", and search for "RSMTool".

  • The conda package for RSMTool is now available from the official ETS conda channel.

API changes

  • The run_experiment, run_evaluation, run_comparison, run_summary, and compute_and_save_predictions functions now accept Python dictionaries as input.

  • The .filepath attribute of the Configuration object will be deprecated in a future version and replaced with two new attributes: configdir and filename. Use join(configdir, filename) if you need the full path to the configuration file.
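The path reconstruction mentioned above can be sketched as follows. A SimpleNamespace stands in for a real Configuration object so that the snippet runs without RSMTool installed; only the attribute names configdir and filename come from these notes, and the path values are placeholders.

```python
from os.path import basename, join
from types import SimpleNamespace

# Stand-in for a Configuration object with the two new attributes.
config = SimpleNamespace(
    configdir="/home/user/experiments",  # hypothetical directory
    filename="config_rsmtool.json",      # hypothetical file name
)

# Equivalent of the old .filepath attribute.
full_path = join(config.configdir, config.filename)
print(full_path)
```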

Other

  • Minor changes to the documentation.
  • Many functions used for tests have been refactored for efficiency.

RSMTool 7.0

19 Dec 20:43
07f9126

This is a major release which includes changes to several key evaluation metrics computed by RSMTool.

What's new

The exact definitions of all evaluation metrics and their method of computation are now available in

Changes to evaluation metrics

  • Quadratic weighted kappa (QWK) for raw, raw_trim, scale and scale_trim scores is now computed on continuous score values using the formula suggested by Haberman (2019). In previous versions of RSMTool, such continuous score values were rounded before computing QWK.

  • Subgroup differences are now evaluated using a new metric, "Difference in standardized means". This metric was designed to be more robust to differences in scale between human and machine scores.

  • SMD for human-human agreement is now computed using pooled standard deviation of H1 and H2 for the double-scored sample in the denominator.

  • The default tolerance for score postprocessing is now set to 0.4998 (instead of 0.49998). This may result in small changes to the values of all evaluation metrics for raw_trim and scale_trim scores. See below for new configuration settings if you need to define a custom tolerance.
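For readers who want to see what QWK on continuous scores looks like, the sketch below uses the commonly cited form 1 - MSE / (var_h + var_s + (mean_h - mean_s)^2). It is an illustration rather than RSMTool's own code: the function name qwk_continuous is hypothetical, and the use of population variances is an assumption.

```python
from statistics import mean, pvariance


def qwk_continuous(human, system):
    """Quadratic weighted kappa computed directly on continuous scores."""
    # Numerator: mean squared difference between the two sets of scores.
    mse = mean((h - s) ** 2 for h, s in zip(human, system))
    # Denominator: both variances plus the squared difference of the means.
    denominator = (
        pvariance(human) + pvariance(system) + (mean(human) - mean(system)) ** 2
    )
    return 1 - mse / denominator


human = [1.0, 2.0, 3.0, 4.0, 3.0]
system = [1.3, 2.1, 2.8, 3.6, 3.2]
print(qwk_continuous(human, system))
```

No rounding is applied to the machine scores, which is the difference from the earlier behavior described above.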

New evaluation metrics

New configuration settings

  • A new configuration setting experiment_names for RSMSummarize allows specifying custom names for each experiment. These will be used to refer to the experiments in intermediate output files and in the report.

  • A new configuration setting trim_tolerance allows specifying custom tolerance when trimming scores to ceiling and floor values in RSMTool and RSMEval.

  • A new configuration setting min_n_per_group allows defining a threshold so that only groups with more than a certain number of members are included in the report. All groups are still included in the intermediate output files.
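The effect of trim_tolerance can be sketched as a simple clipping step. This is an illustration under the assumption that trimming clips scores to the range from (floor - tolerance) to (ceiling + tolerance); the function name trim_scores is hypothetical and is not the RSMTool API.

```python
def trim_scores(scores, trim_min, trim_max, trim_tolerance=0.4998):
    """Clip scores to [trim_min - trim_tolerance, trim_max + trim_tolerance]."""
    low = trim_min - trim_tolerance
    high = trim_max + trim_tolerance
    return [min(max(score, low), high) for score in scores]


# On a 1-6 scale, values outside the widened range are pulled back to its
# edges, while values already inside the range are left unchanged.
trimmed = trim_scores([0.2, 3.0, 6.9], trim_min=1, trim_max=6)
print(trimmed)
```

With a tolerance just under 0.5, a trimmed score still rounds to a value inside the original scale.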

Other new functionality

API changes

Bugfixes

  • The partial_correlations() function has been updated to return a correctly formatted matrix in a situation where the covariance matrix is very close to zero.

  • The reports have been updated to correctly display plots for features with very long names.

v6.1.0

20 Dec 16:19
afde3cc

This is a major release which includes a number of improvements aimed primarily at increasing the flexibility of the RSMTool API.

What's New

New functionality

  • RSMTool now supports input files in SAS SAS7BDAT format.

  • New learner NNLRIterative. This is a new built-in linear regression model that learns empirical OLS regression weights with feature selection using an iterative implementation of non-negative least squares regression.

  • Custom truncation thresholds. The user can now remove outliers using pre-existing truncation thresholds specified in the features file by using the field use_truncation_thresholds.

  • Users can now run the .ipynb notebook generated from the experiment interactively, without having to set any environment variables. Each experiment now generates a (hidden) environment JSON file, which the notebook will automatically read.

API changes

  • There is now a separate function utils.standardized_mean_difference() that can be used to compute SMD.

  • A new function reader.try_to_load_file() allows API users to specify what should happen if a file cannot be loaded. The function can be set to return None, to raise a warning, or to raise an error.

  • DataContainer class now includes additional helper methods. These methods allow users to drop() and rename() data frames in the DataContainer, and to select data frames using a specified prefix or suffix with the get_frames() method.

  • The Configuration class now includes the additional helper methods pop() and copy().

  • utils.get_thumbnail_as_html() now accepts an optional argument path_to_thumbnail which allows using two different paths for thumbnails and full-size images.

Other

  • Support for seaborn 0.9.0 and statsmodels 0.9.0.

  • Support for numpy 1.14.0, scipy 1.1.0, and pandas 0.23.0+.

  • Support for ipython 6.5.0 and notebook 5.7.2.

  • The documentation previously misstated the order of operations in the processing pipeline. It now correctly states that the change of feature sign (if applicable) happens after standardization.

  • If the user specifies a list of features and one of such features has zero variance, the tool now displays the correct error message.

  • The logging messages displayed by check_flag_column now indicate the partition if different flag columns were used for training and evaluating the model.

  • Miscellaneous minor bug fixes in the notebooks.

Version 6.0.1

11 May 19:35
152aecc

This is a bugfix release.

  • The "System Information" section of the reports now uses pkg_resources instead of pip to get the list of installed packages since pip disallows the use of its internal API starting with v10.
  • Fix incorrect formatting in the documentation.
  • Update ipython and notebook package versions in order to address an incompatibility issue with the latest version of the tornado web server that affects interactive use of ipython notebook but not the report generation itself.
  • Updated the description of the marginal/partial correlation plot in the report.

Version 6.0

28 Feb 22:53
6e9f2e0

What's new?

This is a major release. The entire code base has been fully refactored to use a much more object-oriented design. This should make it much easier to make improvements and to add extensions. As a result, there have been significant changes to the RSMTool API (see link in documentation below for more details).

New features

New learners

  • New regressors from the latest SKLL release (v1.5.1) have been added to rsmtool.

  • rsmtool can now be used with both regressors and classifiers from SKLL, including classifiers that produce probabilistic output which can be used to produce expected values as predictions.

    See the SKLL documentation for the full list of learners.
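The idea of turning probabilistic classifier output into an expected value can be sketched as a probability-weighted sum over the score labels. The function name expected_score and the sample probabilities are hypothetical; the snippet illustrates the arithmetic only, not how RSMTool or SKLL implement it.

```python
def expected_score(probabilities):
    """Return the expected value for a mapping of score label -> probability."""
    return sum(label * prob for label, prob in probabilities.items())


# A classifier that assigns 20%, 50% and 30% probability to scores 1, 2 and 3.
prediction = expected_score({1: 0.2, 2: 0.5, 3: 0.3})
print(prediction)
```

The result is a continuous value between the lowest and highest label, so it can be treated like a regressor's prediction in later evaluations.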

Enhanced outputs

  • Users can now specify the file_format configuration option to save intermediate files in either tsv, csv, or xlsx format.
  • Users can specify a use_thumbnails configuration option that will embed clickable thumbnails in the HTML report, rather than full-sized images. Upon clicking the thumbnails, full-sized images will be displayed in a new window. This is particularly useful for larger reports with many images, improving both the readability and the loading speed of such reports.
  • Reports for rsmtool, rsmeval, and rsmsummarize now contain a new section containing links to intermediate files (intermediate_file_paths.ipynb) so that users can now easily inspect these files from the report itself.

New configuration options

  • Users can now specify features in the configuration file as a list. When providing a list of features, signs or transformations cannot be specified. This makes creating configuration files for simple experiments much easier and faster.
  • Users can now specify a skll_objective for tuning the SKLL learners used in their experiments.
  • Users can now specify a flag_column_test configuration option to use different flags for the test file and the training file.
  • Users can now set the boolean option standardize_features to false if they do not want the feature values standardized. Standardization remains the default.
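Several of the options above can be combined in one configuration. The sketch below builds such a configuration as a Python dictionary and prints it as JSON. Only the field names come from RSMTool; the experiment name, file paths, and feature names are placeholders.

```python
import json

# All values are placeholders chosen for illustration.
config = {
    "experiment_id": "simple_experiment",    # hypothetical experiment name
    "train_file": "train.csv",               # hypothetical path
    "test_file": "test.csv",                 # hypothetical path
    "model": "LinearRegression",
    "features": ["grammar", "vocabulary"],   # feature list; no signs or transformations
    "standardize_features": False,           # turn off the default standardization
    "file_format": "csv",                    # tsv, csv, or xlsx
    "use_thumbnails": True,                  # clickable thumbnails in the HTML report
}

config_text = json.dumps(config, indent=4)
print(config_text)
```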

New evaluations

  • rsmtool and rsmeval now compute disattenuated correlations if the data includes two human scores.
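A disattenuated correlation corrects the human-machine correlation for the unreliability of the human scores. The sketch below uses the textbook form, dividing the human-machine correlation by the square root of the human-human correlation. It is an assumption that this matches RSMTool's computation in every detail, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def disattenuated_correlation(h1, h2, system):
    """Human-machine correlation divided by sqrt(human-human correlation)."""
    return pearson(h1, system) / sqrt(pearson(h1, h2))


h1 = [1, 2, 3, 4, 5]
h2 = [1, 3, 3, 4, 4]
system = [1.2, 2.4, 3.1, 3.9, 4.6]
print(disattenuated_correlation(h1, h2, system))
```

When the two human raters agree perfectly, the correction factor is 1 and the value equals the ordinary human-machine correlation.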

Code changes

  • New helper classes have been added to rsmtool, which allow easy reading, writing, and manipulation of multiple pandas data frames.
    • container.DataContainer(): A class to encapsulate multiple data frames.
    • reader.DataReader(): A class to read multiple tabular files into a DataContainer() object.
    • writer.DataWriter(): A class to write all data frames contained in a DataContainer() object to separate files, with a specified file extension.
  • The rsmtool module is now installable via pip, in addition to being installable with conda.
  • preprocessor.trim() can now take both numpy arrays and lists as inputs.

Bugfixes

  • Fixed warning in rsmcompare when computing summary evaluations.
  • Previously confusion matrices forced human scores to integers, while score distributions used the value "as is". Now both analyses use rounded human scores.
  • Length columns are now forced to numeric, if they are non-numeric.

Documentation