Tutorial
Before using MESS, the student responses to your exam need to be formatted in the tab-delimited (TSV) format supported by MESS, which is the following:
Student | Correct | Q1 | Q2 | Q3 | ... |
---|---|---|---|---|---|
[email protected] | Q1,Q3 | A | blue | 42 | ... |
... | ... | ... | ... | ... | ... |
- The top row is a header row, and each subsequent row corresponds to a single student
- The leftmost column (labeled "Student" in the example above) should contain a unique student identifier (e.g. email address, student ID, etc.)
- The second leftmost column (labeled "Correct" in the example above) should contain a comma-separated list of questions the student answered correctly
- Each remaining column (labeled "Q1", "Q2", etc. in the example above) should correspond to a single exam question: cell (i,j) of the spreadsheet (excluding the two leftmost columns and the top row) should be the response student i submitted for question j
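As an illustration of the layout described above, here is a minimal Python sketch that writes a file in this format from in-memory data. The function name `write_mess_tsv`, the dictionary shapes, the exact-match grading, and the use of an empty cell for an unanswered question are all assumptions made for this example. They are not part of MESS or its helper scripts.

```python
import csv
import io


def write_mess_tsv(responses, answer_key, out):
    """Write responses in the tab-delimited layout described above.

    responses:  {student_id: {question: response}}  (hypothetical shape)
    answer_key: {question: correct_response}        (hypothetical shape)
    out:        any writable text file object
    """
    questions = list(answer_key)  # column order follows the answer key
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["Student", "Correct"] + questions)
    for student, answers in responses.items():
        # Comma-separated list of questions this student answered correctly
        # (assumption: a response is correct if it equals the key exactly)
        correct = [q for q in questions if answers.get(q) == answer_key[q]]
        # Assumption: an unanswered question is written as an empty cell
        cells = [answers.get(q, "") for q in questions]
        writer.writerow([student, ",".join(correct)] + cells)


answer_key = {"Q1": "A", "Q2": "red", "Q3": "42"}
responses = {
    "student_1": {"Q1": "A", "Q2": "blue", "Q3": "42"},
    "student_2": {"Q1": "B", "Q2": "red", "Q3": "41"},
}
buf = io.StringIO()
write_mess_tsv(responses, answer_key, buf)
print(buf.getvalue())
```

To write to disk instead, pass a file opened with `open("my_exam.tsv", "w", newline="")` in place of the `StringIO` buffer.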
We will provide helper scripts to convert from the export formats supported by popular platforms. These scripts can be found in the `helper_scripts` folder of the MESS GitHub repository. If you are using a platform that exports exam responses in a format that is not easily converted to the format supported by MESS, please feel free to submit a GitHub Issue that includes the following information:
- Name and details of the platform you are using
- Example anonymized file(s) in the format exported by the platform you are using
- Description of the file format
We will try to add a conversion script accordingly. If you have written your own conversion script and would like to contribute, please feel free to share it with us via Pull Request or GitHub Issue (whichever is easiest for you).
To get an initial idea of how the similarity score distribution looks, first run MESS on your dataset using the default settings, specifying the input TSV file (`-i/--input`), the output TSV file (`-ot/--output_tsv`), and the output PDF file (`-op/--output_pdf`). For example, using the provided `example_input.tsv` (assuming `MESS.py` is in your `PATH`):
MESS.py -i example_input.tsv -ot output.tsv -op output.pdf
The output PDF should look something like the following:
- The dashed curve is the Kernel Density Estimate (KDE) of the similarity score distribution from your exam responses
- The solid line is the Probability Density Function (PDF) of the best-fit Exponential distribution
In its default settings, MESS fits the Exponential distribution PDF on the entirety of the similarity score KDE. However, as can be seen in the example above, outliers skew the best-fit Exponential distribution PDF, whereas we want to fit this theoretical distribution on a close-to-linear segment of the empirical similarity score KDE. Try to find a window in which the similarity score KDE is roughly linear, such as between 0.025 and 0.12 (shown in red) in the example dataset:
Rerun MESS, but this time specify the minimum and maximum score for the regression (`-rm/--reg_min` and `-rM/--reg_max`, respectively). For example, using the range [0.025, 0.12]:
MESS.py -i example_input.tsv -ot output_2.tsv -op output_2.pdf -rm 0.025 -rM 0.12
As can be seen, the best-fit Exponential distribution's PDF now aligns much more nicely with the near-linear segment of the empirical similarity score distribution's KDE:
You may need to keep iterating on this step until you are satisfied with the fit. Note that the regression only impacts the p-values (and thus corrected q-values): it does not impact the similarity scores themselves whatsoever.
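To see why the window matters, recall that an Exponential PDF, f(x) = λ·exp(-λx), is a straight line with slope -λ when the Y-axis is log-scaled. The sketch below shows one plausible way to fit such a line to density values inside a window and to turn the resulting rate into a tail probability. It is an illustration under stated assumptions, not MESS's actual implementation: the function names, the least-squares fit on log-density, and the synthetic data are all hypothetical.

```python
import math


def fit_exponential_rate(xs, densities, reg_min, reg_max):
    """Least-squares line through (x, log density) for reg_min <= x <= reg_max.

    Returns (rate, intercept), where rate is the negated slope.
    Assumption: this is only one plausible fitting approach.
    """
    pts = [(x, math.log(d)) for x, d in zip(xs, densities)
           if reg_min <= x <= reg_max and d > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the window")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return -slope, mean_y - slope * mean_x


def exponential_tail(score, rate):
    """P(X >= score) for X ~ Exponential(rate)."""
    return math.exp(-rate * score)


# Synthetic density: exactly exponential, plus an inflated value at the far right
true_rate = 30.0
xs = [i / 200 for i in range(1, 61)]            # 0.005, 0.010, ..., 0.300
densities = [true_rate * math.exp(-true_rate * x) for x in xs]
densities[-1] *= 50                              # outlier bump in the tail

rate_all, _ = fit_exponential_rate(xs, densities, 0.0, 1.0)
rate_win, _ = fit_exponential_rate(xs, densities, 0.025, 0.12)
print(rate_all, rate_win)
print(exponential_tail(0.2, rate_all), exponential_tail(0.2, rate_win))
```

In this synthetic example, the fit over the whole range is pulled toward the outlier and yields a smaller rate, so the same score receives a larger tail probability than it does with the windowed fit.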
Once you have found an appropriate range for the Exponential distribution regression, there are some optional adjustments you may want to perform.
While a MESS score and p-value are calculated for every pair of students, performing multiple significance tests requires Multiple Hypothesis Test Correction, and performing too many significance tests reduces statistical power. The MESS scores at the left end of the distribution are not interesting, so including them in the significance tests reduces statistical power without any benefit. As such, MESS will only perform significance tests on the top k MESS scores.
By default, MESS will automatically detect a reasonable value of k. Specifically, it will search for the smallest MESS score x > median such that the MESS score histogram has a count of 0 at x, and it will perform statistical tests on all k MESS scores that are greater than x. Alternatively, you can provide a value of k via `-nt/--num_tests`, which you will likely want to do if you find that the histogram has gaps earlier than you want to test (which is expected due to the long tail of the Exponential distribution).
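The automatic choice of k described above can be sketched roughly as follows. The function name `auto_num_tests`, the fixed bin width of 0.01, and the way scores are assigned to bins are assumptions for illustration. The histogram MESS actually builds may differ.

```python
import statistics


def auto_num_tests(scores, bin_width=0.01):
    """Find the first empty histogram bin to the right of the median's bin.

    Returns (k, threshold): threshold is the left edge of that empty bin and
    k is the number of scores above it. Returns (0, None) if there is no gap.
    Assumption: fixed-width bins starting at 0.
    """
    median = statistics.median(scores)
    occupied = {int(s // bin_width) for s in scores}
    max_bin = max(occupied)
    b = int(median // bin_width) + 1  # first bin to the right of the median's bin
    while b <= max_bin and b in occupied:
        b += 1
    if b > max_bin:
        return 0, None  # histogram has no empty bin above the median
    threshold = b * bin_width
    k = sum(1 for s in scores if s > threshold)
    return k, threshold


scores = [0.012, 0.015, 0.018, 0.019, 0.021, 0.024, 0.027,
          0.033, 0.036, 0.045, 0.052, 0.095, 0.143]
k, threshold = auto_num_tests(scores)
print(k, threshold)
```

With these made-up scores, the first empty bin above the median starts near 0.06, so only the two largest scores would be tested. A gap this early is the kind of case in which you may prefer to set k yourself.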
By default, to compute corrected q-values from p-values, MESS will use the Benjamini-Hochberg Procedure. You can adjust this selection via `-c/--correction`:
- `benjamini_hochberg`: Benjamini-Hochberg Procedure (default)
- `bonferroni`: Bonferroni Correction
- `none`: No Correction (q-values are equal to p-values)
  - This should only be used if you will be ignoring q-values entirely
Note that the selection of multiple hypothesis test correction technique will only impact the q-values in the output TSV file: the figure in the output PDF file will not be impacted.
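For readers unfamiliar with these corrections, the following sketch shows the standard textbook form of each. It treats the list passed in as the complete set of tests being corrected. The function names are chosen for this example, and this is not necessarily how MESS implements the corrections internally.

```python
def bonferroni(pvals):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

Bonferroni is the more conservative of the two, so its q-values are never smaller than the Benjamini-Hochberg q-values for the same input.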
The aesthetics of the figure in the output PDF can be configured via the following command-line arguments:
- `-kc/--kde_color`: KDE Color
- `-kl/--kde_linestyle`: KDE Linestyle
- `-kw/--kde_linewidth`: KDE Line Width
- `-rc/--reg_color`: Regression Color
- `-rl/--reg_linestyle`: Regression Linestyle
- `-rw/--reg_linewidth`: Regression Line Width
- `-t/--title`: Figure Title
- `-xl/--xlabel`: X-Axis Label
- `-yl/--ylabel`: Y-Axis Label
- `-xm/--xmin`: Minimum X
- `-xM/--xmax`: Maximum X
- `-ym/--ymin`: Minimum Y
- `-yM/--ymax`: Maximum Y
- `--no_ylog`: Don't Plot Y-Axis in Log-Scale
Niema Moshiri 2021