Methods
Let the Exam Similarity Score (ESS) of a pair of students denote a measure of the similarity of the two students' incorrect exam responses.
The ESS has a minimum of 0, which implies no identical incorrect responses, and a maximum that approaches 1, which implies identical and unique incorrect responses on all exam questions.
The ESS motivates a simple systematic approach for detecting exam similarity: compute the ESS across all pairs of students, sort the pairs in descending order of ESS, and investigate pairs with significantly large scores further as suspected cases of potential collaboration. However, a critical question arises: how does one determine an ESS threshold above which scores are deemed "significantly large"?
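As an illustration of this approach, the sketch below scores every pair of students and ranks the pairs in descending order. Because the exact ESS formula is not reproduced here, the scoring function is only an assumed stand-in consistent with the properties described above: for each question on which both students gave the same incorrect response, it adds 1 minus the fraction of the class that gave that response, and averages over all questions. The names exam_similarity_score, rank_pairs, responses, and answer_key are hypothetical.

```python
from itertools import combinations

def exam_similarity_score(a, b, answer_key, response_freqs):
    # Assumed stand-in for the ESS: for each question on which both students
    # gave the same incorrect response, add 1 minus the fraction of the class
    # that gave that response, then average over all questions.
    n = len(answer_key)
    total = 0.0
    for q in range(n):
        if a[q] == b[q] and a[q] != answer_key[q]:
            total += 1.0 - response_freqs[q][a[q]]
    return total / n

def rank_pairs(responses, answer_key):
    # responses: dict mapping student ID -> sequence of responses
    # answer_key: sequence of correct responses
    students = list(responses)
    n = len(answer_key)
    # Fraction of the class that gave each observed response to each question
    response_freqs = [
        {r: sum(responses[s][q] == r for s in students) / len(students)
         for r in {responses[s][q] for s in students}}
        for q in range(n)
    ]
    scores = {
        (s1, s2): exam_similarity_score(responses[s1], responses[s2],
                                        answer_key, response_freqs)
        for s1, s2 in combinations(students, 2)
    }
    # Sort pairs of students in descending order of score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For example, rank_pairs({"S1": "ABCD", "S2": "ABCD", "S3": "BBCA"}, "ABCA") would rank the pair ("S1", "S2") first, since those two students share the same incorrect response to the last question.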
In a hypothetical course with infinitely many students, assuming that the vast majority of possible pairs of students did not collaborate on an exam, we would expect the distribution of ESSs computed across all pairs of students in the class to fit the null distribution closely, with pairs of students who did collaborate appearing as outliers with larger ESSs than one would expect by chance. As the number of students decreases, the fit worsens, especially in the tails of the distribution due to reduced sampling. When Kernel Density Estimates (KDEs) of the distributions of Exam Similarity Scores for a given exam are plotted in log-scale, as expected, the tails of the distribution are noisy, but there is a consistent close-to-linear stretch for central values of the ESS distribution. The Probability Density Function (PDF) of an Exponential distribution with rate parameter λ and location parameter μ is the following:

f(x; λ, μ) = λ·e^(−λ(x − μ)) for x ≥ μ (and 0 otherwise)
Therefore, the log of the PDF of an Exponential distribution with rate parameter λ and location parameter μ is linear in x:

ln f(x; λ, μ) = ln(λ) − λ(x − μ) = (−λ)·x + (ln(λ) + λμ)

In other words, the log-PDF is a line with slope −λ and intercept ln(λ) + λμ.
Thus, given a line y = m·x + b regressed from the near-linear segment of the log-KDE, the Exponential parameters can be estimated as λ = −m and μ = (b − ln(λ)) / λ.
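For instance, with a hypothetical fitted line of slope m = −25 and intercept b = 4, the estimates would be λ = −m = 25 and μ = (4 − ln 25) / 25 ≈ 0.031.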
This motivates a simple approach for computing theoretical p-values for ESS values computed from all pairs of students (a code sketch of these steps follows the list below):
- Compute the KDE of the distribution of ESSs
- Regress a line from the near-linear segment of the log-KDE
- Estimate the Exponential parameters from the line
- Compute p-values from the Exponential distribution
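Under these assumptions, a minimal sketch of the steps above might look like the following, using SciPy's gaussian_kde and NumPy's polyfit. The grid size and the quantile bounds used to select the near-linear segment of the log-KDE are illustrative assumptions rather than values prescribed by the method.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fit_exponential_from_log_kde(ess_values, lower_q=0.25, upper_q=0.75):
    # Estimate Exponential(lambda, mu) from the near-linear central segment
    # of the log-KDE of the ESS distribution.
    ess_values = np.asarray(ess_values, dtype=float)
    kde = gaussian_kde(ess_values)

    # Evaluate the KDE on a grid restricted to the central quantile range,
    # where the log-density is expected to be close to linear
    lo, hi = np.quantile(ess_values, [lower_q, upper_q])
    grid = np.linspace(lo, hi, 200)
    log_density = np.log(kde(grid))

    # Regress a line y = m*x + b through the log-KDE segment
    m, b = np.polyfit(grid, log_density, 1)

    # From log f(x) = ln(lambda) + lambda*mu - lambda*x:
    #   slope m = -lambda, intercept b = ln(lambda) + lambda*mu
    lam = -m
    mu = (b - np.log(lam)) / lam
    return lam, mu
```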
The statistical test is one-sided: specifically, the p-value associated with a given ESS x is the probability of observing an ESS greater than or equal to x purely by chance. Therefore, the p-value for a given ESS x is simply 1 minus the Cumulative Distribution Function (CDF) of Exponential(λ, μ) evaluated at x, i.e., the area under the PDF over the range X ≥ x.
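In code, this tail probability is the survival function of the fitted Exponential distribution; the sketch below assumes SciPy's parameterization, in which loc corresponds to μ and scale corresponds to 1/λ.

```python
from scipy.stats import expon

def ess_p_value(x, lam, mu):
    # P(X >= x) under Exponential(rate=lam, location=mu),
    # i.e. exp(-lam * (x - mu)) for x >= mu, and 1 for x < mu
    return expon.sf(x, loc=mu, scale=1.0 / lam)
```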
When we compute a p-value for every possible pair of students and check each p-value for statistical significance, we are performing multiple simultaneous hypothesis tests. To control the False Discovery Rate (FDR), we can apply a multiple-testing correction (e.g. Benjamini-Hochberg, or the more conservative Bonferroni correction) to compute an adjusted p-value, also known as a q-value. The resulting q-values can be compared against a statistical significance threshold, e.g. q ≤ 0.05, to provide an automated similarity detection algorithm.
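As one possible implementation of this final step, the sketch below applies the Benjamini-Hochberg procedure via statsmodels' multipletests; the function name flag_suspicious_pairs and the input format (a dict mapping student pairs to p-values) are assumptions.

```python
from statsmodels.stats.multitest import multipletests

def flag_suspicious_pairs(pair_p_values, alpha=0.05):
    # pair_p_values: dict mapping (student1, student2) -> p-value
    pairs = list(pair_p_values)
    pvals = [pair_p_values[p] for p in pairs]
    # Benjamini-Hochberg correction; the adjusted p-values serve as q-values
    reject, qvals, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return [(pair, q) for pair, q, flagged in zip(pairs, qvals, reject) if flagged]
```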
Niema Moshiri 2021