All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `Report` now allows for export to Pandas DataFrame thanks to @hotchpotch's contribution.
- `min-max norm` now allows for inverting min and max for distance scores normalization thanks to @MochiXu, @diegoceccarelli, and @AndreP-git.
- `Run` now has an additional property to store metrics standard deviation.
- `evaluate` now has a `return_std` flag to compute metrics standard deviation.
- `Qrels.from_df` now checks that scores are `numpy.int64` to avoid errors on Windows.
- `Run.from_df` now checks that scores are `numpy.float64` to avoid errors on Windows.
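As a rough illustration of the inverted min-max normalization mentioned above, here is a minimal sketch in plain Python. The function name `min_max_norm`, the `invert` flag, and the convention for all-equal scores are assumptions made for this example, not the library's actual API.

```python
def min_max_norm(scores, invert=False):
    """Rescale a {doc_id: score} dict to the [0, 1] range.

    With invert=True the minimum maps to 1 and the maximum to 0,
    which suits distance scores where lower means more similar.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    if span == 0:
        # All scores are equal (assumed convention: map everything to 1.0).
        return {doc_id: 1.0 for doc_id in scores}
    if invert:
        return {doc_id: (hi - s) / span for doc_id, s in scores.items()}
    return {doc_id: (s - lo) / span for doc_id, s in scores.items()}


# Distances: the smallest value should get the highest normalized score.
distances = {"d1": 2.0, "d2": 10.0, "d3": 6.0}
normalized = min_max_norm(distances, invert=True)
```

With inversion enabled, the closest document (`d1`) receives the top score, so distance-based runs can be fused with similarity-based ones.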
- All `Run` import methods allow for specifying the `name` of the run.
- Fixed misleading error messages when importing `Qrels` and `Run` from `pandas.DataFrame` with wrong `dtypes`.
- Added support for importing qrels from `parquet` files in `qrels.py`.
- Added support for importing runs from `parquet` files in `run.py`.
- Added support for exporting qrels as `pandas.DataFrame` in `qrels.py`.
- Added support for exporting runs as `pandas.DataFrame` in `run.py`.
- Added support for saving qrels as `parquet` files in `qrels.py`.
- Added support for saving runs as `parquet` files in `run.py`.
- Fixed `f1` when there are no relevant documents.
- Moved `numba` threading layer settings to `ranx/__init__.py`.
- Removed dependency on `pytrec_eval`.
- Added support for gzipped TREC files to `from_file` in `qrels.py`.
- Added support for gzipped TREC files to `from_file` in `run.py`.
- Added `name` parameter to `from_file` in `run.py`.
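To give an idea of what reading a gzipped TREC qrels file involves, here is a minimal standalone sketch using only the standard library. The helper `read_trec_qrels` is hypothetical and only illustrates the idea of choosing the opener from the file extension; it is not the library's `from_file` implementation.

```python
import gzip
import os
import tempfile


def read_trec_qrels(path):
    """Parse a TREC qrels file (q_id, iteration, doc_id, relevance per line).

    Hypothetical helper: uses gzip.open when the path ends with ".gz".
    """
    opener = gzip.open if path.endswith(".gz") else open
    qrels = {}
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            q_id, _, doc_id, rel = line.split()
            qrels.setdefault(q_id, {})[doc_id] = int(rel)
    return qrels


# Round trip through a temporary gzipped file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "qrels.trec.gz")
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write("q1 0 d1 1\nq1 0 d2 0\nq2 0 d3 2\n")
    loaded = read_trec_qrels(path)
```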
- Fixed `rank_biased_precision` considering relevance as binary instead of graded.
- Fixed high memory consumption for `qrels` and `run`.
- Fixed missing metric labels for `dcg` and `dcg_burges` in `report.py`.
- Added `dcg` and `dcg_burges` among the available metrics.
- Fixed missing dependency `seaborn`.
- Fixed a bug affecting the download of ranxhub runs with special symbols in their ids, such as `+`.
- Changed `save` in `ranxhub.py` to automatically save average metric scores.
- Fixed a bug affecting `make_comparable` in `run.py`: runs were not sorted after this operation, resulting in wrong metrics computation afterwards.
- It is now possible to plot the Interpolated Precision-Recall Curve.
- Added `make_comparable` to `run.py`. It makes a run comparable to a given qrels when the run misses results for queries appearing in the qrels or has results for additional queries, which are removed.
- Added `make_comparable` parameter to `evaluate.py`.
- Added `make_comparable` parameter to `compare.py`.
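The behavior of `make_comparable` described above can be sketched on plain dictionaries. This is an illustration only: the standalone function below and its choice of an empty result list for missing queries are assumptions, not the library's implementation.

```python
def make_comparable(run, qrels):
    """Return a new run restricted to the queries that appear in qrels.

    Queries missing from the run get an empty result list (assumed
    convention); queries not in qrels are dropped.
    """
    # Copy the inner dicts so the original run is left untouched.
    return {q_id: dict(run.get(q_id, {})) for q_id in qrels}


qrels = {"q1": {"d1": 1}, "q2": {"d2": 1}}
run = {"q1": {"d1": 0.9, "d3": 0.4}, "q3": {"d5": 0.7}}
comparable = make_comparable(run, qrels)
```

Here `q2` is added with no results and `q3` is removed, so the run and the qrels cover exactly the same queries.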
- Fixed a bug affecting `Tukey's HSD Test`: results from the test were not converted to proper dtypes from strings, causing the superscripts reporting statistically significant differences in `report.py` to be wrong.
- Changed `tukey_hsd_test.py` to use `tukey_hsd` provided by `scipy`.
- `ranx` now requires `python>=3.8`.
- `ranx` now requires `scipy>=1.8`.
- Removed dependency on `statsmodels`.
- Fixed a bug affecting `precision.py`, `recall.py`, and `f1.py`: `numba` does not raise `ZeroDivisionError`, so a check was added to make sure zero is returned when no retrieved results are provided for a specific query.
- Fixed a bug in `f1.py`: missing argument in function call.
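The zero-division guards described above can be sketched in plain Python as follows. The function signatures are simplified for illustration and are assumptions; the library's own metrics are compiled with `numba` and operate on different data structures.

```python
def precision(relevant, retrieved):
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0  # explicit guard instead of relying on an exception
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


def recall(relevant, retrieved):
    """Fraction of relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(relevant)


def f1(relevant, retrieved):
    """Harmonic mean of precision and recall, 0.0 when both are zero."""
    p = precision(relevant, retrieved)
    r = recall(relevant, retrieved)
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


no_results = f1({"d1", "d2"}, [])
no_relevants = f1(set(), ["d1"])
half = f1({"d1", "d2"}, ["d1", "d3"])
```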
Sorry, I have been lazy.
- Fixed a bug in `posfuse.py`: `numba` does not raise an out-of-bounds error in some specific cases, so a check was added to make sure ranking positions with no associated probability get 0 probability.
- Fixed a bug in `baysfuse.py`: as it uses log odds, which can be negative, `comb_sum` cannot be used. Added an `odds_sum` function to combine the log odds.
- Fixed a bug in `data_structures/common.py:sort_dict_by_value` that was preventing result list sorting from being consistent for documents with the same score.
- Fixed a bug causing original runs to be modified by fusion methods.
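One common way to obtain a consistent order for documents with the same score is to add a secondary sort key. The sketch below assumes ties are broken by document id; this is an illustrative choice, and the actual rule used by `sort_dict_by_value` may differ.

```python
def sort_by_score(results):
    """Sort a {doc_id: score} dict by descending score.

    Ties are broken by ascending doc_id (assumed rule), so the output
    does not depend on the insertion order of the input.
    """
    ordered = sorted(results.items(), key=lambda item: (-item[1], item[0]))
    return dict(ordered)


# Same content, different insertion order.
a = sort_by_score({"d3": 0.5, "d1": 0.5, "d2": 0.9})
b = sort_by_score({"d1": 0.5, "d2": 0.9, "d3": 0.5})
```

Both calls yield the same ranking, which keeps rank-based metrics reproducible across runs.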
- Fixed a bug in `max_norm.py`, `min_max_norm.py`, and `sum_norm.py`: `min` and `max` functions called on empty lists do not raise an error in `Numba`, causing downstream miscalculations.
- Fixed a bug in `bordafuse.py`: `get_candidates` raised an error if no run had retrieved docs for a given query.
- Fixed a bug in `borda_norm.py`: `get_candidates` raised an error if no run had retrieved docs for a given query.
- Fixed a bug in `condorcet.py`: `get_candidates` raised an error if no run had retrieved docs for a given query.
- Fixed a bug in `report.py:Report`: some metric labels were missing.
- `SciPy` version explicitly stated in `setup.py` to avoid errors.
- `Qrels`'s `save` and `from_file` functions now automatically infer file extension. The `kind` parameter can be used to override default behavior.
- `Qrels`'s `save` and `from_file` functions are now much faster with `json` files thanks to `orjson`.
- `Run`'s `save` and `from_file` functions now automatically infer file extension. The `kind` parameter can be used to override default behavior.
- `Run`'s `save` and `from_file` functions are now much faster with `json` files thanks to `orjson`.
- `Two-sided Paired Student's t-Test` is now the default statistical test used when calling `compare`. It is much faster than `Fisher's` and usually agrees with it.
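For readers unfamiliar with the paired Student's t-test mentioned above, the sketch below computes its test statistic from per-query scores of two runs using only the standard library. The function name and the example scores are made up for illustration. The two-sided p-value would then be read from Student's t distribution with `n - 1` degrees of freedom, which is left to a statistics library.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired Student's t-test on per-query scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both runs must be scored on the same queries")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    mean_diff = statistics.mean(diffs)
    std_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if std_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / (std_diff / math.sqrt(n))


# Hypothetical per-query metric scores for two runs.
run_a = [0.5, 0.7, 0.6, 0.9]
run_b = [0.4, 0.5, 0.6, 0.7]
t = paired_t_statistic(run_a, run_b)
```

Because it needs a single pass over the per-query differences rather than many random permutations, this kind of test is cheap to compute.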
Sorry, I have been lazy.
- Fixed a bug in `report.py:Report.to_dict`.
- Added `from_ir_datasets` to `qrels.py`. It allows loading qrels from `ir_metadata`.
- Added `paired_student_t_test` to `statistical_testing.py`.
- Added `stat_test` parameter to `compare`. Defaults to `fisher`.
- Added `stat_test` parameter to `report`. Defaults to `fisher`.
- `Report`'s `to_latex` function now takes into account the newly introduced `stat_test` parameter to correctly generate LaTeX tables' captions.
- `Report`'s `to_dict` function now takes into account the newly introduced `stat_test` parameter and adds it to the output dictionary.
- `Report`'s `save` function now takes into account the newly introduced `stat_test` parameter and adds it to the output JSON file.
- Added `show_percentages` parameter to `Report`. Defaults to `False`.
- Added `show_percentages` parameter to `compare`. Defaults to `False`.
- Added `rounding_digits` parameter to `compare`. Defaults to `3`.
- Added usage example notebooks for Google Colab.
- [IMPORTANT] `Qrels` and `Run` now accept a Python dictionary as an initialization parameter, and this is the preferred way of creating new instances of those classes. They also accept a `name` parameter. Neither is mandatory, so this should not break code based on previous `ranx` versions, although this could change in the future.
- [BREAKING CHANGE] `Qrels` and `Run` `save` function `type` parameter renamed to `kind` to prevent it from being interpreted as the `type` Python utility function.
- [BREAKING CHANGE] `Qrels` and `Run` `save` function now defaults to `json` instead of `trec` for the `kind` parameter (previously called `type`).
- [BREAKING CHANGE] `Qrels` and `Run` `from_file` function `type` parameter renamed to `kind` to prevent it from being interpreted as the `type` Python utility function.
- [BREAKING CHANGE] `Qrels` and `Run` `from_file` function now defaults to `json` instead of `trec` for the `kind` parameter (previously called `type`).
- `rounding_digits` parameter of `Report` now defaults to `3`.
- `Report`'s `to_latex` function now produces a simplified LaTeX table.
- Various improvements to `Report` source code.