Summary
A few extensions of (mean) average precision to graded relevance have been proposed in the literature, some more sophisticated than others. One of the simplest ("grAP"), suggested in Moffat et al. 2022 (see reference below), uses the same formula as "traditional" AP:
$$\mathrm{grAP} = \frac{\sum_i r_i \cdot \mathrm{prec}@i}{\sum_i r_i} = \frac{\sum_i r_i \sum_{j \le i} r_j / i}{\sum_i r_i}$$
while removing the restriction that the labels $r_i$ be binary.
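For concreteness, here is a minimal NumPy sketch of the formula above; the function name `grap` is illustrative rather than an existing API, and graded labels are assumed to be scaled into [0, 1] so the resulting score also stays in [0, 1]:

```python
import numpy as np

def grap(relevances):
    """Graded average precision per the formula above (illustrative sketch).

    `relevances` holds nonnegative relevance labels in ranked order
    (best-ranked item first); for 0/1 labels this reduces to classic AP.
    """
    r = np.asarray(relevances, dtype=float)
    # Generalized prec@i: cumulative relevance mass up to rank i, divided by i.
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(np.sum(r * prec_at_i) / np.sum(r))

# Binary labels give the familiar AP value: (1/1 + 2/3) / 2 = 5/6.
print(grap([1, 0, 1]))  # -> 0.8333...
# Graded labels (scaled into [0, 1]) are handled by the very same formula.
print(grap([1.0, 0.5, 0.0, 0.75]))
```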
It would be great to make this extended version of `mean_average_precision` (aka `map`) usable as an alternative (or complementary) metric to `ndcg` for ranking tasks.
Motivation
Provides an alternative to NDCG for ranking tasks with graded relevance labels.
Description
Two options:
- modify the implementation of `mean_average_precision` (aka `map`) to accept any nonnegative, not-necessarily-binary labels, preserving the current behavior when the input is binary;
- add a new metric `graded_average_precision` alongside `mean_average_precision`, catering to the broader case.
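Under the first option, the key invariant is that binary input must keep its current score. A quick self-contained check of that property (both functions below are illustrative sketches, not the library's implementation):

```python
import numpy as np

# Generalized metric (option 1): same formula, labels need not be binary.
def graded_ap(r):
    r = np.asarray(r, dtype=float)
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(np.sum(r * prec_at_i) / np.sum(r))

# Classic AP, defined for 0/1 labels only: mean of prec@i over relevant ranks.
def classic_ap(r):
    r = np.asarray(r, dtype=float)
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(prec_at_i[r == 1].mean())

rng = np.random.default_rng(0)
for _ in range(100):
    labels = rng.integers(0, 2, size=10)
    if labels.sum() == 0:
        continue  # AP is undefined when there are no relevant items
    assert np.isclose(graded_ap(labels), classic_ap(labels))
```

This holds algebraically: for binary labels, $\sum_i r_i \,\mathrm{prec}@i / \sum_i r_i$ is exactly the average of prec@i over the relevant positions.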
References
Alistair Moffat, Joel Mackenzie, Paul Thomas, and Leif Azzopardi. 2022. A Flexible Framework for Offline Effectiveness Metrics. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22). Association for Computing Machinery, New York, NY, USA, 578–587. https://doi.org/10.1145/3477495.3531924