Summary
A few extensions of (mean) average precision to graded relevance have been proposed in the literature, some more sophisticated than others. One of the simplest ("grAP"), suggested in Moffat et al. 2022 (see reference below), uses the same formula as "traditional" AP:
$$\mathrm{grAP} = \frac{\sum_i r_i \cdot \mathrm{prec}@i}{\sum_i r_i} = \frac{\sum_i r_i \sum_{j \le i} r_j / i}{\sum_i r_i}$$
while removing the restriction that the labels $r_i$ be binary.
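For concreteness, here is a minimal NumPy sketch of the formula above; the function name `grap` is illustrative rather than an existing API, and graded labels are assumed to be scaled into [0, 1] so the resulting score also stays in [0, 1]:

```python
import numpy as np

def grap(relevances):
    """Graded average precision per the formula above (illustrative sketch).

    `relevances` holds nonnegative relevance labels in ranked order
    (best-ranked item first); for 0/1 labels this reduces to classic AP.
    """
    r = np.asarray(relevances, dtype=float)
    # Generalized prec@i: cumulative relevance mass up to rank i, divided by i.
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(np.sum(r * prec_at_i) / np.sum(r))

# Binary labels give the familiar AP value: (1/1 + 2/3) / 2 = 5/6.
print(grap([1, 0, 1]))  # -> 0.8333...
# Graded labels (scaled into [0, 1]) are handled by the very same formula.
print(grap([1.0, 0.5, 0.0, 0.75]))
```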
It would be great to make this extended version of `mean_average_precision` (aka `map`) usable as an alternative (or complementary) metric to `ndcg` for ranking tasks.
Motivation
Provides an alternative to NDCG for ranking tasks with graded relevance labels.
Description
Two options:
- modify the implementation of `mean_average_precision` (aka `map`) to accept any nonnegative, not-necessarily-binary labels, preserving the current behavior when the input is binary;
- add a new metric `graded_average_precision` alongside `mean_average_precision`, catering to the broader case.
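Under the first option, the key invariant is that binary input must keep its current score. A quick self-contained check of that property (both functions below are illustrative sketches, not the library's implementation):

```python
import numpy as np

# Generalized metric (option 1): same formula, labels need not be binary.
def graded_ap(r):
    r = np.asarray(r, dtype=float)
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(np.sum(r * prec_at_i) / np.sum(r))

# Classic AP, defined for 0/1 labels only: mean of prec@i over relevant ranks.
def classic_ap(r):
    r = np.asarray(r, dtype=float)
    prec_at_i = np.cumsum(r) / np.arange(1, len(r) + 1)
    return float(prec_at_i[r == 1].mean())

rng = np.random.default_rng(0)
for _ in range(100):
    labels = rng.integers(0, 2, size=10)
    if labels.sum() == 0:
        continue  # AP is undefined when there are no relevant items
    assert np.isclose(graded_ap(labels), classic_ap(labels))
```

This holds algebraically: for binary labels, $\sum_i r_i \,\mathrm{prec}@i / \sum_i r_i$ is exactly the average of prec@i over the relevant positions.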
References
Alistair Moffat, Joel Mackenzie, Paul Thomas, and Leif Azzopardi. 2022. A Flexible Framework for Offline Effectiveness Metrics. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22). Association for Computing Machinery, New York, NY, USA, 578–587. https://doi.org/10.1145/3477495.3531924