Learned sparse retrieval

Learned sparse retrieval or sparse neural search is an approach to text search which uses a sparse vector representation of queries and documents. It borrows techniques both from lexical bag-of-words and vector embedding algorithms, and is claimed to perform better than either alone. The best-known sparse neural search system are SPLADE^[1] and SPLADE v2.^[2] Some implementations of SPLADE have similar latency to Okapi BM25 lexical search while giving as good results as state-of-the-art neural rankers on in-domain data.^[3]

SPLADE is released under a Creative Commons NonCommercial license.^[4]

External links

SPLADE code base at github

Notes

^ Formal, Thibault; Piwowarski, Benjamin; Clinchant, Stéphane (2021-07-11). "SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking". Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '21. New York, NY, USA: Association for Computing Machinery: 2288–2292. arXiv:2107.05720. doi:10.1145/3404835.3463098. ISBN 978-1-4503-8037-9.
^ Formal, Thibault; Piworwarski, Benjamin; Lassance, Carlos; Clinchant, Stéphane (21 September 2021). "SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval" (PDF). arXiv. arXiv:2109.10086v1.
^ Lassance, Carlos; Clinchant, Stéphane (2022-07-07). "An Efficiency Study for SPLADE Models". Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '22. New York, NY, USA: Association for Computing Machinery: 2220–2226. arXiv:2207.03834. doi:10.1145/3477495.3531833. ISBN 978-1-4503-8732-3.
^ "splade/LICENSE at main · naver/splade". GitHub. Retrieved 2023-08-25.

This computer science article is a stub. You can help Wikipedia by expanding it.

This computational linguistics-related article is a stub. You can help Wikipedia by expanding it.

[1] Formal, Thibault; Piwowarski, Benjamin; Clinchant, Stéphane (2021-07-11). "SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking". Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '21. New York, NY, USA: Association for Computing Machinery: 2288–2292. arXiv:2107.05720. doi:10.1145/3404835.3463098. ISBN 978-1-4503-8037-9.

[2] Formal, Thibault; Piworwarski, Benjamin; Lassance, Carlos; Clinchant, Stéphane (21 September 2021). "SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval" (PDF). arXiv. arXiv:2109.10086v1.

[3] Lassance, Carlos; Clinchant, Stéphane (2022-07-07). "An Efficiency Study for SPLADE Models". Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR '22. New York, NY, USA: Association for Computing Machinery: 2220–2226. arXiv:2207.03834. doi:10.1145/3477495.3531833. ISBN 978-1-4503-8732-3.

[4] "splade/LICENSE at main · naver/splade". GitHub. Retrieved 2023-08-25.

[1]

[2]

[3]

[4]