Latest revision as of 00:35, 5 September 2024 (Thursday)

Lucene
Developer: Apache Software Foundation
Initial release: 1999
Current release: 9.4.1 (24 October 2022)
Programming language: Java
Operating system: Cross-platform
Type: Search and full-text indexing
License: Apache License 2.0
Website: lucene.apache.org

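Lucene is an open-source library for full-text indexing and search, supported and provided by the Apache Software Foundation. It offers a simple yet powerful application programming interface for full-text indexing and searching, and it is widely used as a standard foundation for search applications.[1][2][3]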

Lucene has been ported to other programming languages, including Object Pascal, Perl, C#, C++, Python, Ruby, and PHP.[4]

History

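Doug Cutting wrote Lucene in 1999.[5] An experienced specialist in full-text indexing and retrieval, he was a principal developer of the V-Twin search engine and later a senior systems architect at Excite, and he currently works on research into low-level Internet infrastructure. His aim in contributing Lucene was to bring full-text search to small and medium-sized applications. Lucene was originally available for download from its home page on SourceForge. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became an independent top-level Apache project in February 2005. The name Lucene is the middle name of Doug Cutting's wife and also the name of her maternal grandmother.[6]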

Lucene formerly included a number of sub-projects, such as Lucene.NET, Mahout, Tika, and Nutch. These are now independent top-level Apache projects.

In March 2010, the Apache Solr search server joined as a Lucene sub-project, merging the developer communities.

Version 4.0 was released on 12 October 2012.[7]

In March 2021, Lucene changed its logo, and Apache Solr again became a top-level Apache project, independent of Lucene.

Features and common uses

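Although Lucene is in principle suitable for any application that requires full-text indexing and search, it is recognized chiefly for its usefulness in implementing Internet search engines and local, single-site search.[8][9]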
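
As an illustration of what full-text indexing and search involve, the sketch below builds a tiny inverted index (a map from each term to the set of documents that contain it) and answers an all-terms query against it. It is a simplified teaching example in Python, not Lucene's API or implementation; the names `tokenize`, `build_index`, and `search`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Use .get so that looking up an unknown term does not add it to the index.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample documents, used only for this illustration.
docs = {
    1: "Lucene is a search library",
    2: "Solr is a search server built on Lucene",
    3: "Nutch is a web crawler",
}
index = build_index(docs)
print(search(index, "search lucene"))
```

Looking terms up in a precomputed map, rather than scanning every document per query, is what makes this style of index practical for large collections.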

Lucene includes a feature for performing fuzzy searches based on edit distance.[10]
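
Edit distance counts the single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below computes the classic Levenshtein distance with dynamic programming and uses it to filter candidate terms. It is meant only to illustrate the idea and does not reproduce how Lucene evaluates fuzzy queries; the function names, the `max_edits` threshold, and the word list are assumptions.

```python
def levenshtein(a, b):
    """Return the Levenshtein edit distance between strings a and b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, vocabulary, max_edits=2):
    """Return the vocabulary words within max_edits edits of term."""
    return [word for word in vocabulary if levenshtein(term, word) <= max_edits]


# Hypothetical vocabulary, used only for this illustration.
vocabulary = ["roam", "foam", "roams", "room", "search"]
print(fuzzy_match("roam", vocabulary, max_edits=1))
```

Keeping only two rows of the dynamic-programming table keeps memory proportional to the length of the second string.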

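Lucene has also been used to implement recommender systems.[11] For example, Lucene's MoreLikeThis class can generate recommendations for similar documents. In a comparison of the vector-based similarity approach of MoreLikeThis with citation-based document similarity measures, such as co-citation and co-citation proximity analysis, Lucene's approach excelled at recommending documents with very similar structural characteristics and narrower relatedness.[12] In contrast, citation-based document similarity measures tended to be better suited to recommending more broadly related documents.[12]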
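
One common way to compare documents by their vectors is cosine similarity over term-frequency counts. The sketch below ranks documents by their similarity to a given one. It is a generic illustration of vector-based similarity, not the algorithm used by Lucene's MoreLikeThis class, and the function names and sample texts are assumptions.

```python
import math
from collections import Counter


def term_vector(text):
    """Count how often each lowercase, whitespace-separated term occurs."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Return the cosine of the angle between two term-frequency vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_u * norm_v)


def more_like(doc_id, docs):
    """Rank the other documents by similarity to doc_id, most similar first."""
    vectors = {key: term_vector(text) for key, text in docs.items()}
    target = vectors[doc_id]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in vectors.items()
        if other != doc_id
    ]
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored]


# Hypothetical sample documents, used only for this illustration.
docs = {
    "a": "full text search library",
    "b": "full text search server",
    "c": "web crawler for the internet",
}
print(more_like("a", docs))
```

Because the score depends only on shared terms, documents with closely matching wording rank highest, which is consistent with the narrow relatedness described above.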

Lucene-based projects

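Lucene itself is only an indexing and search library; it does not include crawling or HTML parsing. The following projects, however, extend Lucene's capabilities: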

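  • Apache Nutch – provides a mature, ready-to-use web crawler[13]
  • Apache Solr – a high-performance search server built on the Lucene core, with JSON/Python/Ruby APIs[14]
  • Compass – the predecessor of Elasticsearch[15]
  • CrateDB – an open-source distributed SQL database built on Lucene[16]
  • DocFetcher – a cross-platform desktop application for searching local files[citation needed][17]
  • Elasticsearch – an enterprise search platform intended to organize data and make it easy to access[18]
  • Kinosearch – a search engine written in Perl and C[19] and a port of Lucene.[20] The Socialtext wiki[19] and the MojoMojo wiki both use this search engine.[21] It is also used by the Human Metabolome Database (HMDB)[22] and the Toxin and Toxin-Target Database (T3DB).[23]
  • MongoDB Atlas Search – a cloud-native enterprise search application based on MongoDB and Apache Lucene
  • OpenSearch – an open-source enterprise search server based on Elasticsearch 7
  • Swiftype – enterprise search based on Lucene[24]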
  • Lucene.NET – a port of the Lucene library for users of the .NET platform[25]
  • Apache Lucy – provides full-text search for dynamic languages; a loose C port of the Lucene Java library[26]
  • Luke – a user interface written in Java for editing Lucene indexes; development of this project has stopped[27]

See also

  • Solr – an enterprise search server that uses Lucene, also developed by the Apache Software Foundation.

References

  1. ^ Kamphuis, Chris; de Vries, Arjen P.; Boytsov, Leonid; Lin, Jimmy. Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants. In Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo (eds.), Advances in Information Retrieval (Cham: Springer International Publishing), 2020, 12036: 28–34 [2021-06-07], ISBN 978-3-030-45441-8, PMC 7148026 (free to read), doi:10.1007/978-3-030-45442-5_4 (in English)
  2. ^ Grand, Adrien; Muir, Robert; Ferenczi, Jim; Lin, Jimmy. From MAXSCORE to Block-Max Wand: The Story of How Lucene Significantly Improved Query Evaluation Performance. In Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo (eds.), Advances in Information Retrieval (Cham: Springer International Publishing), 2020, 12036: 20–27 [2021-06-07], ISBN 978-3-030-45441-8, PMC 7148045 (free to read), doi:10.1007/978-3-030-45442-5_3 (in English)
  3. ^ Azzopardi, Leif; Moshfeghi, Yashar; Halvey, Martin; Alkhawaldeh, Rami S.; Balog, Krisztian; Di Buccio, Emanuele; Ceccarelli, Diego; Fernández-Luna, Juan M.; Hull, Charlie; Mannix, Jake; Palchowdhury, Sauparna. Lucene4IR: Developing Information Retrieval Evaluation Resources using Lucene. ACM SIGIR Forum. 2017-02-14, 50 (2): 58–75 [2022-07-25]. ISSN 0163-5840. doi:10.1145/3053408.3053421. (Archived from the original on 2022-07-28) (in English).
  4. ^ LuceneImplementations. apache.org. [23 September 2015]. (Archived from the original on 6 October 2015).
  5. ^ Better Search with Apache Lucene and Solr (PDF). 19 November 2007. (Archived from the original (PDF) on 31 January 2012).
  6. ^ Barker, Deane. Web Content Management. O'Reilly. 2016: 233. ISBN 978-1491908105.
  7. ^ Apache Lucene - Welcome to Apache Lucene. apache.org. [4 February 2016]. (Archived from the original on 4 February 2016).
  8. ^ McCandless, Michael; Hatcher, Erik; Gospodnetić, Otis. Lucene in Action, Second Edition (limited free access; subscription required beyond the limit). Manning. 2010: 8. ISBN 978-1933988177.
  9. ^ GNU/Linux Semantic Storage System (PDF). glscube.org. (Archived from the original (PDF) on 2010-06-01).
  10. ^ Apache Lucene - Query Parser Syntax. lucene.apache.org. (Archived from the original on 2017-05-02).
  11. ^ J. Beel, S. Langer, and B. Gipp, “The Architecture and Datasets of Docear’s Research Paper Recommender System,” in Proceedings of the 3rd International Workshop on Mining Scientific Publications (WOSP 2014) at the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014), London, UK, 2014
  12. ^ 12.0 12.1 M. Schwarzer, M. Schubotz, N. Meuschke, C. Breitinger, V. Markl, and B. Gipp, https://www.gipp.com/wp-content/papercite-data/pdf/schwarzer2016.pdf (archived copy at the Internet Archive) "Evaluating Link-based Recommendations for Wikipedia" in Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), New York, NY, USA, 2016, pp. 191-200.
  13. ^ dev@Nutch.apache.org. Apache Nutch™ -. nutch.apache.org. [2016-11-29]. (Archived from the original on 2021-01-26).
  14. ^ What are the main differences between ElasticSearch, Apache Solr and SolrCloud? - Quora. quora.com. [23 September 2015].
  15. ^ The Future of Compass & Elasticsearch. the dude abides. [2015-10-14]. (Archived from the original on 2015-10-15) (in English).
  16. ^ Wayner, Peter. 11 cutting-edge databases worth exploring now. InfoWorld. [21 September 2015]. (Archived from the original on 21 September 2015).
  17. ^ Quang, Tran Nam. DocFetcher - Fast Document Search. docfetcher.sourceforge.net. [2016-11-29]. (Archived from the original on 2021-01-13).
  18. ^ Elasticsearch: RESTful, Distributed Search & Analytics - Elastic. elastic.co. [23 September 2015]. (Archived from the original on 2015-09-21).
  19. ^ 19.0 19.1 Natividad, Angela. Socialtext Updates Search, Goes Kino. CMS Wire. [2011-05-31]. (Archived from the original on 2012-09-29).
  20. ^ Marvin Humphrey. KinoSearch - Search engine library. - metacpan.org. p3rl.org. [23 September 2015].
  21. ^ Diment, Kieren; Trout, Matt S. Catalyst Cookbook. The Definitive Guide to Catalyst (limited free access; subscription required beyond the limit). Apress. 2009: 280. ISBN 978-1-4302-2365-8.
  22. ^ Wishart, D. S.; et al. HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. January 2009, 37 (Database issue): D603–10. PMC 2686599 (free to read). PMID 18953024. doi:10.1093/nar/gkn810.
  23. ^ Lim, Emilia; Pon, Allison; Djoumbou, Yannick; Knox, Craig; Shrivastava, Savita; Guo, An Chi; Neveu, Vanessa; Wishart, David S. T3DB: a comprehensively annotated database of common toxins and their targets. Nucleic Acids Res. January 2010, 38 (Database issue): D781–6. PMC 2808899 (free to read). PMID 19897546. doi:10.1093/nar/gkp934.
  24. ^ Swiftype - Site search and enterprise search. Swiftype. [2016-11-29]. (Archived from the original on 2021-02-05).
  25. ^ Apache Lucene.Net. lucenenet.apache.org. [2016-11-29]. (Archived from the original on 2020-12-31).
  26. ^ Apache Lucy. lucy.apache.org. [2016-11-29]. (Archived from the original on 2020-12-31).
  27. ^ luke. GitHub. [2016-11-29]. (Archived from the original on 2020-11-30).

External links