deploy carrot2-webapp
1. download source code
#git clone https://github.com/carrot2/carrot2.git
2. compile
#cd carrot2
#ant webapp
3. deploy
#cp tmp/webapp/carrot2-webapp.war /path/to/tomcat/webapps
4. configure carrot2
#cd /path/to/tomcat/webapps/carrot2-webapp/WEB-INF/suites
#mv suite-webapp.xml suite-webapp.xml.old
#cp source-solr.xml suite-webapp.xml
then edit suite-webapp.xml so that it looks like this:
<component-suite>
  <sources>
    <source component-class="org.carrot2.source.solr.SolrDocumentSource" id="solr" attribute-sets-resource="source-solr-attributes.xml">
      <label>Solr</label>
      <title>Solr Search Engine</title>
      <icon-path>icons/solr.png</icon-path>
      <mnemonic>s</mnemonic>
      <description>Solr document source queries an instance of Apache Solr search engine.</description>
      <example-queries>
        <example-query>test</example-query>
        <example-query>solr</example-query>
      </example-queries>
    </source>
  </sources>
  <include suite="algorithm-lingo.xml"></include>
</component-suite>
5. edit source-solr-attributes.xml
<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>
      <label>overridden-attributes</label>
      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://192.168.10.204:8983/inokarticle/clustering"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrSummaryFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrTitleFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>
6. edit algorithm-lingo-attributes.xml and algorithm-lingo.xml
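Before restarting Tomcat it can help to confirm that source-solr-attributes.xml is well-formed and carries the overrides you expect. The following is a minimal Python sketch, not part of Carrot2: the function name read_overrides and the inline SAMPLE string are illustrative assumptions, and the layout it expects is simply the one shown above.

```python
import xml.etree.ElementTree as ET

# Inline copy of the attribute file shown above, used here as sample input.
SAMPLE = """<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>
      <label>overridden-attributes</label>
      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://192.168.10.204:8983/inokarticle/clustering"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrSummaryFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
      <attribute key="SolrDocumentSource.solrTitleFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>"""


def read_overrides(xml_text):
    """Return {attribute key: value} for the default attribute set.

    Hypothetical helper: raises xml.etree.ElementTree.ParseError if the
    text is not well-formed XML.
    """
    root = ET.fromstring(xml_text)
    default_id = root.get("default")
    overrides = {}
    for attr_set in root.iter("attribute-set"):
        # Only look at the set named by the root's "default" attribute.
        if attr_set.get("id") != default_id:
            continue
        for attribute in attr_set.iter("attribute"):
            value = attribute.find("value")
            if value is not None:
                overrides[attribute.get("key")] = value.get("value")
    return overrides


if __name__ == "__main__":
    for key, value in read_overrides(SAMPLE).items():
        print(key, "=", value)
```

To check the deployed file instead of the sample, read its text from WEB-INF/suites/source-solr-attributes.xml and pass it to read_overrides.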
----------------------------------------------------
integrate with solr
1. configure solrconfig.xml
a. import related jars
<lib dir="../contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="../dist/" regex="solr-clustering-\d.*\.jar" />
b. add the clustering search component and request handler
<searchComponent name="clustering" enable="true" class="solr.clustering.ClusteringComponent" >
  <lst name="engine">
    <str name="name">lingo</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="carrot.resourcesDir">clustering/carrot2</str>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <str name="PreprocessingPipeline.tokenizerFactory">org.carrot2.text.linguistic.DefaultTokenizerFactory</str>
  </lst>
</searchComponent>
<requestHandler name="/clustering" startup="lazy" enable="true" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">lingo</str>
    <bool name="clustering.results">true</bool>
    <!-- Field name with the logical "title" of each document (optional) -->
    <str name="carrot.title">content</str>
    <!-- Field name with the logical "URL" of each document (optional) -->
    <str name="carrot.url">id</str>
    <!-- Field name with the logical "content" of each document (optional) -->
    <str name="carrot.snippet">content</str>
    <!-- Apply highlighter to the title/content and use this for clustering. -->
    <bool name="carrot.produceSummary">true</bool>
    <!-- The maximum number of labels per cluster -->
    <int name="carrot.numDescriptions">5</int>
    <!-- Produce sub clusters -->
    <bool name="carrot.outputSubClusters">true</bool>
    <str name="MultilingualClustering.defaultLanguage">CHINESE_SIMPLIFIED</str>
    <!-- Configure the remaining request handler parameters. -->
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
2. custom Chinese tokenizer for clustering
a. modify the related Carrot2 source code and recompile
b. copy the related jars and lexicon to the Solr web lib dir
For details, see Apache SOLR and Carrot2 integration strategies 2
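Once the /clustering request handler from solrconfig.xml is in place, a quick smoke test is to query it and list the cluster labels that come back. The sketch below is an illustration only: the helper names are hypothetical, the handler URL is the one used in source-solr-attributes.xml, and the response shape (a top-level "clusters" list whose entries carry "labels" and "docs") is an assumption about the JSON output that should be checked against your Solr version.

```python
import json
import urllib.parse
import urllib.request

# Assumed response shape, trimmed to the fields this sketch reads.
SAMPLE_RESPONSE = {
    "clusters": [
        {"labels": ["Search Engine"], "docs": ["1", "4", "7"]},
        {"labels": ["Other Topics"], "docs": ["2"]},
    ]
}


def build_clustering_url(handler_url, query, rows=10):
    """Build a JSON query URL for the clustering request handler."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return handler_url + "?" + params


def cluster_labels(response):
    """Return (labels, document count) for each top-level cluster."""
    return [
        (cluster.get("labels", []), len(cluster.get("docs", [])))
        for cluster in response.get("clusters", [])
    ]


def fetch_clusters(handler_url, query, rows=10):
    """Query a running Solr instance; needs network access to the server."""
    url = build_clustering_url(handler_url, query, rows)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Uses the canned sample so the script runs without a Solr server.
    for labels, count in cluster_labels(SAMPLE_RESPONSE):
        print(", ".join(labels), "->", count, "docs")
```

Against a live server, replace SAMPLE_RESPONSE with fetch_clusters("http://192.168.10.204:8983/inokarticle/clustering", "solr"). An empty result usually means the handler returned no "clusters" section, which is worth checking against the searchComponent and last-components settings.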
References
http://wiki.apache.org/solr/ClusteringComponent
http://www.cnblogs.com/cy163/archive/2010/05/07/1730172.html
http://carrot2.github.io/solr-integration-strategies/carrot2-3.8.0/index.html
http://download.carrot2.org/head/manual/index.html#section.advanced-topics.building-from-source-code
http://www.cnblogs.com/shm10/p/3700604.html