I have been learning Lucene recently. The official release has now reached 5.0; the website is http://lucene.apache.org/
The Lucene website says:
The Apache Lucene™ project develops open-source search software, including:
1. Lucene Core, our flagship sub-project, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
2. Solr, is a high performance search server built using Lucene Core, with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, and a web admin interface.
3. Open Relevance Project, is a subproject with the aim of collecting and distributing free materials for relevance testing and performance.
4. PyLucene, is a Python port of the Core project.
Lucene Core is the central piece. In the website's words, it "provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities."
That is, Lucene is a Java-based indexing and search technology that includes spell checking, hit highlighting, and advanced tokenization. It is not a complete application. It provides a capability, a technique: one solution for search in Java.
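I won't go into the details of how Lucene works here; Baidu Baike covers that.
Next I will walk through a simple text-file search demo that I designed and built with Lucene. It covers the following:
- Environment setup
Jars used: see the attachment.
- Main approach
1. Get the data to be indexed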
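    // Collect the .txt files directly under the directory to index
    File[] files = files2Index.listFiles(new FilenameFilter() {
        @Override
        public boolean accept(File dir, String name) {
            // Match on ".txt" (with the dot) so a name like "notes_txt" is not picked up
            return name.endsWith(".txt");
        }
    });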
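One detail the snippet above leaves open is that `File.listFiles` returns `null` when the directory does not exist or cannot be read. The following is a minimal, standalone sketch of the same filtering step with that case handled. The class name `TxtFileLister`, the helper `listTxtFileNames`, and the temporary directory are hypothetical and are not part of the demo; the case-insensitive match is also an assumption about what is wanted.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original demo.
class TxtFileLister {

    /** Returns the sorted names of the .txt files directly inside dir; empty if dir cannot be listed. */
    static List<String> listTxtFileNames(File dir) {
        File[] files = dir.listFiles(new FilenameFilter() {
            @Override
            public boolean accept(File parent, String name) {
                // Case-insensitive, and requires the dot so "notes_txt" is rejected
                return name.toLowerCase(Locale.ROOT).endsWith(".txt");
            }
        });
        List<String> names = new ArrayList<String>();
        if (files == null) {
            // listFiles returns null when dir is missing or is not a directory
            return names;
        }
        for (File f : files) {
            if (f.isFile()) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Builds a throwaway directory with a few files and lists it. */
    static List<String> demo() {
        try {
            File dir = Files.createTempDirectory("files2Index").toFile();
            new File(dir, "a.txt").createNewFile();
            new File(dir, "B.TXT").createNewFile();
            new File(dir, "notes_txt").createNewFile();
            new File(dir, "c.md").createNewFile();
            return listTxtFileNames(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // expected: [B.TXT, a.txt]
    }
}
```

Returning an empty list instead of `null` means the indexing loop can run unchanged even when the source directory is missing.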
2. Create the index with Lucene
    IndexWriter indexWriter = getIndexWriter();
    for (File file : files) {
        // Index each txt file: its name and its content
        Document doc = new Document();
        doc.add(new TextField(NAME, file.getName(), Store.YES));
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                // Keep the line break so words on adjacent lines do not run together
                sb.append(line).append('\n');
            }
            doc.add(new TextField(CONTENT, sb.toString(), Store.YES));
            indexWriter.addDocument(doc);
            indexWriter.commit();
        } catch (IOException e) {
            // log (FileNotFoundException is an IOException, so it is covered here too)
        }
    }
    // Close the writer once all files are indexed so the index write lock is released
    try {
        indexWriter.close();
    } catch (IOException e) {
        // log
    }
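The indexing loop reads each file with `FileReader`, which decodes with the platform default charset, so Chinese text saved in a different encoding can end up garbled in the index. As a hedged sketch, assuming the text files are saved as UTF-8, the read step could name the charset explicitly. The class `TextFileReader` and the method `readUtf8` are hypothetical names used only for illustration; the sample string is written with Unicode escapes for the word "开发" used in the test queries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original demo.
class TextFileReader {

    /** Reads a whole file as UTF-8 (an assumption about the input), keeping a newline after each line. */
    static String readUtf8(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    /** Writes a small UTF-8 file containing Chinese text, reads it back, and deletes it. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("sample", ".txt");
            try {
                Files.write(tmp, "Lucene\n\u5f00\u53d1 demo".getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The returned string could then be passed to the `TextField` for `CONTENT` in place of the `StringBuilder` result.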
    /**
     * Creates the index writer.
     * SmartChineseAnalyzer comes from the lucene-analyzers-smartcn jar.
     *
     * @return an IndexWriter over indexDir, or null if it could not be opened
     */
    private IndexWriter getIndexWriter() {
        IndexWriter indexWriter = null;
        try {
            indexWriter = new IndexWriter(FSDirectory.open(indexDir.toPath()),
                    new IndexWriterConfig(new SmartChineseAnalyzer()));
        } catch (IOException e) {
            // log
        }
        return indexWriter;
    }
3. Query content through the index
    public List<String> getFoundFileNames(String queryContent) {
        ScoreDoc[] scoreDocs = queryIndex(queryContent);
        List<String> results = new ArrayList<String>();
        Set<String> fields = new HashSet<String>();
        fields.add(NAME);
        fields.add(CONTENT);
        for (ScoreDoc scDoc : scoreDocs) {
            try {
                // Load only the stored fields listed above, then report the file name
                Document resDoc = indexSearcher.doc(scDoc.doc, fields);
                results.add(resDoc.getValues(NAME)[0]);
            } catch (IOException e) {
                // log
            }
        }
        return results;
    }

    private ScoreDoc[] queryIndex(String queryContent) {
        try {
            // Open a searcher over the index directory
            indexSearcher = new IndexSearcher(DirectoryReader.open(FSDirectory.open(indexDir.toPath())));
            // Parser that turns the query string into a Query
            QueryParser parser = new QueryParser("", new SmartChineseAnalyzer());
            return indexSearcher.search(parser.parse(queryContent), MAX_COUNT).scoreDocs;
        } catch (IOException e) {
            // log
        } catch (ParseException e) {
            // log
        }
        // Return an empty array rather than null so the caller's loop cannot throw a NullPointerException
        return new ScoreDoc[0];
    }
PS: First create some TXT files in the source directory. If the index directory does not exist, the folder is created automatically.
Test code:
    @Test
    public void test01() {
        IndexFile file = new IndexFile(new File("E:\\APP\\luceneTest\\文本文件"),
                new File("E:\\APP\\luceneTest\\indexs\\01"));
        file.createIndex();
        System.out.println("01:" + file.getFoundFileNames("NAME:\"文本\""));
        System.out.println("01:" + file.getFoundFileNames("NAME:\"txt\""));
    }

    @Test
    public void test02() {
        IndexFile file = new IndexFile(new File("E:\\APP\\luceneTest\\文本文件"),
                new File("E:\\APP\\luceneTest\\indexs\\02"));
        file.createIndex();
        System.out.println("02:" + file.getFoundFileNames("CONTENT:\"我\""));
        System.out.println("02:" + file.getFoundFileNames("CONTENT:\"开发\""));
    }
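The tests pass query strings of the form `FIELD:"phrase"` by hand. If the phrase came from user input, a double quote or backslash inside it would break that syntax. Below is a small sketch of a helper that builds such a string; `FieldQuery` and `phrase` are hypothetical names, and the assumption that only the double quote and the backslash need a backslash escape inside a quoted phrase is mine, not something the demo states.

```java
// Hypothetical helper, not part of the original demo.
class FieldQuery {

    /** Builds FIELD:"text", backslash-escaping any backslash or double quote inside text. */
    static String phrase(String field, String text) {
        StringBuilder sb = new StringBuilder(field).append(":\"");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '"') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(phrase("NAME", "txt")); // prints NAME:"txt"
    }
}
```

With this, the second query in `test01` could be written as `file.getFoundFileNames(FieldQuery.phrase("NAME", "txt"))`.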