Lucene: Introduction to Lucene (Part VII-QueryParser)

DavyJones2010

Lucene QueryParser Example

1. Introduction to QueryParser

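1) Sometimes we want to pass a string such as "student AND teacher" to execute a query.

It means we want to find the documents whose searched field contains both "student" and "teacher".

QueryParser makes this easy.

2) When using QueryParser, make sure the field being searched is indexed with "Field.Index.ANALYZED".

Note that "Field.Index.ANALYZED" only means the field value is passed through the analyzer when the index is built. Whether the tokens are lowercased depends on the analyzer: SimpleAnalyzer (used in this article) and StandardAnalyzer lowercase them, while WhitespaceAnalyzer does not.

When we build a Query such as TermQuery by hand, the term is not analyzed, so we have to supply the word in lowercase ourselves.

With QueryParser we don't have to care about this, as long as the parser is given the same analyzer that was used for indexing: it runs the query text through that analyzer, which lowercases it. For the same reason, a mixed-case value in a "Field.Index.NOT_ANALYZED" field (such as "name" in the example below) may not be matched by a plain term parsed through QueryParser.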
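The lowercasing described above can be illustrated without Lucene. This is only a minimal sketch, assuming SimpleAnalyzer behaves roughly like "split at non-letters, then lowercase"; the class SimpleAnalyzerSketch and its tokenize method are hypothetical names for illustration, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleAnalyzerSketch {

    // Hypothetical helper: split the text at every non-letter character
    // and lowercase each resulting token (an approximation of SimpleAnalyzer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("Male aaa Female");
        System.out.println(indexed);                    // [male, aaa, female]
        // A hand-built term "Female" is not among the indexed tokens ...
        System.out.println(indexed.contains("Female")); // false
        // ... but the analyzed (lowercased) term is.
        System.out.println(indexed.contains("female")); // true
    }
}
```

The index only ever sees the lowercase tokens, which is why a term has to be lowercase, either typed by hand or produced by the analyzer inside QueryParser, before it can match.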

2. Example of QueryParser

	private void testBuildIndex()
	{
		List<Student> studentList = new ArrayList<Student>();
		Student student = new Student("11", "Davy", "Jones", "Male aaa Female",
				100);
		studentList.add(student);
		student = new Student("22", "Davy", "Jones", "Male bbb Female", 110);
		studentList.add(student);
		student = new Student("33", "Jones", "Davy", "Male Female", 120);
		studentList.add(student);
		student = new Student("44", "Calyp", "Jones", "Female aa bb Male", 130);
		studentList.add(student);
		student = new Student("55", "Pso", "Caly", "Female cc dd ee Male", 140);
		studentList.add(student);

		searcherUtil.buildIndex(studentList);
	}

	public void buildIndex(List<Student> studentList)
	{
		IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35,
				new SimpleAnalyzer(Version.LUCENE_35));
		IndexWriter writer = null;
		Document doc = null;
		try
		{
			writer = new IndexWriter(directory, config);
			for (Student student : studentList)
			{
				doc = new Document();
				doc.add(new Field("id", student.getId(), Field.Store.YES,
						Field.Index.NOT_ANALYZED));
				doc.add(new Field("name", student.getName(), Field.Store.YES,
						Field.Index.NOT_ANALYZED));
				doc.add(new Field("password", student.getPassword(),
						Field.Store.YES, Field.Index.NOT_ANALYZED));
				doc.add(new Field("gender", student.getGender(),
						Field.Store.YES, Field.Index.ANALYZED));
				doc.add(new NumericField("score", Field.Store.YES, true)
						.setIntValue(student.getScore()));

				writer.addDocument(doc);
			}
		} catch (CorruptIndexException e)
		{
			e.printStackTrace();
		} catch (LockObtainFailedException e)
		{
			e.printStackTrace();
		} catch (IOException e)
		{
			e.printStackTrace();
		} finally
		{
			try
			{
				// writer stays null if the IndexWriter constructor threw
				if (writer != null)
				{
					writer.close();
				}
			} catch (CorruptIndexException e)
			{
				e.printStackTrace();
			} catch (IOException e)
			{
				e.printStackTrace();
			}
		}

	}

	public void searchByQueryParser(Query query, int resultSize)
	{
		IndexSearcher searcher = getSearcher();

		try
		{
			TopDocs tds = searcher.search(query, resultSize);
			Document document = null;
			for (ScoreDoc sd : tds.scoreDocs)
			{
				document = searcher.doc(sd.doc);

				System.out.println("id = " + document.get("id") + ", name = "
						+ document.get("name") + ", password = "
						+ document.get("password") + ", gender = "
						+ document.get("gender") + ", score = "
						+ document.get("score"));
			}
		} catch (IOException e)
		{
			e.printStackTrace();
		}
	}

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		// Create instance of QueryParser
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));

		// Create instance of Query
		// Search 'gender' that contains both 'female' and 'male'
		Query query = null;
		try
		{
			query = parser.parse("Female AND Male");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}

		searcherUtil.searchByQueryParser(query, 100);
	}

id = 33, name = Jones, password = Davy, gender = Male Female, score = 120
id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110
id = 44, name = Calyp, password = Jones, gender = Female aa bb Male, score = 130
id = 55, name = Pso, password = Caly, gender = Female cc dd ee Male, score = 140

Comments:

1) The query we use here is not a specific Query subclass that we construct ourselves, but a Query created by QueryParser from the query string.

2) We can use AND, OR, and NOT to combine terms in the query string.

3) By default, a space between terms means OR. We can change this:

		parser.setDefaultOperator(Operator.AND);

This overrides the default meaning of a space, so terms separated by a space are combined with AND instead.

4) The default field name is the second argument of the QueryParser constructor:

		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));

But we can override the default field inside the query string by prefixing a term with "fieldName:":

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		// Create instance of QueryParser
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));
		// Create instance of Query
		// Search the 'name' field instead of the default 'gender' field
		Query query = null;
		try
		{
			query = parser.parse("name: Davy");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}

		searcherUtil.searchByQueryParser(query, 100);
	}
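To make comment 3 concrete, here is a minimal sketch of what OR and AND mean for a query of several terms. It works on plain token lists rather than a Lucene index; DefaultOperatorSketch, matchesAny, and matchesAll are hypothetical names, and the token lists are assumed to be the lowercase tokens of the "gender" field.

```java
import java.util.Arrays;
import java.util.List;

public class DefaultOperatorSketch {

    // OR semantics: at least one query term occurs in the document.
    static boolean matchesAny(List<String> docTokens, List<String> queryTerms) {
        for (String term : queryTerms) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // AND semantics: every query term occurs in the document.
    static boolean matchesAll(List<String> docTokens, List<String> queryTerms) {
        return docTokens.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        List<String> doc11 = Arrays.asList("male", "aaa", "female");
        List<String> doc33 = Arrays.asList("male", "female");
        // Corresponds to the query string "female aaa"
        List<String> query = Arrays.asList("female", "aaa");

        System.out.println(matchesAny(doc33, query)); // true:  default operator OR
        System.out.println(matchesAll(doc33, query)); // false: operator AND
        System.out.println(matchesAll(doc11, query)); // true:  operator AND
    }
}
```

With the default operator, the query string "female aaa" would match all five sample documents; after setDefaultOperator(Operator.AND) only the document with id 11 contains both terms.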

3. By default, QueryParser does not allow * (or ?) as the first character of a wildcard term.

But we can allow it by calling setAllowLeadingWildcard(true) on the parser.

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		// Create instance of QueryParser
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));
		parser.setAllowLeadingWildcard(true);
		// Create instance of Query
		// Search 'name' ending with 'vy' AND 'gender' having a term that starts with 'male'
		Query query = null;
		try
		{
			query = parser.parse("name: *vy AND gender: Male*");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}

		searcherUtil.searchByQueryParser(query, 100);
	}

id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110

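Note that this restriction exists only in QueryParser. A WildcardQuery that we construct ourselves, e.g. new WildcardQuery(new Term("name", "*vy")), accepts a leading wildcard without any extra setting. Leading wildcards can be slow on a large index, because every term of the field has to be scanned.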
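The result of the wildcard example can be reproduced with a small stand-alone sketch. It translates a wildcard pattern into a regular expression, assuming * means "any sequence of characters" and ? means "exactly one character"; WildcardSketch and wildcardMatch are hypothetical names, not Lucene API. As far as I know, QueryParser lowercases wildcard terms by default, so "Male*" is assumed to be searched as "male*".

```java
import java.util.regex.Pattern;

public class WildcardSketch {

    // Hypothetical helper: does the whole term match the wildcard pattern?
    static boolean wildcardMatch(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // any sequence of characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), term);
    }

    public static void main(String[] args) {
        // 'name' is NOT_ANALYZED, so the indexed term keeps its case.
        System.out.println(wildcardMatch("*vy", "Davy"));     // true
        System.out.println(wildcardMatch("*vy", "Jones"));    // false
        // 'gender' is ANALYZED, so the indexed terms are lowercase.
        System.out.println(wildcardMatch("male*", "male"));   // true
        System.out.println(wildcardMatch("male*", "female")); // false
    }
}
```

Every sample document has the term "male" in "gender", so the AND clause is decided by "name": only the documents with id 11 and 22 have the name "Davy".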

4. A query string can also be parsed into a TermRangeQuery by QueryParser

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		// Create instance of QueryParser
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));
		// Create instance of Query
		// Search 'id' within the range '1' to '3' (compared as text)
		Query query = null;
		try
		{
			query = parser.parse("id:[1 TO 3]");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}

		searcherUtil.searchByQueryParser(query, 100);
	}

id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
id = 22, name = Davy, password = Jones, gender = Male bbb Female, score = 110

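1) TO must be uppercase.

2) We can use query = parser.parse("id: {1 TO 3}"); instead of parser.parse("id: [1 TO 3]");

[] means the range is inclusive: it contains the left value (1) and the right value (3).

{} means the range is exclusive: it does not contain the left value (1) or the right value (3).

3) The terms are compared as text, not as numbers. That is why "11" and "22" fall between "1" and "3" while "33" does not.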
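The range result can be checked with plain string comparison. This is a minimal sketch, assuming terms are ordered the way String.compareTo orders them (adequate for the ASCII ids used here); TermRangeSketch and inRange are hypothetical names, not Lucene API.

```java
public class TermRangeSketch {

    // Hypothetical helper: is the term inside the text range?
    // inclusive = true  corresponds to [lower TO upper]
    // inclusive = false corresponds to {lower TO upper}
    static boolean inRange(String term, String lower, String upper, boolean inclusive) {
        int cmpLower = term.compareTo(lower);
        int cmpUpper = term.compareTo(upper);
        if (inclusive) {
            return cmpLower >= 0 && cmpUpper <= 0;
        }
        return cmpLower > 0 && cmpUpper < 0;
    }

    public static void main(String[] args) {
        String[] ids = { "11", "22", "33", "44", "55" };
        for (String id : ids) {
            // Prints true for 11 and 22, false for 33, 44 and 55
            System.out.println(id + " in [1 TO 3]: " + inRange(id, "1", "3", true));
        }
        // The bounds themselves only match an inclusive range.
        System.out.println(inRange("1", "1", "3", true));  // true
        System.out.println(inRange("1", "1", "3", false)); // false
    }
}
```

"33" sorts after "3" because it has "3" as a prefix and is longer, so it is outside the range even though 33 looks close to 3 as text.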

5. A query string can also be parsed into a PhraseQuery or TermQuery by QueryParser

1) We want to fetch the document whose gender contains exactly the phrase "Male aaa Female". Wrapping the words in double quotes makes QueryParser build a PhraseQuery.

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		// Create instance of QueryParser
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));
		// Create instance of Query
		// Search 'gender' that contains the exact phrase "male aaa female"
		Query query = null;
		try
		{
			query = parser.parse("gender: \"Male aaa Female\"");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}

		searcherUtil.searchByQueryParser(query, 100);
	}

id = 11, name = Davy, password = Jones, gender = Male aaa Female, score = 100
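A quoted query string is matched as a sequence of adjacent terms, not as independent words. The following minimal sketch shows the idea on plain token lists; it ignores Lucene features such as slop, and PhraseSketch and containsPhrase are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PhraseSketch {

    // Hypothetical helper: do the phrase tokens occur in the document
    // next to each other and in the same order?
    static boolean containsPhrase(List<String> docTokens, List<String> phraseTokens) {
        return Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        List<String> phrase = Arrays.asList("male", "aaa", "female");
        List<String> doc11 = Arrays.asList("male", "aaa", "female");
        List<String> doc22 = Arrays.asList("male", "bbb", "female");
        List<String> doc44 = Arrays.asList("female", "aa", "bb", "male");

        System.out.println(containsPhrase(doc11, phrase)); // true
        System.out.println(containsPhrase(doc22, phrase)); // false
        System.out.println(containsPhrase(doc44, phrase)); // false
    }
}
```

Only the document with id 11 contains the three terms consecutively, which agrees with the single hit printed above.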

6. A query string cannot be parsed into a NumericRangeQuery

	@Test
	public void testSearchByQueryParser()
	{
		testBuildIndex();
		
		QueryParser parser = new QueryParser(Version.LUCENE_35, "gender",
				new SimpleAnalyzer(Version.LUCENE_35));
		Query query = null;
		try
		{
			query = parser.parse("score: [100 TO 130]");
		} catch (ParseException e)
		{
			e.printStackTrace();
		}
		searcherUtil.searchByQueryParser(query, 100);
	}

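The result set is empty. QueryParser turns "[100 TO 130]" into a TermRangeQuery over text terms, but "score" was indexed as a NumericField, whose terms are stored in a special numeric encoding, so nothing matches.

To make this work, we have to create a custom query parser that extends QueryParser.

This will be introduced in detail in the next few chapters.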

Summary:

1) QueryParser can be used to parse a query string written in Lucene's query syntax (it is not SQL).

2) Depending on its content, the query string can be parsed into a TermQuery, TermRangeQuery, WildcardQuery, PhraseQuery, BooleanQuery, etc.

3) The query string cannot be parsed into a NumericRangeQuery by the QueryParser provided by Lucene.

4) There are various syntax rules for the query string, and different syntax is parsed into different kinds of Query.

Reference Links:

1) http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html describes all the rules for QueryParser.

2013-05-25 17:04