
hibernate-search-3.3.0.Final documentation: Chinese translation and study notes (repost)


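At first I was only reading this for myself and had no plan to translate it. I started translating from chapter 4, and almost all of the main sections are now covered. Each passage appears in the original English first, followed by my translation of it or note on it, one for one.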

5. Tuning Lucene indexing performance

ch4

4.3. Analysis

4.4. Bridges

4.4.1. Built-in bridges

4.4.2. Custom bridges

4.5. Providing your own id

4.6. Programmatic API

Chapter 5. Querying

5.1. Building queries

5.1.1. Building a Lucene query using the Lucene API

5.1.2. Building a Lucene query with the Hibernate Search query DSL

5.1.3. Building a Hibernate Search query

Chapter 6. Manual index changes

6.1. Adding instances to the index

6.2. Deleting instances from the index

6.3. Rebuilding the whole index

6.3.1. Using flushToIndexes()

6.3.2. Using a MassIndexer

Chapter 7. Index Optimization

7.1. Automatic optimization

7.2. Manual optimization

7.3. Adjusting optimization


1

You can think of those two batch modes (no scope vs transactional) as the equivalent of the (infamous) autocommit vs transactional behavior. From a performance perspective, the in-transaction mode is recommended. The scoping choice is made transparently: Hibernate Search detects the presence of a transaction and adjusts the scoping.

 

2

The good news is that Hibernate Search is enabled out of the box when detected on the classpath by Hibernate Core. If, for some reason, you need to disable it, set hibernate.search.autoregister_listeners to false. Note that there is no performance penalty when the listeners are enabled but no entities are annotated as indexed.

 

3

By default, every time an object is inserted, updated or deleted through Hibernate, Hibernate Search updates the corresponding Lucene index. It is sometimes desirable to disable this feature, either because your index is read-only or because index updates are done in batches (see Section 6.3, “Rebuilding the whole index”).

 

To disable event based indexing, set

hibernate.search.indexing_strategy = manual

 

4

 

The different reader strategies are described in Reader strategy. The out-of-the-box strategies are:

* shared: share index readers across several queries. This strategy is the most efficient.

* not-shared: create an index reader for each individual query.

The default reader strategy is shared. This can be adjusted:

hibernate.search.reader.strategy = not-shared

 

5. Tuning Lucene indexing performance

hibernate.search.[default|<indexname>].exclusive_index_use: set to true when no other process will need to write to the same index. This enables Hibernate Search to work in exclusive mode on the index and improves performance when writing changes to the index. Default value: false (releases locks as soon as possible).

 

When your architecture permits it, always set hibernate.search.default.exclusive_index_use=true as it greatly improves efficiency in index writing.
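The settings from notes 3 to 5 can be gathered in one place. The following is a minimal sketch, assuming the properties are assembled in a java.util.Properties object before being handed to Hibernate; how they are passed on is not shown, and the class and method names are hypothetical. The values describe a batch-indexing setup where a single process writes to the index.

```java
import java.util.Properties;
import java.util.TreeSet;

class SearchSettingsSketch {

    // Hypothetical helper: collects the property keys quoted in the notes above.
    static Properties batchIndexingSettings() {
        Properties props = new Properties();
        // Note 3: turn off event-based indexing; index updates are done manually.
        props.setProperty("hibernate.search.indexing_strategy", "manual");
        // Note 4: "shared" is already the default reader strategy, set here only for visibility.
        props.setProperty("hibernate.search.reader.strategy", "shared");
        // Note 5: only appropriate when no other process writes to the same index.
        props.setProperty("hibernate.search.default.exclusive_index_use", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = batchIndexingSettings();
        // Print the settings in a stable (sorted) order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```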

 

6. LockFactory configuration

Lucene Directories have default locking strategies which work well for most cases, but it is possible to specify, for each index managed by Hibernate Search, which LockingFactory you want to use.

 

ch4

Hibernate Search configuration must be done with annotations; XML configuration is not currently provided.

7. @Indexed

First and foremost, we must declare a persistent class as indexable. This is done by annotating the class with @Indexed (all entities not annotated with @Indexed will be ignored by the indexing process).

Entities without the @Indexed annotation are ignored, that is, they are not indexed.

You can optionally specify the index attribute of the @Indexed annotation to change the default name of the index. For more information see Section 3.2, “Directory configuration”.

You can use the index attribute to change the default index name.

8. @Field

For each property (or attribute) of your entity, you can describe how it will be indexed. The default (no annotation present) means that the property is ignored by the indexing process. @Field declares a property as indexed and lets you configure several aspects of the indexing process through its attributes.

You can use @Field to describe each property of an entity class. A property without the @Field annotation is ignored. @Field can be refined with the following attributes:

name: describes under which name the property should be stored in the Lucene Document. The default value is the property name (following the JavaBeans convention).

name: the name under which the property is stored in the Lucene Document; defaults to the property name.

store: describes whether or not the property is stored in the Lucene index. You can store the value with Store.YES (consuming more space in the index but allowing projection, see Section 5.1.3.5, “Projection”), store it in a compressed way with Store.COMPRESS (this consumes more CPU), or avoid any storage with Store.NO (this is the default value). When a property is stored, you can retrieve its original value from the Lucene Document. This is not related to whether the element is indexed or not.

store: whether the entity's property is stored in the Lucene index.

Store.YES: stored in the index; needs more storage space but allows projection.

Store.COMPRESS: stored compressed; uses more CPU.

Store.NO: not stored; this is the default.

When a property is stored, you can retrieve its original value from the Lucene Document, regardless of whether the element is indexed.

index: describes how the element is indexed and the type of information stored. The different values are Index.NO (no indexing, i.e. the property cannot be found by a query), Index.TOKENIZED (use an analyzer to process the property), Index.UN_TOKENIZED (no analyzer pre-processing), and Index.NO_NORMS (do not store the normalization data). The default value is TOKENIZED.

index: how the entity's property is indexed and what information is stored.

Index.NO: not indexed, so the property cannot be found by a query.

Index.TOKENIZED: processed by an analyzer (tokenized) and indexed.

Index.UN_TOKENIZED: not tokenized.

Index.NO_NORMS: normalization data is not stored.

Note: text fields are usually tokenized, while date fields are not.

Fields used for sorting must not be tokenized.

termVector: used for similarity searches.
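To make the difference between Index.TOKENIZED and Index.UN_TOKENIZED concrete, here is a minimal sketch in plain Java. It does not use Lucene: the tokenized method is a naive stand-in for an analyzer (lower-casing and splitting on non-alphanumeric characters), which is an assumption for illustration and not how StandardAnalyzer behaves in detail. It suggests why a field used for sorting should stay un-tokenized: only then does the field hold one single term to order by.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class TokenizationDemo {

    // Naive stand-in for an analyzer: lower-case, then split on anything
    // that is not a letter or a digit. Real analyzers do more than this.
    static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Un-tokenized: the whole value is kept as one single term.
    static List<String> untokenized(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        String title = "Hibernate Search in Action";
        System.out.println(tokenized(title));   // [hibernate, search, in, action]
        System.out.println(untokenized(title)); // [Hibernate Search in Action]
    }
}
```

A query for the single word "search" can match the tokenized form, while the un-tokenized form only matches the exact full value.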

 

4.3. Analysis

The default analyzer class used to index tokenized fields is configurable through the hibernate.search.analyzer property. The default value for this property is org.apache.lucene.analysis.standard.StandardAnalyzer.

Using different analyzers within the same entity class is not recommended.

 

4.4. Bridges

In Lucene all index fields have to be represented as strings. All entity properties annotated with @Field have to be converted to strings to be indexed. The reason we have not mentioned it so far is that, for most of your properties, Hibernate Search does the translation job for you thanks to a set of built-in bridges. However, in some cases you need more fine-grained control over the translation process.

In Lucene every field is converted to a string; every property annotated with @Field is converted to a string and then indexed. We could ignore these conversions so far because Hibernate Search's built-in bridges were doing the work. Sometimes, however, you need finer-grained control over the conversion.

4.4.1. Built-in bridges

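The built-in bridges cover: null, String, Date, numeric types, URI/URL, and Class.

Hibernate Search comes bundled with a set of built-in bridges between a Java property type and its full text representation.

Hibernate Search bundles a set of bridges that convert between Java property types and their text representation.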

java.lang.String

Strings are indexed as is.

short, Short, int, Integer, long, Long, float, Float, double, Double, BigInteger, BigDecimal

Numbers are converted into their string representation. Note that numbers cannot be compared by Lucene (i.e. used in range queries) out of the box: they have to be padded.

Using a Range query is debatable and has drawbacks; an alternative approach is to use a Filter query which will filter the result query to the appropriate range.

Hibernate Search will support a padding mechanism.

Numbers are converted to strings. Range searches on numbers are discouraged and have drawbacks; a filter can be used to solve the range search problem.
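The padding requirement follows from the fact that strings compare character by character. The sketch below is plain Java with no Lucene involved; the pad helper is hypothetical and only handles non-negative values. It sorts the same numbers once as raw strings and once zero-padded to a fixed width.

```java
import java.util.Arrays;
import java.util.Locale;

class PaddingOrderDemo {

    // Hypothetical helper: left-pad a non-negative number with zeros to a fixed width.
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    public static void main(String[] args) {
        // Raw string forms sort lexicographically, not numerically.
        String[] raw = {"10", "9", "100", "2"};
        Arrays.sort(raw);
        System.out.println(Arrays.toString(raw));    // [10, 100, 2, 9]

        // Zero-padded forms sort in numeric order.
        String[] padded = {pad(10, 5), pad(9, 5), pad(100, 5), pad(2, 5)};
        Arrays.sort(padded);
        System.out.println(Arrays.toString(padded)); // [00002, 00009, 00010, 00100]
    }
}
```

With the raw forms, a string range from "2" to "10" is empty because "2" sorts after "10"; with padded forms the range from "00002" to "00010" behaves as expected.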

 

java.util.Date

Dates are stored as yyyyMMddHHmmssSSS in GMT time (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You shouldn't really bother with the internal format. What is important is that when using a DateRange Query, you should know that the dates have to be expressed in GMT time.

Usually, storing the date up to the millisecond is not necessary. @DateBridge defines the appropriate resolution you are willing to store in the index ( @DateBridge(resolution=Resolution.DAY) ). The date pattern will then be truncated accordingly.

@Field(index=Index.UN_TOKENIZED)
@DateBridge(resolution=Resolution.MINUTE)
private Date date;

Storing dates down to the millisecond is usually unnecessary. @DateBridge addresses this: you can choose a resolution such as DAY or MINUTE for date range searches, and the stored date is truncated accordingly (the unneeded precision is discarded).
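The GMT format and the effect of a coarser resolution can be reproduced with the JDK alone. This is a minimal sketch, assuming that a resolution simply shortens the yyyyMMddHHmmssSSS pattern (for example yyyyMMddHHmm for MINUTE); it does not call Hibernate Search or Lucene, and the class and method names are hypothetical. It uses the example instant from the text, 4:03 PM and 12 ms on Nov 7th 2006 at a fixed UTC-5 offset.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateIndexFormatDemo {

    // Format a date in GMT using the given pattern.
    static String formatGmt(Date date, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    // Nov 7th 2006, 16:03:00.012 at a fixed UTC-5 offset (EST).
    static Date sampleDate() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT-05:00"), Locale.ROOT);
        cal.clear();
        cal.set(2006, Calendar.NOVEMBER, 7, 16, 3, 0);
        cal.set(Calendar.MILLISECOND, 12);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date date = sampleDate();
        System.out.println(formatGmt(date, "yyyyMMddHHmmssSSS")); // 20061107210300012
        System.out.println(formatGmt(date, "yyyyMMddHHmm"));      // 200611072103
        System.out.println(formatGmt(date, "yyyyMMdd"));          // 20061107
    }
}
```

Because the stored value is in GMT, 4:03 PM EST appears as 21:03, which is why range query bounds have to be expressed in GMT as well.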

java.net.URI, java.net.URL

URI and URL are converted to their string representation.

java.lang.Class

Classes are converted to their fully qualified class name. The thread context classloader is used when the class is rehydrated.

4.4.2. Custom bridges

Sometimes, the built-in bridges of Hibernate Search do not cover some of your property types, or the String representation used by the bridge does not meet your requirements. The following paragraphs describe several solutions to this problem.

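Sometimes the built-in bridges cannot convert your entity's properties, or the conversion does not meet your requirements. The following paragraphs describe several ways to solve this problem.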

4.4.2.1. StringBridge

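Question: when an attachment is uploaded, could a field be set whose content is the text of the attachment, so that attachments can be searched?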

The simplest custom solution is to give Hibernate Search an implementation of your expected Object to String bridge. To do so you need to implement the org.hibernate.search.bridge.StringBridge interface. All implementations have to be thread-safe as they are used concurrently.

The simplest custom solution is to implement the Object-to-String bridge you need, by implementing the org.hibernate.search.bridge.StringBridge interface. All implementations must be thread-safe because they are used concurrently.

Example 4.15. Custom StringBridge implementation

import org.hibernate.search.bridge.StringBridge;

/**
 * Padding Integer bridge.
 * All numbers will be padded with 0 to match 5 digits
 *
 * @author Emmanuel Bernard
 */
public class PaddedIntegerBridge implements StringBridge {

    // Fixed width that every indexed number is padded to
    private static final int PADDING = 5;

    public String objectToString(Object object) {
        String rawInteger = ( (Integer) object ).toString();
        if ( rawInteger.length() > PADDING )
            throw new IllegalArgumentException( "Try to pad on a number too big" );
        StringBuilder paddedInteger = new StringBuilder();
        // Prepend zeros until the value is PADDING characters long
        for ( int padIndex = rawInteger.length(); padIndex < PADDING; padIndex++ ) {
            paddedInteger.append( '0' );
        }
        return paddedInteger.append( rawInteger ).toString();
    }
}

Given the string bridge defined in Example 4.15, “Custom StringBridge implementation”, any property or field can use this bridge thanks to the @FieldBridge annotation:

The custom string bridge above can be applied to any property or field through the @FieldBridge annotation, for example:

@FieldBridge(impl = PaddedIntegerBridge.class)
private Integer length;

 

4.4.2.1.1. Parameterized bridge

Parameters can also be passed to the bridge implementation making it more flexible. Example 4.16, “Passing parameters to your bridge implementation” implements a ParameterizedBridge interface and parameters are passed through the @FieldBridge annotation.

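Passing parameters makes a bridge more flexible. To do this, implement the ParameterizedBridge interface and pass the parameters through the @FieldBridge annotation.

An example follows: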

Example 4.16. Passing parameters to your bridge implementation

import java.util.Map;

import org.hibernate.search.bridge.ParameterizedBridge;
import org.hibernate.search.bridge.StringBridge;

public class PaddedIntegerBridge implements StringBridge, ParameterizedBridge {

    public static final String PADDING_PROPERTY = "padding";
    private int padding = 5; //default

    public void setParameterValues(Map parameters) {
        Object padding = parameters.get( PADDING_PROPERTY );
        // @Parameter values are Strings, so parse the value instead of casting it to Integer
        if ( padding != null ) this.padding = Integer.parseInt( padding.toString() );
    }

    public String objectToString(Object object) {
        String rawInteger = ( (Integer) object ).toString();
        if ( rawInteger.length() > padding )
            throw new IllegalArgumentException( "Try to pad on a number too big" );
        StringBuilder paddedInteger = new StringBuilder();
        // Prepend zeros until the value is `padding` characters long
        for ( int padIndex = rawInteger.length(); padIndex < padding; padIndex++ ) {
            paddedInteger.append( '0' );
        }
        return paddedInteger.append( rawInteger ).toString();
    }
}

//property
@FieldBridge(impl = PaddedIntegerBridge.class,
             params = @Parameter(name="padding", value="10")
            )
private Integer length;

The ParameterizedBridge interface can be implemented by StringBridge, TwoWayStringBridge, and FieldBridge implementations.

All implementations have to be thread-safe, but the parameters are set during initialization and no special care is required at this stage.

The ParameterizedBridge interface can be implemented by StringBridge, TwoWayStringBridge, FieldBridge and similar implementations. All of them must be thread-safe, but the parameters are set at initialization time, so nothing special is needed at that stage.
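The parameter handling can be tried without Hibernate Search on the classpath. The class below is a hypothetical, dependency-free stand-in that copies the shape of the bridge from Example 4.16; it does not implement the real StringBridge or ParameterizedBridge interfaces. It assumes parameter values arrive as strings, as they do when written in an annotation like value="10", so the value is parsed and not cast.

```java
import java.util.HashMap;
import java.util.Map;

class PaddingParameterDemo {

    static final String PADDING_PROPERTY = "padding";
    private int padding = 5; // default width

    // Mirrors setParameterValues: the configured value is a String such as "10".
    void setParameterValues(Map<String, String> parameters) {
        String value = parameters.get(PADDING_PROPERTY);
        if (value != null) {
            this.padding = Integer.parseInt(value);
        }
    }

    // Mirrors objectToString: left-pad the number with zeros up to the configured width.
    String objectToString(Object object) {
        String rawInteger = ((Integer) object).toString();
        if (rawInteger.length() > padding) {
            throw new IllegalArgumentException("Try to pad on a number too big");
        }
        StringBuilder paddedInteger = new StringBuilder();
        for (int padIndex = rawInteger.length(); padIndex < padding; padIndex++) {
            paddedInteger.append('0');
        }
        return paddedInteger.append(rawInteger).toString();
    }

    public static void main(String[] args) {
        PaddingParameterDemo bridge = new PaddingParameterDemo();
        System.out.println(bridge.objectToString(42)); // 00042

        Map<String, String> params = new HashMap<>();
        params.put(PADDING_PROPERTY, "10");
        bridge.setParameterValues(params);
        System.out.println(bridge.objectToString(42)); // 0000000042
    }
}
```

Note that this simple padding does not handle negative numbers sensibly: the minus sign would end up after the leading zeros.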

4.4.2.1.2. Type aware bridge