zuroc

Tokyo Dystopia Full-Text Search

 
http://d.hatena.ne.jp/perezvon/20080921/1222016246


>>> from tokyodystopia import TokyoDystopia
>>> db = TokyoDystopia("/tmp/test.db", 255)
>>> db.put(0, u"仙台".encode("utf8"), " ")
1
>>> db.put(1, u"仙台 広島".encode("utf8"), " ")
1
>>> db.put(2, u"広島 山形 湘南 山形".encode("utf8"), " ")
1
>>> db.search(u"仙台".encode("utf-8"))
2
[0L, 1L]
>>> db.search(u"広島".encode("utf-8"))
2
[1L, 2L]
>>> db.close()


The function `tcidbsearch2' searches with a compound expression. In the compound expression, tokens are separated by one or more white space characters. If one token is specified, the search returns records that include the specified pattern. Matching is case-insensitive, and accent marks and diacritical marks are ignored. If two or more tokens are specified, the search returns records that include all of the patterns. The compound expression supports the following sub expressions.

    * A B : searches for records including the two tokens.
    * A && B : searches for records including the two tokens.
    * A || B : searches for records including one or both of the two tokens.
    * "A B..." : searches for records including the phrase.
    * [A] : searches for records including words exactly matching the token.
    * [A*] : searches for records including words beginning with the token.
    * [*A] : searches for records including words ending with the token.
    * [[A : searches for records beginning with the token.
    * A]] : searches for records ending with the token.
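The bracket forms above can be modelled in a few lines of plain Python. This is only a sketch of the matching rules as described, not the Tokyo Dystopia implementation: `match_token` is a hypothetical helper, "words" are assumed to be whitespace-separated, and accent folding is left out.

```python
def match_token(token, text):
    """Approximate one search token against one record (illustrative only)."""
    t = token.lower()          # case is not distinguished
    s = text.lower()
    words = s.split()          # assumption: words are whitespace-separated

    if t.startswith("[["):     # [[A : record begins with the token
        return s.startswith(t[2:])
    if t.endswith("]]"):       # A]] : record ends with the token
        return s.endswith(t[:-2])
    if t.startswith("[") and t.endswith("]"):
        inner = t[1:-1]
        if inner.startswith("*"):   # [*A] : a word ends with the token
            return any(w.endswith(inner[1:]) for w in words)
        if inner.endswith("*"):     # [A*] : a word begins with the token
            return any(w.startswith(inner[:-1]) for w in words)
        return inner in words       # [A] : a word matches exactly
    return t in s              # plain token: substring match
```

For example, `match_token(u"[仙台]", u"仙台 広島")` is true, while `match_token(u"[[広島", u"仙台 広島")` is false because the record does not begin with the token.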

Note that "||" has higher precedence than "&&", so "A || B && C" is read as "(A || B) && C".
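To make the precedence rule concrete, here is a small pure-Python sketch that groups tokens the way the note describes: "||" binds first, then the groups are combined with AND. The names `parse_groups` and `matches` are hypothetical, only plain tokens are handled (no phrases or bracket forms), and substring matching stands in for the real index lookup.

```python
def parse_groups(expr):
    """Split an expression into AND-ed groups of OR-ed alternatives."""
    groups = []
    join_with_or = False
    for tok in expr.lower().split():
        if tok == "||":
            join_with_or = True       # next token joins the current group
        elif tok == "&&":
            join_with_or = False      # next token starts a new group
        elif join_with_or and groups:
            groups[-1].append(tok)
            join_with_or = False
        else:
            groups.append([tok])      # adjacency also means AND
            join_with_or = False
    return groups


def matches(expr, text):
    """True if every group has at least one alternative found in text."""
    text = text.lower()
    return all(any(alt in text for alt in group)
               for group in parse_groups(expr))
```

With this model, `parse_groups("a || b && c")` yields `[['a', 'b'], ['c']]`, i.e. `(a || b) && c`, rather than `a || (b && c)`.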
