
Lucene & Solr


Solr query params (see CommonParams.class in solrj and QueryParsing.class in solr-core):
Apache Lucene™ 4.4.0 Documentation:
http://lucene.apache.org/core/4_4_0/index.html
http://khaidoan.wikidot.com/solr

http://www.cnblogs.com/TerryLiang/archive/2012/08/30/2664483.html
http://www.cnblogs.com/ukouryou/articles/2683463.html
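Quote:
Note that the boolean operators AND, OR, and NOT must be written in uppercase in Solr, where they are case-sensitive; + and - are the symbol forms of the required and prohibited operators.
q.op is the query's default operator. It can also be set in schema.xml:
<solrQueryParser defaultOperator="AND"/>
If it is not specified, the default operator is OR.

To be continued...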
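example (Jersey 1.x client; StringUtils is assumed to be Spring's org.springframework.util.StringUtils):
        // Imports this snippet needs
        import javax.ws.rs.core.MediaType;
        import javax.ws.rs.core.MultivaluedMap;
        import com.sun.jersey.api.client.WebResource;
        import com.sun.jersey.api.uri.UriComponent;
        import com.sun.jersey.core.util.MultivaluedMapImpl;
        import org.springframework.util.StringUtils;

        String uri = UriComponent.encode(this.solrUserQueryUrl, UriComponent.Type.QUERY);
        WebResource wr = jerseyClient.resource(uri);

        MultivaluedMap<String, String> formData = new MultivaluedMapImpl();
        // q holds the user IDs separated by spaces; q.op=OR makes any one of them match
        formData.add("q", StringUtils.collectionToDelimitedString(userIdList, " "));
        formData.add("q.op", "OR");
        formData.add("df", "userID");          // default field for the bare terms in q
        formData.add("fq", "flag:(s OR t)");   // filter query: flag must be s or t
        formData.add("fl", "userID");          // return only the userID field
        formData.add("rows", String.valueOf(userIdList.size()));
        formData.add("wt", "json");
        formData.add("indent", "true"); // only for formatting output

        // POST the params as an application/x-www-form-urlencoded body
        String respStr = wr.type(MediaType.APPLICATION_FORM_URLENCODED).post(String.class, formData);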
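
The request above relies on q.op and df to turn a space-separated list of IDs into an OR query on userID. As a rough sketch, the same query can be written out explicitly with the uppercase operator. BooleanQueryBuilder and fieldQuery are hypothetical names (not part of SolrJ), and the terms are assumed to contain no characters that need escaping:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of SolrJ: builds an explicit boolean query string.
class BooleanQueryBuilder {

    // Joins the terms with an uppercase operator, e.g. userID:(1001 OR 1002).
    // Assumes the terms need no query-syntax escaping.
    static String fieldQuery(String field, List<String> terms, String operator) {
        // Uppercase the operator so the parser treats it as an operator, not a search term.
        String op = operator.trim().toUpperCase(Locale.ROOT);
        if (!op.equals("AND") && !op.equals("OR")) {
            throw new IllegalArgumentException("operator must be AND or OR: " + operator);
        }
        return field + ":(" + String.join(" " + op + " ", terms) + ")";
    }

    public static void main(String[] args) {
        // prints userID:(1001 OR 1002 OR 1003)
        System.out.println(fieldQuery("userID", Arrays.asList("1001", "1002", "1003"), "or"));
    }
}
```

Passing the result as q makes the query independent of whatever default operator the server is configured with.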


solr-core Constant Field Values:
http://lucene.apache.org/solr/4_1_0/solr-core/constant-values.html


Mixing boolean operators such as AND and OR in a query:
http://robotlibrarian.billdueber.com/solr-and-boolean-operators/


CloudSolrServer & collection:
http://wiki.apache.org/solr/Solrj#Using_with_SolrCloud
Quote:
SolrJ includes a 'smart' client for SolrCloud, which is ZooKeeper aware. This means that your Java application only needs to know about your Zookeeper instances, and not where your Solr instances are, as this can be derived from ZooKeeper.
To interact with SolrCloud, you should use an instance of CloudSolrServer, and pass it your ZooKeeper host or hosts.
Beyond the instantiation of the CloudSolrServer, the behaviour should be the same as regular SolrJ.
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

CloudSolrServer server = new CloudSolrServer("localhost:9983");
server.setDefaultCollection("collection1");
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", "1234");
doc.addField( "name", "A lovely summer holiday");
server.add(doc);
server.commit();
https://issues.apache.org/jira/browse/SOLR-4046
Quote:
When querying through a CloudSolrServer instance, specify the collection like this:
solrQuery.add(CoreAdminParams.COLLECTION, "collection_name");




Solr requests: GET vs. POST:
With SolrJ, if a query is submitted via POST:
QueryResponse response = solrServer.query(solrQuery, METHOD.POST);
then the SolrParams (the superclass of SolrQuery) are sent to the Solr server as a form body; see HttpSolrServer.request(request, responseParser) in solrj.jar:
  // Excerpt from HttpSolrServer.request(request, responseParser)
  public NamedList<Object> request(final SolrRequest request,
      final ResponseParser processor) throws SolrServerException, IOException {
  ...
      try {
      while( tries-- > 0 ) {
        // Note: since we aren't do intermittent time keeping
        // ourselves, the potential non-timeout latency could be as
        // much as tries-times (plus scheduling effects) the given
        // timeAllowed.
        try {
          if( SolrRequest.METHOD.GET == request.getMethod() ) {
            if( streams != null ) {
              throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "GET can't send streams!" );
            }
            method = new HttpGet( baseUrl + path + ClientUtils.toQueryString( params, false ) );
          }
          else if( SolrRequest.METHOD.POST == request.getMethod() ) {

            String url = baseUrl + path;
            boolean isMultipart = ( streams != null && streams.size() > 1 );

            LinkedList<NameValuePair> postParams = new LinkedList<NameValuePair>();
            if (streams == null || isMultipart) {
              HttpPost post = new HttpPost(url);
              post.setHeader("Content-Charset", "UTF-8");
              if (!this.useMultiPartPost && !isMultipart) {
                post.addHeader("Content-Type",
                    "application/x-www-form-urlencoded; charset=UTF-8");
              }

              List<FormBodyPart> parts = new LinkedList<FormBodyPart>();
              Iterator<String> iter = params.getParameterNamesIterator();
              while (iter.hasNext()) {
                String p = iter.next();
                String[] vals = params.getParams(p);
                if (vals != null) {
                  for (String v : vals) {
                    if (this.useMultiPartPost || isMultipart) {
                      parts.add(new FormBodyPart(p, new StringBody(v, Charset.forName("UTF-8"))));
                    } else {
                      postParams.add(new BasicNameValuePair(p, v));
                    }
                  }
                }
              }

              if (isMultipart) {
                for (ContentStream content : streams) {
                  String contentType = content.getContentType();
                  if(contentType==null) {
                    contentType = "application/octet-stream"; // default
                  }
                  parts.add(new FormBodyPart(content.getName(), 
                       new InputStreamBody(
                           content.getStream(), 
                           contentType, 
                           content.getName())));
                }
              }
              
              if (parts.size() > 0) {
                MultipartEntity entity = new MultipartEntity(HttpMultipartMode.STRICT);
                for(FormBodyPart p: parts) {
                  entity.addPart(p);
                }
                post.setEntity(entity);
              } else {
                //not using multipart
                post.setEntity(new UrlEncodedFormEntity(postParams, "UTF-8"));
              }

              method = post;
  ...
  }
http://wiki.apache.org/solr/ContentStream
Quote:
If the contentType is "application/x-www-form-urlencoded" the full POST body is parsed as parameters and included in the SolrParams.
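
To make the form-encoded case concrete, here is a minimal sketch of what such a POST body looks like, using only the JDK. FormBodySketch and encode are hypothetical names; this only approximates what UrlEncodedFormEntity produces in the excerpt above and is not the SolrJ implementation:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not SolrJ code: renders params as a form-urlencoded body.
class FormBodySketch {

    // Encodes each name and value as UTF-8 and joins the pairs with '&'.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "1001 1002");
        params.put("q.op", "OR");
        params.put("fq", "flag:(s OR t)");
        // prints q=1001+1002&q.op=OR&fq=flag%3A%28s+OR+t%29
        System.out.println(encode(params));
    }
}
```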




Spring Data Solr - a small layer above solrj providing some fluent API and repository abstractions:
http://stackoverflow.com/questions/15307737/difference-between-solr-core-solrj-spring-data-solr
http://static.springsource.org/spring-data/data-solr/docs/current-SNAPSHOT/reference/htmlsingle/



Solr errors:
1. Too many boolean clauses
http://stackoverflow.com/questions/3802367/solr-search-for-lots-of-values
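
This error shows up when a query holds more clauses than the maxBooleanClauses setting allows (commonly 1024 in solrconfig.xml). One possible workaround, besides raising that limit, is to split a long ID list into several smaller OR queries. A minimal sketch, where ClauseBatcher and batchQueries are hypothetical names and the IDs are assumed to need no escaping:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr or SolrJ.
class ClauseBatcher {

    // Splits ids into consecutive batches of at most maxClauses terms and
    // returns one OR query per batch, e.g. userID:(1 OR 2).
    static List<String> batchQueries(String field, List<String> ids, int maxClauses) {
        if (maxClauses < 1) {
            throw new IllegalArgumentException("maxClauses must be at least 1");
        }
        List<String> queries = new ArrayList<String>();
        for (int start = 0; start < ids.size(); start += maxClauses) {
            int end = Math.min(start + maxClauses, ids.size());
            queries.add(field + ":(" + String.join(" OR ", ids.subList(start, end)) + ")");
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("1", "2", "3", "4", "5");
        // prints [userID:(1 OR 2), userID:(3 OR 4), userID:(5)]
        System.out.println(batchQueries("userID", ids, 2));
    }
}
```

Each returned string would be sent as its own query and the results merged on the client side.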


An example of a custom SolrServerFactory:
http://sillycat.iteye.com/blog/1530915