Cloudant With Lucene Search Fails To Sort As Expected

I am fairly new to Cloudant but have developed in SQL on DB2 for some time. I am running into an issue where I *think* I am using the Lucene query engine and Cloudant indexes to return results from my query. The query returns all the results I want, but they are not sorted correctly. I want to sort the results alphabetically by the "officialName" field. Because we return only the first 21 of n results (a JS handler then fetches more results via paging), we cannot sort on the Java side and must sort in Cloudant. Our application is written in Java and runs on IBM Bluemix with WebSphere Liberty Profile. I have packaged the cloudant-client-2.8.0.jar and cloudant-HTTP-2.8.0.jar files to access the Cloudant database. We have many queries that work, so the connection itself is fine.

Here is the code that builds the Cloudant Client search object:

Search search = getCloudantDbForOurApp().search("bySearchPP-ddoc/bySearchPP-indx").includeDocs(true);
SearchResult<DeliverableDetails> result = search.sort(getSortJsonString(searchString)).querySearchResult(getSearchQuery(searchString), DeliverableDetails.class);

Here is the getSortJsonString method. Note that the search string is typically NOT null. Leaving in or taking out the -score attribute does affect the search, but it never produces alphabetically sorted results.

private String getSortJsonString(String searchString) {
    String sortJson;
    if (searchString != null && !searchString.isEmpty()) {
        sortJson = "[\"-<score>\",\"officialName<string>\"]";
    } else {
        sortJson = "\"officialName<string>\"";
    }
    return sortJson;
}

Here is the relevant code from the getSearchQuery method, for reference:

...
query += "(";
query += "officialName:" + searchString + "^3";
query += " OR " + "deliverableName:" + searchString + "^3";
query += " OR " + "alias:" + searchString + "^3";
query += " OR " + "contact:" + searchString;
query += ")";
....
// The query will look like the line below, where <search_string> is a user-supplied value
// (officialName:<search_string>*^3 OR deliverableName:<search_string>*^3 OR alias:<search_string>*^3 OR contact:<search_string>*)
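To make the shape of the generated query concrete, here is a minimal, self-contained sketch. It is not the original getSearchQuery method, whose surrounding code is elided above. The class name SearchQuerySketch, the method buildQuery, and the step that appends the trailing wildcard are assumptions made for illustration only.

```java
import java.util.StringJoiner;

public class SearchQuerySketch {

    // Hypothetical stand-in for the elided getSearchQuery logic.
    // Assumption: a trailing wildcard is appended to the user's term,
    // and the three name-like fields get a ^3 query-time boost.
    public static String buildQuery(String term) {
        String wildcard = term + "*";
        StringJoiner clauses = new StringJoiner(" OR ", "(", ")");
        clauses.add("officialName:" + wildcard + "^3");
        clauses.add("deliverableName:" + wildcard + "^3");
        clauses.add("alias:" + wildcard + "^3");
        clauses.add("contact:" + wildcard);
        return clauses.toString();
    }

    public static void main(String[] args) {
        // Prints the query string that would be sent for the term "acme".
        System.out.println(buildQuery("acme"));
    }
}
```

For the term "acme" this yields the same pattern as the comment above, with "acme" in place of <search_string>.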

I have set up a design doc and index using the Cloudant dashboard as follows:

{
"_id": "_design/bySearchPP-ddoc",
"_rev": "4-a91fc4ddeccc998c58adb487a121c168",
"views": {},
"language": "javascript",
"indexes": {
  "bySearchPP-indx": {
    "analyzer": {
      "name": "perfield",
      "default": "standard",
      "fields": {
        "alias": "simple",
        "contact": "simple",
        "deploymentTarget": "keyword",
        "businessUnit": "keyword",
        "division": "keyword",
        "officialName": "simple",
        "deliverableName": "simple",
        "pid": "keyword"
      }
    },
    "index": "function(doc) {
              if (doc.docType === \"Page\") {
                index(\"officialName\", doc.officialName, {\"store\":true, \"boost\":4.0});
                index(\"deliverableName\", doc.deliverableName, {\"store\":true, \"boost\":3.0});
                if (doc.aliases) {
                  for (var i in doc.aliases) {
                    index(\"alias\", doc.aliases[i], {\"store\":true, \"boost\":2.0});
                  }
                }
                if (doc.allContacts) {
                  for (var j in doc.allContacts) {
                    index(\"contact\", doc.allContacts[j], {\"store\":true, \"boost\":0.5});
                    }
                }
                index(\"deploymentTarget\", doc.deploymentTarget, {\"store\":true});
                index(\"businessUnit\", doc.businessUnit, {\"store\":true});
                index(\"division\", doc.division, {\"store\":true});
                index(\"pid\", doc.pid.toLowerCase(), {\"store\":true});
             }
          }"
     }
   }
 }

I am not sure whether the sort is working but just not the way I want it to, or whether I have misconfigured something. Either way, any help would be greatly appreciated. -Doug

Solved my own issue with help from the comments above. Apparently everything was set up correctly, but once I debugged as @markwatsonatx suggested, I could see that the field I wanted was not being returned. After some digging online, I found that for sorting, the field must be both indexed and NOT tokenized. I checked my index and noticed that the field was being analyzed by the simple analyzer. I changed it to the keyword analyzer and the sort works as expected. Hope this helps someone.
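As a rough illustration of why the analyzer matters for sorting, the sketch below imitates the two analyzers in plain Java. It is not Lucene or Cloudant code, and the class and method names are made up for this example. A simple-style analyzer splits a value at non-letters and lowercases it, so one name becomes several terms; a keyword-style analyzer keeps the whole value as a single term, which is what a sort on that field needs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class AnalyzerSketch {

    // Rough imitation of a "simple" analyzer: split at non-letters, lowercase.
    public static List<String> simpleTokens(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.split("[^\\p{L}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Rough imitation of a "keyword" analyzer: the whole value is one token.
    public static List<String> keywordTokens(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        String name = "Cloud Billing Service"; // made-up officialName value
        System.out.println(simpleTokens(name));  // three separate terms
        System.out.println(keywordTokens(name)); // one term, usable as a sort key
    }
}
```

One thing to check after switching officialName to the keyword analyzer: the field is then indexed as a single, case-preserved term, which may change how queries such as officialName:<search_string>* match. If that turns out to be a problem, one possible option (an assumption, not something tested in this thread) is to keep officialName tokenized for searching and index a second, keyword-analyzed field used only for sorting.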
