
SOLR Case Insensitive Field search issue

What I want to achieve is that when I search for test, the results also include Test, TeSt, TesT, and TEST, i.e. a case-insensitive search. What should I do?

I have this textgen type in my schema.xml, which is assigned to test_field:

<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="select">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Here are the results that I want to receive from my query:

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"test_field:*",
      "indent":"true",
      "wt":"json"}},
  "response":{"numFound":5,"start":0,"docs":[
      {
        "id":"change.me",
        "test_field":["test"],
        "_version_":1546932094148542464},
      {
        "id":"change.me1",
        "test_field":["tesT"],
        "_version_":1546932100203020288},
      {
        "id":"change.me2",
        "test_field":["TesT"],
        "_version_":1546932103122255872},
      {
        "id":"change.me3",
        "test_field":["TEsT"],
        "_version_":1546932107768496128},
      {
        "id":"change.me4",
        "test_field":["TEST"],
        "_version_":1546932111283322880}]
  }}

When I use this query, it does not return the expected results: it behaves as if it were case sensitive, even though the field type has the LowerCaseFilterFactory filter.

http://localhost:8983/solr/test-data/select?q=test_field:*test*&wt=json&indent=true

And the result, which contains only the all-lowercase document (what am I doing wrong?):

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"test_field:*test*",
      "indent":"true",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"change.me",
        "test_field":["test"],
        "_version_":1546932094148542464}]
  }}

Are you actually putting stars (wildcards) at both ends of your search term? You should not need to do that. The whole point of Solr configuration is to tokenize your text in such a way that you can just search for words without wildcards.

If you just search for a word in your text, it should work, including mixed-case matching. If not, check that your field is actually mapped to the right type and that you reindexed. If you are still confused, the Solr Admin UI has an Analysis screen where you can select your fields (or your field types) and see how something is tokenized and how it is matched. You can experiment there.
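
As a rough illustration of the mapping check, the declaration of test_field in schema.xml would look something like the sketch below. The indexed, stored, and multiValued values are assumptions (multiValued is only guessed from the array values in the responses above); the part that matters is that type names the textgen field type from the question.

```xml
<!-- Hypothetical field declaration: only name="test_field" and type="textgen"
     come from the question; the remaining attribute values are assumed. -->
<field name="test_field" type="textgen" indexed="true" stored="true" multiValued="true"/>
```

If the field points at textgen and the documents have been reindexed, a wildcard-free query such as q=test_field:test should, under these assumptions, match all five documents, because both the index-time and query-time analyzers lowercase their tokens.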
