
lucene.net - search term too short?

Hello, I have an application that uses Lucene. Searching for terms like "a", "a*", "an", "an*", ... throws an error:

Ausnahmedetails: Lucene.Net.Search.BooleanQuery+TooManyClauses: Systemfehler

Quellfehler:

Zeile 130:            
Zeile 131:            Dim searcher As IndexSearcher = New IndexSearcher(rootpath + "\" + index_root) 'Suche auf diesem Verzeichnis
Zeile 132:            Dim hits As Hits = searcher.Search(query)
Zeile 133:   
Zeile 134: 

But terms that contain three or more letters don't throw an error.

I'm really confused about that.

More code:

Public Sub lucene_search(ByVal strSuchbegriff As String)

        Dim parser As QueryParser
        Dim query As Query

        If (check_volltextsuche.Checked = True And check_dateinamensuche.Checked = False) Then

            'Full-text search only: "bodytext" is the field that is searched
            parser = New QueryParser("bodytext", analyzer)

            Try
                query = parser.Parse(strSuchbegriff)
            Catch
                'Invalid query syntax: show a message and skip the search
                meldung.Text = "Falsche Verwendung der Suchsyntax"
                query = parser.Parse("Suchsyntax")
                ItemsGrid.Visible = False
                myexception = True
            End Try

        ElseIf (check_volltextsuche.Checked = False And check_dateinamensuche.Checked = True) Then

            'File-name search only
            parser = New QueryParser("title", analyzer)

            Try
                query = parser.Parse(strSuchbegriff) '* matches what follows --> search the whole file name
            Catch
                meldung.Text = "Falsche Verwendung der Suchsyntax"
                query = parser.Parse("Suchsyntax")
                ItemsGrid.Visible = False
                myexception = True
            End Try

        Else

            'Search both fields
            parser = New MultiFieldQueryParser(New [String]() {"title", "bodytext"}, New StandardAnalyzer())

            Try
                query = parser.Parse(strSuchbegriff)
            Catch
                meldung.Text = "Falsche Verwendung der Suchsyntax"
                query = parser.Parse("Suchsyntax")
                ItemsGrid.Visible = False
                myexception = True
            End Try

        End If

        '################
        'Do the search ##
        '################

        If myexception = False Then

            Dim searcher As IndexSearcher = New IndexSearcher(rootpath + "\" + index_root) 'search the index in this directory
            Dim hits As Hits = searcher.Search(query) '<-- ERROR

Thanks in advance :>

Are you sure it's failing for just "a" as well?

For the "a*" and "an*" cases, this fails because Lucene treats the expression as a prefix search and basically rewrites it into a giant "OR" query over all of the terms in the index that start with "a" (or "an"). So if you have "aardvark", "antler", "animal", etc., then "a*" is the same as "aardvark OR antler OR animal OR ...".
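The expansion described above can be illustrated with a toy model. This is only a sketch: `IndexTerms` and `ExpandPrefix` are hypothetical names standing in for the index's term dictionary and for Lucene's internal rewrite step; they are not part of the Lucene.Net API.

```vbnet
Imports System
Imports System.Collections.Generic

Module PrefixExpansionSketch
    ' Toy stand-in for the terms stored in the index (an assumption, not Lucene code).
    Private ReadOnly IndexTerms As String() = New String() {"aardvark", "animal", "antler", "bear", "zebra"}

    ' Collects every indexed term that starts with the given prefix.
    Function ExpandPrefix(ByVal prefix As String) As List(Of String)
        Dim matches As New List(Of String)
        For Each term As String In IndexTerms
            If term.StartsWith(prefix, StringComparison.Ordinal) Then
                matches.Add(term)
            End If
        Next
        Return matches
    End Function

    Sub Main()
        ' "a*" turns into one OR clause per matching term.
        Console.WriteLine(String.Join(" OR ", ExpandPrefix("a").ToArray()))
        ' "an*" matches fewer terms, so it produces fewer clauses.
        Console.WriteLine(String.Join(" OR ", ExpandPrefix("an").ToArray()))
    End Sub
End Module
```

The shorter the prefix, the more terms match, so the number of clauses grows with the size of the index rather than with the length of what the user typed.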

Lucene also has a limit on the number of terms you can combine in an "OR" query. By default it's quite small (1024 clauses, because too many terms can severely affect performance), and if there are too many terms, it throws the BooleanQuery+TooManyClauses exception that you found.
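Until the query itself is restricted, the exception can at least be caught so the page shows a message instead of crashing. The sketch below assumes the same older Lucene.Net API that the question uses (`Hits`, `IndexSearcher.Search(Query)`); `TrySearch` is a hypothetical helper name.

```vbnet
Imports Lucene.Net.Search

Module SearchGuard
    ' Hypothetical helper: runs the query and returns Nothing when the
    ' wildcard expansion exceeds the clause limit.
    Function TrySearch(ByVal searcher As IndexSearcher, ByVal query As Query) As Hits
        Try
            Return searcher.Search(query)
        Catch ex As BooleanQuery.TooManyClauses
            ' The prefix matched more terms than the limit allows.
            Return Nothing
        End Try
    End Function
End Module
```

A caller would check for `Nothing` and ask the user to type a longer search term.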

You'll probably find that a query like "x*" or "qr*" does not throw an exception: this is because you (likely) don't have many terms that start with "x" or "qr".

To fix the problem you have a couple of options:

  1. Restrict the query: simply don't allow single-letter prefix queries
  2. Increase the maximum clause count by calling setMaxClauseCount first (I would try to avoid this, though, since it can affect performance, as I said)
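The first option could be sketched as a check that runs before the search text reaches the query parser. Everything here is an assumption for illustration: the name `HasTooShortPrefix`, the minimum of three characters (chosen because the question reports that "an*" fails too), and the simple whitespace splitting, which ignores field prefixes such as `title:a*` and other query syntax.

```vbnet
Imports System

Module ShortPrefixGuard
    ' Hypothetical minimum; tune it to the size of your index.
    Private Const MinPrefixLength As Integer = 3

    ' Returns True when any whitespace-separated term is a trailing-wildcard
    ' query whose prefix is shorter than MinPrefixLength.
    Function HasTooShortPrefix(ByVal searchText As String) As Boolean
        Dim separators As Char() = New Char() {" "c}
        For Each term As String In searchText.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            If term.EndsWith("*", StringComparison.Ordinal) AndAlso term.Length - 1 < MinPrefixLength Then
                Return True
            End If
        Next
        Return False
    End Function

    Sub Main()
        Console.WriteLine(HasTooShortPrefix("a*"))        ' True
        Console.WriteLine(HasTooShortPrefix("ant* bear")) ' False
    End Sub
End Module
```

In the question's code, such a check would be called on strSuchbegriff before parser.Parse, setting meldung.Text and skipping the search when it returns True.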
