solr filter query on document value

My very long query strings are returning a 414 HTTP response, and I'm looking for a way around that. Some queries can reach 10,000 characters. I could raise the limit that Apache/Jetty allows, but I'd rather not let anyone send 10,000 characters to my web server.

Is there a way in Solr to save a large query string in a document and use it in a filter query?

select?q=*:*&fq=id:123 would return the whole document, but is there a way to use the value of a field in document 123 inside the query?

The field queryValue in the document with the id 123 would be Intersects((LONGSTRING))

So is there a way to do something like select?q=*:*&fq=foo:{id:123.queryValue}

which would be the same as select?q=*:*&fq=foo:Intersects((LONGSTRING))?

Two possibilities:

Joining

You can use the Join query parser to fetch the result from one collection / core and use it to filter results in a different core. There are several limitations that become relevant with larger installations and data sizes, so you'll have to experiment to see whether this works for your use case.

The Join Query Parser
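
As a rough sketch of what such a filter could look like, the snippet below builds an fq value in the join parser's local-params form. The core name saved_queries is hypothetical, and the field names are borrowed from the question purely for illustration. As far as I know, the join matches the indexed values of the from field against the to field, so it would not evaluate a stored Intersects(...) expression as a spatial predicate; treat this as something to experiment with rather than a drop-in answer.

```python
from urllib.parse import urlencode


def build_join_filter(from_field, to_field, from_index, inner_query):
    """Return an fq value in the join parser's local-params syntax."""
    return "{{!join from={} to={} fromIndex={}}}{}".format(
        from_field, to_field, from_index, inner_query
    )


# "saved_queries" is a hypothetical core that holds document 123.
params = {
    "q": "*:*",
    "fq": build_join_filter("queryValue", "foo", "saved_queries", "id:123"),
}

# URL-encoded form, ready to append to .../select?
query_string = urlencode(params)
print(params["fq"])
```

The URL stays short because only id:123 is sent; the long value lives in the other core.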

Hashing

As long as you're only doing exact matches, hash the string on the client side both when indexing and when querying. Exactly how you do this depends on your language of choice. In Python you'd hash the long string with hashlib; with sha256 the resulting string, which you use for both indexing and querying, is 64 characters in hex form or 44 characters in base64.

Example:

>>> import hashlib
>>> hashlib.sha256(b"long_query_string_here").hexdigest()
'19c9288c069c47667e2b33767c3973aefde5a2b52d477e183bb54b9330253f1e'

You would then store the 19c92... value in Solr, and apply the same transformation to the value you're querying for:

fq=hashed_id:19c9288c069c47667e2b33767c3973aefde5a2b52d477e183bb54b9330253f1e
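
A minimal sketch of using the same helper on both the indexing and the querying side, so the two hashes always agree. The field name hashed_id comes from the example above; the helper names and the sample document are hypothetical.

```python
import base64
import hashlib


def hash_key(value):
    """Hex SHA-256 of the value: always 64 characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def hash_key_b64(value):
    """Base64 SHA-256 of the value: always 44 characters, padding included."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


long_value = "Intersects((LONGSTRING))"

# Indexing side: store the hash alongside the document (hypothetical document).
doc = {"id": "123", "hashed_id": hash_key(long_value)}

# Query side: apply the same transformation before building the filter.
fq = "hashed_id:" + hash_key(long_value)

print(len(hash_key(long_value)), len(hash_key_b64(long_value)))
```

The hex form is the simpler choice here: standard base64 can contain +, / and =, which would need escaping both in the URL and in the Solr query syntax.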

There are alternatives worth trying before the literal solution you're asking for:

  1. You can POST the query to Solr instead of using GET. The parameters then travel in the request body, so the URL length limit does not apply.
  2. If you are sending a long list of ids joined with OR, there are alternative query parsers that make this more efficient (e.g. TermsQueryParser).
  3. If you have constant (or semi-constant) query parameters, you could factor them out into defaults on request handlers (in solrconfig.xml). You can create as many request handlers as you want and defaults can be overridden, so this effectively lets you pre-define classes/types of queries.
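
The first two options can be sketched together as follows. The URL and core name are assumptions, and the request is only built here, not sent. The id list is filtered with the terms query parser instead of a long OR chain, and the parameters go into a form-encoded POST body.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Solr endpoint; adjust host and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_post_request(params):
    """Build a POST request carrying the query parameters in the body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        SOLR_SELECT_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Terms query parser: a comma-separated list instead of id:1 OR id:2 OR ...
ids = ["1", "2", "3"]
terms_fq = "{!terms f=id}" + ",".join(ids)

request = build_post_request({"q": "*:*", "fq": terms_fq})
# To actually send it: urllib.request.urlopen(request)
```

Since the parameters are in the body, the web server's URL length limit can stay at its default.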
