Solr query - Is there a way to limit the size of a text field in the response

Original 2011-01-25 11:16:21 8 5 full-text-search / solr

Is there a way to limit the amount of text in a text field from a query? Here's a quick scenario....

I have 2 fields:

docId - int
text - string.

I will query the docId field and want to get a "preview" text from the text field of 200 chars. On average, the text field has anything from 600-2000 chars but I only need a preview.

e.g. [mySolrCore]/select?q=docId:123&fl=text

Is there any way to do it since I don't see the point of bringing back the entire text field if I only need a small preview?

I'm not looking at hit highlighting since I'm not searching for specific text within the text field, but if there is functionality similar to the hl.fragsize parameter it would be great!

Hope someone can point me in the right direction!

Cheers!

5 answers

You would have to test the performance of this work-around versus just returning the entire field, but it might work for your situation. Basically, turn on highlighting on a field that won't match, and then use the alternate field to return the limited number of characters you want.

http://solr:8080/solr/select/?q=*:*&rows=10&fl=author,title&hl=true&hl.snippets=0&hl.fl=sku&hl.fragsize=0&hl.alternateField=description&hl.maxAlternateFieldLength=50

Notes:

Make sure your alternate field does not exist in the field list (fl) parameter
Make sure your highlighting field (hl.fl) does not actually contain the text you want to search

I find that the CPU cost of running the highlighter is sometimes more than the CPU cost and bandwidth of just returning the whole field. You'll have to experiment.
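As a rough sketch of the work-around above, the query string can be assembled in Python. The function name `build_preview_query` and its arguments are made up for illustration; the `hl.*` parameters, field names, and host are simply the ones from the example URL, so adjust them to your own schema.

```python
from urllib.parse import urlencode


def build_preview_query(query, hl_field, preview_field, max_chars, fields):
    """Build a query string for the highlighter work-around.

    hl_field is assumed to be a field that yields no highlight snippets,
    so Solr falls back to hl.alternateField, cut to max_chars characters.
    """
    if preview_field in fields:
        # Per the notes above: keep the alternate field out of fl,
        # otherwise the whole field is returned anyway.
        raise ValueError("preview field must not appear in fl")
    params = {
        "q": query,
        "fl": ",".join(fields),
        "hl": "true",
        "hl.snippets": 0,
        "hl.fl": hl_field,
        "hl.fragsize": 0,
        "hl.alternateField": preview_field,
        "hl.maxAlternateFieldLength": max_chars,
    }
    return urlencode(params)


# Same values as the example URL above.
url = "http://solr:8080/solr/select/?" + build_preview_query(
    query="*:*",
    hl_field="sku",
    preview_field="description",
    max_chars=50,
    fields=["author", "title"],
)
print(url)
```

For the scenario in the question you would presumably pass `preview_field="text"` and `max_chars=200`, with some non-matching field as `hl_field`. The preview then comes back in the highlighting section of the response rather than in the document list.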

I decided to turn my comment into an answer.

I would suggest that you don't store your text data in Solr/Lucene. Only index the data for searching and store a unique ID or URL to identify the document. The contents of the document should be fetched from a separate storage system.

Solr/Lucene are optimized for searches. They aren't your data warehouse or database, and they shouldn't be used that way. When you store more data in Solr than necessary, you negatively impact your entire search system. You bloat the size of indices, increase replication time between masters and slaves, replicate data that you only need a single copy of, and waste cache memory on document caches that should be leveraged to make search faster.

So, I would suggest 2 things.

First, optimally, remove the text storage entirely from your search index. Fetch the preview text and whole text from a secondary system that is optimized for holding documents, like a file server.

Second, sub-optimal, only store the preview text in your search index. Store the entire document elsewhere, like a file server.
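To make the first option concrete, here is a minimal sketch of the two-step lookup, assuming the search only returns IDs and the text lives elsewhere. Everything here is hypothetical: `fetch_previews` is an illustrative name, and a plain dict stands in for the external store (file server, database, key-value store).

```python
def fetch_previews(doc_ids, document_store, max_chars=200):
    """Return {doc_id: preview} for IDs that came back from a search.

    document_store is a stand-in for the external system; here it only
    needs a .get(doc_id) method that returns the text or None.
    """
    previews = {}
    for doc_id in doc_ids:
        text = document_store.get(doc_id)
        if text is not None:
            # Slice client-side so only the preview is passed further on.
            previews[doc_id] = text[:max_chars]
    return previews


# Hypothetical data: the dict plays the external store, the list plays
# a Solr response from an index that stores only the document IDs.
store = {123: "x" * 1500, 456: "short document"}
search_hits = [123, 456, 789]  # 789 is missing from the store
previews = fetch_previews(search_hits, store)
print({doc_id: len(text) for doc_id, text in previews.items()})
```

Whether the extra round trip is cheaper than storing the text in the index depends on the store, so this is worth measuring rather than assuming.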

My wish, which I suspect is shared by many sites, is to offer a snippet of text with each query response. That upgrades what the user sees from mere titles or equivalent. This is a normal (see Google as an example) and productive technique. Presently we cannot easily cope with sending the entire content body from Solr/Lucene into a web presentation program and creating the snippet there, together with many others in a set of responses, as that is a significant network, CPU, and memory hog (think of dealing with many multi-MB files).

The sensible thing is for Solr/Lucene to have a control for sending only the first N bytes of content upon request, thereby saving a lot of trouble in the field. Kludges with highlights and so forth are just that, and interfere with proper usage. We keep in mind that mechanisms feeding material into Solr/Lucene may not be parsing the files, so those feeders can't create the snippets.

You could add an extra field, e.g. excerpt / summary, that contains the first 200 characters of the text, and then return that field instead.
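A minimal sketch of that idea, assuming you control the code that feeds documents into Solr: fill the extra field at index time. The helper name `with_excerpt` and the field name `excerpt` are illustrative and would need a matching stored field in your schema.

```python
def with_excerpt(doc, source_field="text", excerpt_field="excerpt", max_chars=200):
    """Return a copy of doc with an extra field holding the first max_chars characters."""
    enriched = dict(doc)  # leave the caller's document untouched
    enriched[excerpt_field] = doc[source_field][:max_chars]
    return enriched


# Hypothetical document shaped like the one in the question.
doc = with_excerpt({"docId": 123, "text": "word " * 400})
print(len(doc["text"]), len(doc["excerpt"]))
```

The query then asks only for the short field, e.g. `fl=docId,excerpt`. Solr's `copyField` directive also accepts a `maxChars` attribute, which may give the same result from the schema alone; check the documentation for your version.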

LinkedIn real-time search: http://snaprojects.jira.com/browse/ZOIE

For storing big data: http://project-voldemort.com/

The technical posts on this site follow the CC BY-SA 4.0 license. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1
4 ACCEPTED 2011-01-28 19:41:33

solution2
3 2011-01-25 19:36:28

solution3
0 2017-01-28 15:38:17

solution4
0 2011-01-25 11:22:34

solution5
-2 2011-01-25 19:18:42