
How to store a base64-encoded string in Solr and search it with a plain-text query

I am using Solr to index and store some fields. One of them is defined as <field name="Content" type="string" indexed="true" stored="true" multiValued="true"/>, and its data is base64-encoded.

I want to search the content of that field using plain-text keywords. If I decode the base64 myself, I can find the keyword in the content. (This is similar to the Elasticsearch attachment field type, where you pass base64-encoded data and can then search within it.)

I am running the following query from the Solr admin browser, but it returns no results:

http://localhost:8983/solr/collection/select?q=Content%3A*English*&wt=json&indent=true

Solr does not know that your content is base64-encoded. Furthermore, a field with type="string" is not tokenized: the whole value is indexed as a single term.
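To see why the wildcard query finds nothing, here is a small Python sketch (standard library only). The sample sentence is made up for illustration; the point is that the keyword is simply not a substring of the encoded value that Solr holds.

```python
import base64

# Hypothetical sample content, chosen only for illustration.
plain = "This is English text"
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

# A `string` field indexes this encoded value verbatim as one term.
print(encoded)

# Content:*English* is effectively a substring match against that term.
print("English" in encoded)  # False

# The keyword only becomes visible after decoding.
print("English" in base64.b64decode(encoded).decode("utf-8"))  # True
```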

So you need to do some pre-processing, probably in a custom component somewhere in the indexing chain. If you just want to search the field, you probably don't need to store it (just index it), and you could use a custom UpdateRequestProcessor that does the base64 decoding.
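A custom UpdateRequestProcessor would live inside Solr. As a rough alternative, the same decoding can be done on the client before the document is sent. The sketch below is not the processor itself, just the equivalent transformation in Python. It assumes the decoded bytes are UTF-8 plain text, and the field name `Content_txt` and the helper `to_solr_doc` are hypothetical; the target field would need a tokenized text type in your schema.

```python
import base64
import json


def to_solr_doc(doc_id, encoded_values):
    """Build a document whose text field holds the decoded content.

    Assumes every value is base64-encoded UTF-8 text. `Content_txt` is a
    hypothetical field name for a tokenized text field.
    """
    decoded = [base64.b64decode(v).decode("utf-8") for v in encoded_values]
    return {"id": doc_id, "Content_txt": decoded}


# Hypothetical sample value, encoded here so the example is self-contained.
sample = base64.b64encode(b"This is English text").decode("ascii")

docs = [to_solr_doc("1", [sample])]

# JSON body that could be POSTed to the collection's /update handler.
payload = json.dumps(docs)
print(payload)
```

With the decoded text in a tokenized field, a plain term query such as Content_txt:english should be enough, with no wildcards needed.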

If you actually want to store the field, then the decoding needs to happen as the very first step of the indexing (analysis) pipeline, so you need a custom character filter (CharFilter) that runs before tokenization.
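The ordering matters because the tokenizer only sees what the character filter hands it. The following Python sketch is a toy model of that chain, not Solr code: `char_filter` and `tokenize` are hypothetical stand-ins for a custom base64-decoding character filter and for a tokenizer with lowercasing. The stored value stays base64, while the indexed terms come from the decoded text.

```python
import base64
import re


def char_filter(raw):
    """Stand-in for the custom character filter: base64 in, plain text out."""
    return base64.b64decode(raw).decode("utf-8")


def tokenize(text):
    """Stand-in for a tokenizer followed by a lowercase filter."""
    return re.findall(r"\w+", text.lower())


# The value as it would be stored: still base64 (sample text is made up).
stored_value = base64.b64encode(b"This is English text").decode("ascii")

# Decode first, then tokenize: the keyword ends up among the terms.
indexed_terms = tokenize(char_filter(stored_value))
print(indexed_terms)  # ['this', 'is', 'english', 'text']

# Tokenizing the raw base64 instead yields no useful terms.
wrong_order = tokenize(stored_value)
print("english" in wrong_order)  # False
```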

Unfortunately, neither component exists in the base distribution right now. You would have to write it in Java or, if you are using an UpdateRequestProcessor, in JavaScript.
