DynamicFields in Solr

In my current project I need to index all e-mails and their attachments from multiple mailboxes.

I will use Solr, but I don't know the best way to structure my index. My first approach was:

<fields>
  <field name="id" required="true"/>
  <field name="uid" required="true"/>
  <!-- A lot of other fields -->
  <dynamicField name="attachmentName_*" required="false"/>
  <dynamicField name="attachmentBody_*" required="false"/>
</fields>

But now I am not sure this is the best structure. I don't think I can search for one term (e.g. stackoverflow) and find out which field it matched in (e.g. attachmentBody_1, _2, _3, etc.) with a single query.

Does anyone have a better suggestion for my index structure?

You can use multiValued fields for attachmentName and attachmentBody, so you would have two regular fields instead of dynamic fields. You can then use highlighting to bring back the specific values that match, with surrounding context.
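
As a rough sketch of the multiValued approach, the schema could look something like the fragment below. The field types "string" and "text_general" are assumptions (they are not in the question and must be defined elsewhere in your schema), and stored="true" is included because highlighting needs the original text.

```xml
<!-- Sketch only: the types "string" and "text_general" are assumed
     to be declared elsewhere in the schema. -->
<fields>
  <field name="id"  type="string" indexed="true" stored="true" required="true"/>
  <field name="uid" type="string" indexed="true" stored="true" required="true"/>
  <!-- One value per attachment, instead of attachmentName_1, _2, ... -->
  <field name="attachmentName" type="string"       indexed="true" stored="true" multiValued="true"/>
  <field name="attachmentBody" type="text_general" indexed="true" stored="true" multiValued="true"/>
</fields>
```

A single query such as q=attachmentBody:stackoverflow&hl=true&hl.fl=attachmentBody should then return highlighted snippets from the attachment bodies that matched.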

Another option would be to make each attachment a separate document and store something that identifies which email it belongs to. The downside of this approach is that you may need to index data from the email itself several times, but this is only a real problem if most of the email messages have more than one attachment.
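
To illustrate the one-document-per-attachment idea, an update message might look roughly like this. The emailId field, the id values, and the sample contents are all hypothetical and only show the shape of the data.

```xml
<!-- Sketch only: "emailId" is a hypothetical field linking each
     attachment document back to its email; all values are made up. -->
<add>
  <doc>
    <field name="id">email-1-attachment-1</field>
    <field name="emailId">email-1</field>
    <field name="attachmentName">report.pdf</field>
    <field name="attachmentBody">text extracted from the first attachment</field>
  </doc>
  <doc>
    <field name="id">email-1-attachment-2</field>
    <field name="emailId">email-1</field>
    <field name="attachmentName">notes.txt</field>
    <field name="attachmentBody">text extracted from the second attachment</field>
  </doc>
</add>
```

With this layout each hit is already a single attachment, and result grouping (group=true&group.field=emailId) could be used to collapse hits back to one entry per email.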

I found one possible solution. All I need to do is set attachmentBody as stored.

This solution is not ideal, because the index size will increase dramatically. In my case that is not a problem, since I will implement the highlighting feature too, and those fields need to be stored for that anyway.
