About PATH in Alfresco FTS queries

I'm using Alfresco 4.1.6 and SOLR 1.4.

For search, I use fts_alfresco_language and the searchService.query method.

In my queries I search by PATH, TYPE, and some custom properties such as direction, telephone, mail, or similar.

I now have over 2 million documents, and search performance is noticeably worse than it was at the beginning.

I have read that with Solr 1.4, using PATH in the query is a bad idea, and that it is better to avoid it and use only TYPE and property key/value pairs.

But I have two questions:

  1. Why does PATH increase the response time? Shouldn't it help? I have over 1000 main folders at the root of the repository. If I specify the folder that Solr should search, why doesn't that filter the results, instead of giving me a worse response time than when I don't specify it? Or is there another way to tell Solr the main folder, so that it reduces the results first and then runs the rest of the query?

  2. When I search by custom properties, I use 3 or 4 properties, all indexed. Do these combined lookups have a higher overhead than a single one? Would it be better to search by only one property rather than all 3? Or to use ORs instead of ANDs to get results faster? How does Solr work here?

Thanks!

First, let me start with this: I'm not sure what you want from this question, because it's vague. You're not asking how to make your query better; you're asking why a bad practice (one known for bad performance) performs badly for you.

Do some research on how to structure your ECM system. The first thing that makes an ECM any good is a proper content model. There are books out there that will help you.

If you're structuring your content with folders (PATH) and these are important to you, then you need to add them as metadata to your content. If you haven't done that, you should start there.
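As a rough sketch of what "adding the folder as metadata" could mean: derive the name of the main folder from the path of the folder that contains the document, and store that value in an indexed custom property. The helper name `main_folder` and the root `/Company Home` are assumptions made for illustration; how you write the value onto the node depends on your content model.

```python
def main_folder(folder_path: str, root: str = "/Company Home") -> str:
    """Return the name of the first folder below `root`.

    `folder_path` is the display path of the folder that contains the
    document. Returns "" when the path is the root itself or lies
    outside of it.
    """
    if folder_path != root and not folder_path.startswith(root + "/"):
        return ""  # not inside the assumed root
    relative = folder_path[len(root):].strip("/")
    # The first path segment below the root is the "main folder".
    return relative.split("/", 1)[0] if relative else ""


if __name__ == "__main__":
    print(main_folder("/Company Home/Acme/2013/Invoices"))
```

The returned value (here the hypothetical folder name `Acme`) is what you would store as metadata, so that later queries can filter on the property instead of on PATH.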

With a good content model you can find content wherever it's placed within your ECM system. Sure, it's easy to migrate a filesystem to an ECM system and just leave it there, but then you've done only half the work.

Path queries are slow in general because they use a loop pattern, which is expensive. This has been greatly improved in the newer Solr, but it still isn't as fast as normal metadata querying.
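To make the difference concrete, here is a minimal sketch that assembles both query shapes as FTS strings. Python is used only to build the strings; the same string would be passed to searchService.query. The `my:` type and property names (`my:document`, `my:mainFolder`, `my:telephone`) and the folder `cm:Acme` are hypothetical and stand in for whatever your content model defines.

```python
def fts_term(field: str, value: str) -> str:
    """Format one field:"value" clause, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def build_query(doc_type: str, properties: dict, path: str = None) -> str:
    """Join an optional PATH clause, a TYPE clause and property clauses with AND."""
    clauses = []
    if path is not None:
        clauses.append(fts_term("PATH", path))
    clauses.append(fts_term("TYPE", doc_type))
    clauses.extend(fts_term(name, value) for name, value in properties.items())
    return " AND ".join(clauses)


# Shape from the question: restrict by folder with PATH.
with_path = build_query(
    "my:document",
    {"my:telephone": "555-0100"},
    path="/app:company_home/cm:Acme//*",
)

# Shape suggested here: the folder is ordinary metadata, no PATH clause.
without_path = build_query(
    "my:document",
    {"my:mainFolder": "Acme", "my:telephone": "555-0100"},
)

if __name__ == "__main__":
    print(with_path)
    print(without_path)
```

The second form only makes sense once the folder value is actually stored on each document as an indexed property; whether it is faster in your installation is something to measure rather than assume.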
