
Optimal Solr JVM/Virtual/Physical Memory Configuration

Our company has several different ways of getting leads, and several types of leads we deal with. There are only slight differences between the lead types, and much of the information is shared with or related to one or more other lead types. My team and I are trying to build and configure a Solr index that handles each of these lead types and all their shared data: customer data, resort data, etc. (around 1.2 million records in all). We're currently hosting an Ubuntu server (12 GB RAM, 8-core Opteron) running Tomcat 6 and Solr 3.4.

I'd like the index to add records in real time when a customer submits a lead-gen form on our website (around 1500-2000 daily), as well as update when employees add or modify data (around 2500-3000 times daily).

In addition, I need customers on the website and employees in house to be able to quickly search this data with filters, facets, auto-complete, highlighting and everything else one has come to expect from a well-written search.

This setup is currently functioning, but it often hangs when updating records, both on the website and in our internal apps. Commits are done every 1000 documents or 5 seconds, and I optimize once daily. What are the optimal JVM, server or Solr configurations for this type of setup? Any help would be appreciated, and I can provide as much information as needed to anyone willing to help.

First, you should not optimize.

There are two common errors when configuring the JVM heap size in Solr:

  • giving too much memory to the JVM (the OS is then left with too little memory to cache disk operations),
  • giving too little memory to the JVM (this puts a lot of pressure on the garbage collector, which is forced to run frequent stop-the-world collections; use JMX monitoring to find out whether full GCs are being triggered).
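
As a rough illustration of the JMX check mentioned above, here is a minimal sketch that samples the standard garbage collector MXBeans twice and prints how many collections happened in between. The class name `GcSnapshot` and the 10-second interval are my own choices, not anything from your setup. Note that it reports on the JVM it runs in, so to observe Solr you would have to run this logic inside the Tomcat JVM or read the same beans over a remote JMX connection (or simply use a tool such as jconsole).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named for this example only.
class GcSnapshot {
    /** Returns collector name -> {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value may be -1 if the collector does not report it.
            stats.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, long[]> before = snapshot();
        Thread.sleep(10_000); // sampling interval: an arbitrary choice for this sketch
        Map<String, long[]> after = snapshot();
        for (Map.Entry<String, long[]> entry : after.entrySet()) {
            long[] b = before.getOrDefault(entry.getKey(), new long[] {0, 0});
            long[] a = entry.getValue();
            System.out.printf("%s: +%d collections, +%d ms%n",
                    entry.getKey(), a[0] - b[0], a[1] - b[1]);
        }
    }
}
```

Which bean corresponds to full collections depends on the collector in use (the old-generation bean is the one to watch), so check the names printed on your own JVM rather than relying on a fixed list.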

Another reason why your application may hang is background merges. Lucene is based on segments, and whenever the number of segments gets higher than mergeFactor, a merge is triggered. A low value of mergeFactor might explain the hangs.
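
To get a feel for how mergeFactor affects merge frequency, here is a small simulation. It is a deliberately simplified level-based model (every flush creates one small segment, and mergeFactor segments of the same level are merged into one segment of the next level), not Lucene's actual merge policy. The class and method names are made up for this example.

```java
// Hypothetical toy model, not Lucene code.
class MergeSimulation {
    /** Counts the merges triggered by `flushes` segment flushes in the simplified model. */
    static int countMerges(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        int[] segmentsAtLevel = new int[64]; // far more levels than an int flush count can reach
        int merges = 0;
        for (int i = 0; i < flushes; i++) {
            segmentsAtLevel[0]++; // each flush adds one small segment
            // Cascade: mergeFactor segments of one level become one segment of the next level.
            for (int level = 0; segmentsAtLevel[level] >= mergeFactor; level++) {
                segmentsAtLevel[level] -= mergeFactor;
                segmentsAtLevel[level + 1]++;
                merges++;
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 5, 10, 20}) {
            System.out.println("mergeFactor=" + mergeFactor
                    + " -> merges after 1000 flushes: " + countMerges(1000, mergeFactor));
        }
    }
}
```

In this model, 1000 flushes trigger 994 merges with mergeFactor 2 but only 111 with mergeFactor 10. The real policy also weighs segment sizes, so treat the numbers as a trend rather than a prediction.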

You should give more details on your current setup so that we can help you:

  • JVM size,
  • which collector you are using (G1, throughput collector, concurrent low-pause collector, ...),
  • index size (on disk, not the number of documents),
  • mergeFactor, ramBufferSizeMB, ...
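
If it helps with collecting the first three items, here is a minimal sketch that prints the maximum heap size, the names of the active collectors, and the size of a directory on disk. The class name `SetupReport` and the default path `data/index` are assumptions for illustration; pass your real index directory as the first argument. The heap and collector values describe the JVM the code runs in, so start it with the same JVM options as Tomcat or run the equivalent logic inside the Solr webapp.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper class, named for this example only.
class SetupReport {
    /** Sums the sizes of all regular files below `dir`. */
    static long directorySizeBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    // A file may vanish mid-walk, for example when a merge deletes old segments.
                    throw new UncheckedIOException(e);
                }
            }).sum();
        }
    }

    public static void main(String[] args) throws IOException {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MB): " + maxHeapMb);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
        // "data/index" is only a placeholder default, not necessarily your layout.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "data/index");
        System.out.println("Index size (MB): " + directorySizeBytes(indexDir) / (1024 * 1024));
    }
}
```

The mergeFactor and ramBufferSizeMB values are not visible from the JVM this way; those have to be read from solrconfig.xml.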

