Optimal Solr JVM/Virtual/Physical Memory Configuration

Original 2012-06-07 15:35:07 2 1 search/ ubuntu/ solr/ jvm

Our company has several different ways of getting leads, and several types of leads we deal with. There are only slight differences between the lead types, and much of the information is shared with or related to one or more other lead types. My team and I are trying to build and configure a Solr index that handles each of these lead types and all their shared data: customer data, resort data, etc. (around 1.2 million records in all). We're currently hosting an Ubuntu server (12 GB RAM, 8-core Opteron) running Tomcat 6 and Solr 3.4.

I'd like the index to add records in real time when a customer submits a lead-gen form on our website (around 1500-2000 daily), as well as update when employees add or modify data (around 2500-3000 times daily).

In addition, I need customers on the website and employees in-house to be able to quickly search this data with filters, facets, auto-completes, highlighting and everything else one has come to expect from a well-written search.

This setup is currently functioning, but it often hangs while updating records, both on the website and in our internal apps. Commits are done every 1000 documents or 5 seconds, and I optimize once daily. What are the optimal JVM, server or Solr configurations for this type of setup? Any help would be appreciated, and I can provide as much information as needed to anyone willing to help.

1 answer

First, you should not optimize.

There are two common errors when configuring the JVM heap size in Solr:

giving too much memory to the JVM (the OS cache won't be able to cache disk operations),
not giving enough memory to the JVM (there will be a lot of pressure on the garbage collector, which will be forced to run frequent stop-the-world collections; use JMX monitoring to figure out whether full GCs get triggered).
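
As a rough illustration of the JMX check mentioned above, here is a minimal sketch that reads the garbage collector MXBeans of the JVM it runs in. The class name `GcSnapshot` and the method `describeCollectors` are made up for this example. To watch the Tomcat JVM that hosts Solr you would normally read the same beans remotely (for instance with jconsole), which assumes remote JMX has been enabled for that process.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcSnapshot {
    /** Returns one line per collector: name, collections so far, accumulated time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for this collector.
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print a snapshot; sample it repeatedly to see how fast the counters grow.
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

If the count of the old-generation collector climbs quickly while updates hang, the heap is probably too small; if it barely moves, the hangs more likely come from somewhere else.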

Another reason why your application may hang is background merges. Lucene is based on segments, and whenever the number of segments gets higher than mergeFactor, a merge is triggered. A low value of mergeFactor might explain the hangs.
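
To see why a low mergeFactor means more merge work, here is a deliberately simplified model, not Lucene's actual merge policy code. It assumes every flush creates one small segment and that as soon as mergeFactor segments exist on one level, they are merged into a single segment on the next level. The class `MergeModel` and the method `countMerges` are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeModel {
    /** Counts the merges triggered by the given number of flushes in the simplified model. */
    static int countMerges(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        // perLevel.get(i) is the number of segments currently on level i.
        List<Integer> perLevel = new ArrayList<Integer>();
        int merges = 0;
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            while (true) {
                if (perLevel.size() <= level) {
                    perLevel.add(0);
                }
                perLevel.set(level, perLevel.get(level) + 1);
                if (perLevel.get(level) < mergeFactor) {
                    break; // not enough segments on this level to merge yet
                }
                perLevel.set(level, 0); // the merged segments disappear...
                merges++;
                level++;                // ...and one bigger segment lands on the next level
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        for (int mf : new int[] {2, 10, 30}) {
            System.out.println("mergeFactor=" + mf
                    + " -> merges after 1000 flushes: " + countMerges(1000, mf));
        }
    }
}
```

In this model, 1000 flushes trigger 994 merges with a mergeFactor of 2, 111 with 10 and 34 with 30. Real merges also differ in size and run in the background, so treat the numbers only as a way to see the trend.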

You should give more details on your current setup so that we can help you:

JVM size,
which collector you are using (G1, throughput collector, concurrent low pause collector, ...),
index size (on disk, not the number of documents),
mergeFactor, ramBufferSizeMB, ...

solution1 4 2012-06-12 23:22:59
