
How to Optimize JVM & GC through Load Testing

Edit: From the several extremely generous and helpful responses this question has already received, it is obvious to me that I didn't make an important part of this question clear when I asked it earlier this morning. The answers I've received so far are more about optimizing applications & removing bottlenecks at the code level. I am aware that this is way more important than trying to get an extra 3% or 5% out of your JVM!

This question assumes we've already done just about everything we could to optimize our application architecture at the code level. Now we want more, and the next place to look is at the JVM level and garbage collection; I've changed the question title accordingly. Thanks again!


We've got a "pipeline" style backend architecture where messages pass from one component to the next, with each component performing different processes at each step of the way.

Components live inside of WAR files deployed on Tomcat servers. Altogether we have about 20 components in the pipeline, living on 5 different Tomcat servers (I didn't choose the architecture or the distribution of WARs for each server). We use Apache Camel to create all the routes between the components, effectively forming the "connective tissue" of the pipeline.

I've been asked to optimize the GC and general performance of each server running a JVM (5 in all). I've spent several days now reading up on GC and performance tuning, and have a pretty good handle on what each of the different JVM options do, how the heap is organized, and how most of the options affect the overall performance of the JVM.

My thinking is that the best way to optimize each JVM is not to optimize it as a standalone. I "feel" (that's about as far as I can justify it!) that trying to optimize each JVM locally without considering how it will interact with the other JVMs on other servers (both upstream and downstream) will not produce a globally-optimized solution.

To me it makes sense to optimize the entire pipeline as a whole. So my first question is: does SO agree, and if not, why?

To do this, I was thinking about creating a LoadTester that would generate input and feed it to the first endpoint in the pipeline. This LoadTester might also have a separate "Monitor Thread" that would check the last endpoint for throughput. I could then do all sorts of processing where we check for average end-to-end travel time for messages, maximum throughput before faulting, etc.
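A minimal sketch of that generator/monitor design. Here an in-memory queue stands in for the real pipeline so the snippet is self-contained; in the real tester the generator would POST to the first endpoint and the monitor thread would poll the last one (those endpoints, and the message format, are assumptions):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class LoadTester {
    // Stand-in for the pipeline: each "message" is just its send timestamp.
    static final BlockingQueue<Long> pipeline = new LinkedBlockingQueue<>();
    static final AtomicLong received = new AtomicLong();
    static final AtomicLong totalLatencyNanos = new AtomicLong();
    static final int MESSAGES = 10_000;

    public static void main(String[] args) throws Exception {
        // Monitor thread: drains the "last endpoint", accumulating
        // end-to-end latency per message.
        Thread monitor = new Thread(() -> {
            try {
                for (int i = 0; i < MESSAGES; i++) {
                    long sentAt = pipeline.take(); // blocks until a message exits
                    totalLatencyNanos.addAndGet(System.nanoTime() - sentAt);
                    received.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        monitor.start();

        // Generator: replay the same input pattern every run, so the only
        // variable between experiments is the JVM configuration.
        long start = System.nanoTime();
        for (int i = 0; i < MESSAGES; i++) {
            pipeline.put(System.nanoTime()); // in reality: send to first endpoint
        }
        monitor.join();
        long elapsed = System.nanoTime() - start;

        System.out.printf("messages=%d%n", received.get());
        System.out.printf("throughput=%.0f msg/s%n", MESSAGES / (elapsed / 1e9));
        System.out.printf("avg end-to-end latency=%.3f ms%n",
                totalLatencyNanos.get() / (double) MESSAGES / 1e6);
    }
}
```

The important property is that the generator's input pattern is deterministic, so runs are comparable.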

The LoadTester would generate the same pattern of input messages over and over again. The variable in this experiment would be the JVM options passed to each Tomcat server's startup options. I have a list of about 20 different options I'd like to pass to the JVMs, and figured I could just keep tweaking their values until I found near-optimal performance.
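One caution with 20 options: the search space is the cartesian product of all candidate values, which explodes quickly, so in practice you'd vary one or two flags per run and hold the rest fixed. A sketch of generating that experiment matrix (the flags are real HotSpot options, but the values are illustrative, not recommendations):

```java
import java.util.ArrayList;
import java.util.List;

public class OptionGrid {
    // Cartesian product of candidate values, one list per JVM flag.
    static List<List<String>> product(List<List<String>> axes) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> axis : axes) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : result) {
                for (String value : axis) {
                    List<String> extended = new ArrayList<>(partial);
                    extended.add(value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> axes = List.of(
                List.of("-Xmx1g", "-Xmx2g"),
                List.of("-XX:+UseG1GC", "-XX:+UseParallelGC"),
                List.of("-XX:MaxGCPauseMillis=100", "-XX:MaxGCPauseMillis=200"));

        List<List<String>> runs = product(axes);
        System.out.println(runs.size() + " configurations"); // 2 * 2 * 2 = 8
        for (List<String> run : runs) {
            // Each line becomes the CATALINA_OPTS for one load-test iteration.
            System.out.println(String.join(" ", run));
        }
    }
}
```

Even three flags with two values each means eight full load-test runs, which is why a week's budget forces you to prune the option list aggressively.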

This may not be the absolute best way to do this, but it's the best way I could design with what time I've been given for this project (about a week).

Second question: what does SO think about this setup? How would SO create an "optimizing solution" any differently?

Last but not least, I'm curious as to what sort of metrics I could use as a basis for measurement and comparison. I can really only think of:

  • Find the JVM option config that produces the fastest average end-to-end travel time for messages
  • Find the JVM option config that produces the largest volume throughput without crashing any of the servers

Any others? Any reasons why those 2 are bad?
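One caveat on the first metric: an average can hide exactly the thing GC tuning targets, namely occasional long stop-the-world pauses. Recording per-message latencies and reporting percentiles alongside the mean makes those spikes visible. A small sketch with made-up numbers:

```java
import java.util.Arrays;

public class LatencyStats {
    // Nearest-rank percentile over a sample of per-message latencies.
    static double percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int index = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(index, 0)];
    }

    public static void main(String[] args) {
        // 98 fast messages plus two that got caught behind a long GC pause.
        long[] latencies = new long[100];
        Arrays.fill(latencies, 10);
        latencies[57] = 900;
        latencies[58] = 900;

        double mean = Arrays.stream(latencies).average().orElse(0);
        System.out.printf("mean=%.1f ms  p50=%.0f ms  p99=%.0f ms%n",
                mean, percentile(latencies, 50), percentile(latencies, 99));
        // prints: mean=27.8 ms  p50=10 ms  p99=900 ms
        // The mean looks tolerable; the p99 exposes the pause.
    }
}
```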

After reviewing the responses I could see how this might be construed as a monolithic question, but really what I'm asking is how SO would optimize JVMs running along a pipeline, and to feel free to cut and dice my solution however you like.

Thanks in advance!

Let me go up a level and say I did something similar in a large C app many years ago. It consisted of a number of processes exchanging messages across interconnected hardware. I came up with a two-step approach.

Step 1. Within each process, I used this technique to get rid of any wasteful activities. That took a few days of sampling, revising code, and repeating. The idea is there is a chain, and the first thing to do is remove inefficiencies from the links.

Step 2. This part is laborious but effective: Generate time-stamped logs of message traffic. Merge them together into a common timeline. Look carefully at specific message sequences. What you're looking for is

  1. Was the message necessary, or was it a retransmission resulting from a timeout or other avoidable reason?
  2. When was the message sent, received, and acted upon? If there is a significant delay between being received and acted upon, what is the reason for that delay? Was it just a matter of being "in line" behind another process that was doing I/O, for example? Could it have been fixed with different process priorities?
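The merging step above can be sketched as follows. Assume each component writes records of the form (timestamp, component, event, messageId); merging them into one timeline and flagging received-to-acted gaps is then mechanical. The log format, component names, and threshold here are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TimelineMerge {
    record Entry(long ts, String component, String event, String msgId) {}

    // Merge per-component logs into one timeline ordered by timestamp,
    // then compute the delay between "received" and "acted" for every
    // component/message pair.
    static Map<String, Long> delays(List<List<Entry>> perComponentLogs) {
        List<Entry> timeline = new ArrayList<>();
        perComponentLogs.forEach(timeline::addAll);
        timeline.sort(Comparator.comparingLong(Entry::ts));

        Map<String, Long> receivedAt = new HashMap<>();
        Map<String, Long> result = new LinkedHashMap<>();
        for (Entry e : timeline) {
            String key = e.component() + "/" + e.msgId();
            if (e.event().equals("received")) {
                receivedAt.put(key, e.ts());
            } else if (e.event().equals("acted") && receivedAt.containsKey(key)) {
                result.put(key, e.ts() - receivedAt.get(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> parser = List.of(
                new Entry(100, "parser", "received", "m1"),
                new Entry(105, "parser", "acted", "m1"));
        List<Entry> enricher = List.of(
                new Entry(110, "enricher", "received", "m1"),
                new Entry(400, "enricher", "acted", "m1")); // suspicious gap

        delays(List.of(parser, enricher)).forEach((key, ms) ->
                System.out.println(key + " took " + ms + " ms from received to acted"));
    }
}
```

A delay like the enricher's 290 ms is the kind of thing to chase: was the message queued behind I/O, starved of a thread, or waiting out a GC pause?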

This activity took me about a day to generate logs, combine them, find a speedup opportunity, and revise code. At this rate, after about 10 working days, I had found/fixed a number of problems, and improved the speed dramatically.

What is common about these two steps is I'm not measuring or trying to get "statistics". If something is spending too much time, that very fact exposes it to a diligent programmer taking a close, meticulous look at what is happening.

I would start with finding the optimum recommended JVM values specified for your hardware/software mix, OR just start with what is already out there.

Next I would make sure that I have monitoring in place to measure business throughputs and SLAs.

I would not try to tweak just the GC if there is no reason to.

First you will need to find the major bottlenecks in your application: is it I/O-bound, SQL-bound, etc.?

Key here is to MEASURE, IDENTIFY TOP bottlenecks, FIX them and conduct another iteration with a repeatable load.

HTH...

The biggest trick I am aware of when running multiple JVMs on the same machine is limiting the number of cores the GC will use. Otherwise, what can happen when one JVM does a full GC is that it will attempt to grab every core, impacting the performance of all the JVMs even though they are not performing a GC. One suggestion is to limit the number of GC threads (e.g. via -XX:ParallelGCThreads) to 5/8 of the cores or fewer. (I can't remember where it is written.)


I think you should test the system as a whole to ensure you have realistic interaction between the services. However, I would assume you may need to tune each service differently.

Changing command line options is useful if you cannot change the code. However, if you profile and optimise the code you can make far more difference than by tuning the GC parameters (in which case you will need to tune them again anyway).

For this reason, I would only change the command line parameters as a last resort, once there is little improvement left to be made in the code of the application.
