Confluence 4.0 : Garbage Collector Performance Issues
This page last changed on Aug 23, 2011 by edawson.
Summary
See Configuring System Properties for how to add these properties to your environment.
Background
Performance problems in Confluence, and in rarer circumstances in JIRA, generally manifest themselves either as out-of-memory errors or as periods of extreme unresponsiveness.
There is a wealth of simple tips and tricks that can be applied to Confluence that can have a significant, tangible benefit to the long-term stability, performance and responsiveness of the application.
Why Bad Things Happen
Confluence can be thought of as a gel or a glue: a tool for bringing things together. Multiple applications, data types, social networks and business requirements can be efficiently amalgamated, leading to more effective collaboration. The real beauty of Confluence, however, is its agility in moulding itself to your organization's DNA (your existing business and cultural processes) rather than the other way around, with your organization having to adapt to how the software product works. The flip side of this flexibility is that many competing demands are placed on Confluence by its users. The result is an extraordinarily broad and deep set of functions whose behaviour cannot, practically, be predicted for individual use cases. The best mechanism to protect the installation is to place Confluence on a foundation where it is fundamentally more resilient and better able to react and cope with competing user requirements.
Appreciate how Confluence and the Java JVM use memory
The Java memory model is naive. Compared to a Unix process, which has four intensive decades of development behind its time-slicing, inter-process communication and intelligent deadlock avoidance, the Java thread model has ten years at best under its belt. Because Java runs on a virtual machine, idiosyncrasies of the platform Confluence runs on can also influence how the JRE behaves. As a result it is sometimes necessary to tune the JVM parameters to give it a "hint" about how it should behave. There are circumstances in which the JVM will take a mediocre option with respect to resource contention and allocation, and struggle along with often highly impractical goals. For example, the JRE will be quite happy to perform at 5 or 10% of optimum capacity if that means overall application stability and integrity can be ensured.
This often translates into periods of extreme sluggishness, which effectively means that the application is neither stable nor accessible. Java deliberately avoids making assumptions about what kind of runtime behaviour an application needs; its charter is to assume 'business as usual' across a wide range of scenarios and to react only in dire circumstances.
Memory is contiguous
The Java memory model requires that heap memory be allocated in a contiguous block. This is because the heap has a number of side data structures which are indexed by a scaled offset (i.e. n*512 bytes) from the start of the heap; for example, updates to references on objects within the heap are tracked in these side data structures. Consider the differences between:
- Allocated memory: a fully backed, memory-mapped physical allocation to the application. The application now owns that segment of memory.
- Reserved memory (the difference between Xms and Xmx): memory which is reserved for use but not physically mapped (or backed). This means that, for example, in the 4 GB address space of a 32-bit system, the reserved segment can be used by other applications; but because Java requires contiguous memory, if the reserved memory it requests is occupied, the OS must swap that memory out of the reserved space, either to another unused segment or, more painfully, to disk.
Permanent Generation memory is also contiguous. The net effect is that even if the system has vast quantities of cumulative free memory, Confluence demands contiguous blocks, and undesirable swapping may occur if free segments of the requested size do not exist. See Causes of OutOfMemoryErrors for more details. Be sure to position Confluence within a server environment that can satisfy all of the competing requirements: the operating system, contiguous memory, other applications, swap, and Confluence itself.
Figure out which (default) collector implementation your vendor is using
Default JVM vendor implementations are subtly different, but in production can differ enormously. The Oracle JVM by default splits the heap into three spaces: the Young (Eden) space, the Tenured (Old) space, and the Permanent Generation.
Objects are central to the operation of Confluence. When a request is received, the Java runtime creates new objects to fulfil the request in the Eden space. If, after some time, those objects are still required, they may be moved to the Tenured (Old) space. Typically, though, the overwhelming majority of objects created die young, within the Eden space: objects like method-local references within a while or for loop, or Iterators for scanning through Collections or Sets. In IBM J9, by contrast, the default policy is a single contiguous space: one large heap. The net effect is that for large WebSphere environments, garbage collection can be terribly inefficient, and capable of suffering outages during peak periods.
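To find out at runtime which collector implementations a given JVM has actually selected, the standard java.lang.management API can be queried. This is a minimal sketch (the class name ShowCollectors is our own); the names printed vary by vendor, version and flags, e.g. "PS Scavenge" and "PS MarkSweep" for the parallel collector on HotSpot.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class ShowCollectors {
    public static void main(String[] args) {
        // Ask the running JVM which garbage collector implementations are active.
        List<GarbageCollectorMXBean> beans =
                ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean bean : beans) {
            // Name of the collector plus how many times it has run so far.
            System.out.println(bean.getName()
                    + " (collections so far: " + bean.getCollectionCount() + ")");
        }
    }
}
```

Running this under the same flags as your Confluence instance confirms which of the spaces above are collected by which algorithm.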
Use the Parallel Garbage Collector
Out of the box, with Oracle Java defaults, Confluence uses the serial garbage collector on the full (Tenured) heap. The Young space is collected in parallel, but the Tenured space is not. This means that if a full collection event occurs under load, all application threads other than the garbage collector thread are taken off the CPU, since the event is a 'stop-the-world' serial event. This can have severe consequences if requests continue to accrue during these 'outage' periods. As a rough guide, for every gigabyte of memory allocated, allow a full second (exclusive) to collect. If we parallelize the collector on a multi-core/multi-CPU architecture, we not only reduce the total time of collection (down from whole seconds to fractions of a second) but also improve the resiliency of the JRE in recovering from high-demand occasions. Additionally, Oracle provides the Concurrent Mark-Sweep (CMS) collector, which performs most of its collection work concurrently with the application in exchange for some throughput.
Restrict the ability of Tomcat to 'cache' incoming requests
Quite often the fatal blow is struck by the backlog of web requests accumulated while some critical resource (say, the index) is held hostage by a temporary, expensive job. Even if the instance is busy garbage collecting due to load, Tomcat will still accept new HTTP requests and queue them internally, and the operating system beneath it also buffers incoming requests in the socket for Tomcat to pick up the next time it gets the CPU.

<Connector port="8090" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           useBodyEncodingForURI="true" enableLookups="false"
           redirectPort="8443" acceptCount="100"
           connectionTimeout="20000" disableUploadTimeout="true"/>

Here the Tomcat Connector is configured for 150 "maxThreads" with an "acceptCount" of 100.
This means up to 150 threads will awaken to accept (but, importantly, not necessarily to complete) web requests during performance outages, and a further 100 will be held in a queue for processing when threads become available. That is 250 threads, many of which can be quite expensive in and of themselves. Java will attempt to juggle all of these threads concurrently and become extremely inefficient at doing so, exacerbating the garbage collection performance issue. Resolution: reduce maxThreads and acceptCount to a number of requests the server can realistically service concurrently.
Disable remote (distributed) garbage collection by Java clients
Many clients integrate third-party or their own custom applications to interrogate, or add content to, Confluence via its RPC interface. The Distributed Remote Garbage Collector in the client uses RMI to trigger a remote GC event in the Confluence server, forcing periodic full collections regardless of need. This can be disabled on the server by telling the JVM to ignore explicit GC requests (on HotSpot, the -XX:+DisableExplicitGC flag).
Virtual Machines are Evil
VMware virtual machines, whilst extremely convenient, also cause particular problems for Java applications: it is very easy for host operating system resource constraints, such as temporary memory shortfalls or I/O swapping, to cascade into the Java VM and manifest as extremely unusual, frustrating and seemingly illogical problems. We already document some disk I/O metrics with VMware images. Although we now officially support the use of virtual instances, we absolutely do not recommend them unless they are maintained correctly. This is not to say that VMware instances cannot be used, but they must be used with due care, proper maintenance and configuration. Indeed, if you are reading this document because of poor performance, one of the first actions should be to remove any virtualization: emulation will never beat the real thing, and it always introduces more black-box variability into the system.
Use Java 1.6
Java 1.6 is generally regarded, via public discussion, to have an approximate 20% performance improvement over 1.5.
Our own internal testing found this statistic credible. Java 1.6 is compatible with all supported versions of Confluence, and we strongly recommend that installations not yet on 1.6 migrate.
Use the -server flag
The HotSpot server JVM has specific code-path optimizations which yield an approximate 10% gain over the client version. Most installations will already have this selected by default, but it is still wise to force it with -server, especially on some Windows machines. If using a 64-bit JRE for larger heaps, make sure the JVM is actually started in 64-bit server mode.
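Pulling the flags above together, a hypothetical startup configuration for a HotSpot JVM under Tomcat might look like the following sketch. The heap sizes are illustrative only, and the specific flags (-XX:+UseParallelOldGC, -XX:+DisableExplicitGC, -XX:+UseCompressedOops) are assumptions to be validated against your Java version, not values prescribed by this page.

```shell
# Illustrative sketch only: validate every flag against your own JVM version.
# Setting Xms equal to Xmx commits the whole (contiguous) heap up front,
# avoiding the reserved-memory swapping described earlier.
CATALINA_OPTS="-server \
 -Xms2048m -Xmx2048m \
 -XX:+UseParallelGC \
 -XX:+UseParallelOldGC \
 -XX:+DisableExplicitGC \
 -XX:+UseCompressedOops"
export CATALINA_OPTS
```

-XX:+UseParallelOldGC parallelizes collection of the Tenured space, and -XX:+UseCompressedOops applies only to 64-bit JREs.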
Atlassian recommends, at the very least, getting VisualVM up and running (you will need JMX enabled), and adding access and garbage collection logging.
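As a sketch of such a configuration (the JMX port 8099 and the log path are hypothetical, and no JMX authentication or SSL is shown, so this is only suitable behind a firewall), using the pre-Java-9 HotSpot logging flags:

```shell
# Hypothetical example: enable remote JMX for VisualVM and verbose GC logging.
# Port, log path and the absence of JMX auth/SSL are assumptions, not advice.
CATALINA_OPTS="$CATALINA_OPTS \
 -Dcom.sun.management.jmxremote \
 -Dcom.sun.management.jmxremote.port=8099 \
 -Dcom.sun.management.jmxremote.authenticate=false \
 -Dcom.sun.management.jmxremote.ssl=false \
 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
 -Xloggc:logs/gc.log"
```

The resulting gc.log shows how often Young and full collections occur and how long each pause lasts, which is the evidence needed before any of the tuning below.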
The JVM will generally collect the full heap only when it has no other alternative, because of the relative size of the Tenured space (it is typically larger than the Young space) and the natural probability that objects within Tenured are not eligible for collection, i.e. they are still alive.
Some installations can trundle along only ever collecting in the Young space. As time goes on, some objects will survive the initial Young collection and be promoted to Tenured. At some point they will be dereferenced and no longer reachable from the live object graph. However, the occupied memory will still be held in limbo as "dead" memory until a collection occurs in the Tenured space to clear and compact it.
It is not uncommon for moderately sized Confluence installations to reclaim as much as 50% of the current heap size on a full collection; this is because full collections occur so infrequently. Reducing the occupancy-fraction trigger for the heap means that more memory is available at any given time, and that fewer expensive swapping and collection events will occur during the busy hour.
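If the CMS collector is in use, the occupancy trigger can be lowered so that concurrent collections start earlier. A sketch, where 70 is an illustrative value rather than a recommendation:

```shell
# Assumes the CMS collector; start concurrent Tenured collections once the
# space is 70% occupied instead of waiting until it is nearly exhausted.
CATALINA_OPTS="$CATALINA_OPTS \
 -XX:+UseConcMarkSweepGC \
 -XX:CMSInitiatingOccupancyFraction=70 \
 -XX:+UseCMSInitiatingOccupancyOnly"
```

Without -XX:+UseCMSInitiatingOccupancyOnly the JVM treats the fraction only as an initial hint and adjusts it over time.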
Atlassian would classify tuning the frequency of collections as an advanced topic for further experimentation; it is described here for informational purposes only, and unfortunately it is impractical for Atlassian to support these kinds of changes in general.
Atlassian has a number of high-profile and some extremely demanding, mission-critical clients who have, usually through trial and error, successfully applied these recommendations to production instances and significantly improved them. For more information, please file a support case at support.atlassian.com.
Document generated by Confluence on Sep 19, 2011 02:42