JVM调优的"标准参数"的各种陷阱[转]
(截取部分)
1、-XX:+DisableExplicitGC 与 NIO的direct memory
很多人都见过JVM调优建议里使用这个参数,对吧?但是为什么要用它,什么时候应该用而什么时候用了会掉坑里呢?
首先要了解的是这个参数的作用。在Oracle/Sun JDK这个具体实现上,System.gc()的默认效果是引发一次stop-the-world的full GC,对整个GC堆做收集。有几个参数可以改变默认行为,之前发过一帖简单描述过,这里就不重复了。关键点是,用了-XX:+DisableExplicitGC参数后,System.gc()的调用就会变成一个空调用,完全不会触发任何GC(但是“函数调用”本身的开销还是存在的哦~)。
为啥要用这个参数呢?最主要的原因是为了防止某些手贱的同学在代码里到处写System.gc()的调用而干扰了程序的正常运行吧。有些应用程序 本来可能正常跑一天也不会出一次full GC,但就是因为有人在代码里调用了System.gc()而不得不间歇性被暂停。也有些时候这些调用是在某些库或框架里写的,改不了它们的代码但又不想 被这些调用干扰也会用这参数。
OK。看起来这参数应该总是开着嘛。有啥坑呢?
其中一种情况是下述三个条件同时满足时会发生的:
1、应用本身在GC堆内的对象行为良好,正常情况下很久都不发生full GC;
2、应用大量使用了NIO的direct memory,经常、反复的申请DirectByteBuffer
3、使用了-XX:+DisableExplicitGC
能观察到的现象是:
import java.nio.*; public class DisableExplicitGCDemo { public static void main(String[] args) { for (int i = 0; i < 100000; i++) { ByteBuffer.allocateDirect(128); } System.out.println("Done"); } }
然后编译、运行之:
/** * General-purpose phantom-reference-based cleaners. * * <p> Cleaners are a lightweight and more robust alternative to finalization. * They are lightweight because they are not created by the VM and thus do not * require a JNI upcall to be created, and because their cleanup code is * invoked directly by the reference-handler thread rather than by the * finalizer thread. They are more robust because they use phantom references, * the weakest type of reference object, thereby avoiding the nasty ordering * problems inherent to finalization. * * <p> A cleaner tracks a referent object and encapsulates a thunk of arbitrary * cleanup code. Some time after the GC detects that a cleaner's referent has * become phantom-reachable, the reference-handler thread will run the cleaner. * Cleaners may also be invoked directly; they are thread safe and ensure that * they run their thunks at most once. * * <p> Cleaners are not a replacement for finalization. They should be used * only when the cleanup code is extremely simple and straightforward. * Nontrivial cleaners are inadvisable since they risk blocking the * reference-handler thread and delaying further cleanup and finalization. * * * @author Mark Reinhold * @version %I%, %E% */
重点是这两句:"A cleaner tracks a referent object and encapsulates a thunk of arbitrary cleanup code. Some time after the GC detects that a cleaner's referent has become phantom-reachable, the reference-handler thread will run the cleaner."
Oracle/Sun JDK 6中的HotSpot VM只会在old gen GC(full GC/major GC或者concurrent GC都算)的时候才会对old gen中的对象做reference processing,而在young GC/minor GC时只会对young gen里的对象做reference processing。
(死在young gen中的DirectByteBuffer对象会在young GC时被处理的例子,请参考这里:https://gist.github.com/1614952)
也就是说,做full GC的话会对old gen做reference processing,进而能触发Cleaner对已死的DirectByteBuffer对象做清理工作。而如果很长一段时间里没做过GC或者只做了 young GC的话则不会在old gen触发Cleaner的工作,那么就可能让本来已经死了的、但已经晋升到old gen的DirectByteBuffer关联的native memory得不到及时释放。
3、为DirectByteBuffer分配空间过程中会显式调用System.gc(),以期通过full GC来强迫已经无用的DirectByteBuffer对象释放掉它们关联的native memory:
// These methods should be called whenever direct memory is allocated or // freed. They allow the user to control the amount of direct memory // which a process may access. All sizes are specified in bytes. static void reserveMemory(long size) { synchronized (Bits.class) { if (!memoryLimitSet && VM.isBooted()) { maxMemory = VM.maxDirectMemory(); memoryLimitSet = true; } if (size <= maxMemory - reservedMemory) { reservedMemory += size; return; } } System.gc(); try { Thread.sleep(100); } catch (InterruptedException x) { // Restore interrupt status Thread.currentThread().interrupt(); } synchronized (Bits.class) { if (reservedMemory + size > maxMemory) throw new OutOfMemoryError("Direct buffer memory"); reservedMemory += size; } }
这几个实现特征使得Oracle/Sun JDK 6依赖于System.gc()触发GC来保证DirectByteMemory的清理工作能及时完成。如果打开了 -XX:+DisableExplicitGC,清理工作就可能得不到及时完成,于是就有机会见到direct memory的OOM,也就是上面的例子演示的情况。我们这边在实际生产环境中确实遇到过这样的问题。
教训是:如果你在使用Oracle/Sun JDK 6,应用里有任何地方用了direct memory,那么使用-XX:+DisableExplicitGC要小心。如果用了该参数而且遇到direct memory的OOM,可以尝试去掉该参数看是否能避开这种OOM。如果担心System.gc()调用造成full GC频繁,可以尝试下面提到 -XX:+ExplicitGCInvokesConcurrent 参数
======================================================================
2、-XX:+DisableExplicitGC 与 Remote Method Invocation (RMI) 与 -Dsun.rmi.dgc.{server|client}.gcInterval=
看了上一个例子有没有觉得-XX:+DisableExplicitGC参数用起来很危险?那干脆完全不要用这个参数吧。又有什么坑呢?
前段时间有个应用的开发来抱怨,说某次升级JDK之前那应用的GC状况都很好,很长时间都不会发生full GC,但升级后发现每一小时左右就会发生一次。经过对比发现,升级的同时也吧启动参数改了,把原本有的-XX:+DisableExplicitGC给去掉了。
观察到的日志有明显特征。一位同事表示:
who call system.gc :
sun.misc.GC$Daemon.run(GC.java:92)
预发机没什么流量,也会每一小时一次Full GC
频率正好是一小时一次
// A user-settable upper limit on the maximum amount of allocatable direct // buffer memory. This value may be changed during VM initialization if // "java" is launched with "-XX:MaxDirectMemorySize=<size>". // // The initial value of this field is arbitrary; during JRE initialization // it will be reset to the value specified on the command line, if any, // otherwise to Runtime.getRuntime().maxMemory(). // private static long directMemory = 64 * 1024 * 1024; // If this method is invoked during VM initialization, it initializes the // maximum amount of allocatable direct buffer memory (in bytes) from the // system property sun.nio.MaxDirectMemorySize. The system property will // be removed when it is accessed. // // If this method is invoked after the VM is booted, it returns the // maximum amount of allocatable direct buffer memory. // public static long maxDirectMemory() { if (booted) return directMemory; Properties p = System.getProperties(); String s = (String)p.remove("sun.nio.MaxDirectMemorySize"); System.setProperties(p); if (s != null) { if (s.equals("-1")) { // -XX:MaxDirectMemorySize not given, take default directMemory = Runtime.getRuntime().maxMemory(); } else { long l = Long.parseLong(s); if (l > -1) directMemory = l; } } return directMemory; }
(代码里原本的注释有个写错的地方,上面有修正)
当MaxDirectMemorySize参数没被显式设置时它的值就是-1,在Java类库初始化时maxDirectMemory()被java.lang.System的静态构造器调用,走的路径就是这条:
if (s.equals("-1")) { // -XX:MaxDirectMemorySize not given, take default directMemory = Runtime.getRuntime().maxMemory(); }
而Runtime.maxMemory()在HotSpot VM里的实现是:
public class Foo { private int value; public int getValue() { return this.value; } }
关键点是:
1、必须是成员方法;静态方法不行
2、返回值类型必须是引用类型或者int,其它都不算
3、方法体的代码必须满足aload_0; getfield #index; areturn或ireturn这样的模式。
留意:方法名是什么都没关系,是不是get、is、has开头都不重要。
那么这俩有啥问题?
取自JDK 6 update 27:
- product(bool, UseFastEmptyMethods, true, \
- "Use fast method entry code for empty methods") \
- \
- product(bool, UseFastAccessorMethods, true, \
- "Use fast method entry code for accessor methods") \
看到这俩参数的默认值都是true了么?也就是说,在Oracle/Sun JDK 6上设置这参数其实也是没意义的,跟默认一样,一直到最新的JDK 6 update 29都是如此。
不过在Oracle/Sun JDK 7里,情况有变化。
Bug ID: 6385687 UseFastEmptyMethods/UseFastAccessorMethods considered harmful
在上述bug对应的代码变更后,这俩参数的默认值改为了false。
本来想多写点这块的…算,还是长话短说。
Oracle JDK 7里的HotSpot VM已经开始有比较好的多层编译(tiered compilation)支持,可以预见在不久的将来该模式将成为HotSpot VM默认的执行模式。当前该模式尚未默认开启;可以通过 -XX:+TieredCompilation 来开启。
有趣的是,在使用多层编译模式时,如果UseFastAccessorMethods/UseFastEmptyMethods是开着的,有些多态方法调用点的性能反而会显著下降。所以,为了适应多层编译模式,JDK 7里这两个参数的默认值就被改为false了。
在邮件列表上有过相关讨论:review for 6385687: UseFastEmptyMethods/UseFastAccessorMethods considered harmful
======================================================================