- Preface
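A Java thread dump is a very useful tool for diagnosing applications. From the information in a thread dump you can locate the thread you are interested in and inspect its call stack. Combined with the Linux top command, it lets you find the code in the thread that consumes the most CPU on your system, so that you can optimize in a targeted way.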
- Scenario and practice
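2.1. Our back-end system had always run as a black box. Apart from being able to pause some of the tasks, we had no way of knowing which tasks were using too much CPU, so we kept assuming the business code was at fault. After various optimizations (removing unnecessary logic, merging write operations, and so on), the system load was still high. There was little traffic, and the background jobs only handled a few million tasks per day, yet the load still exceeded 15 on a machine with only 4 CPU cores. We received load alerts every day with no idea where to start, so we were forced to analyze the threads.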
2.2. The system runs on Java Tomcat. Triggering a Tomcat thread dump is simple. First find the process id of Tomcat, which we will call PID:
【linux command】: ps -ef | grep tomcat
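One wrinkle with the command above: the grep process itself usually shows up in the output, because its own command line contains "tomcat". Below is a minimal sketch of a filter that avoids this. It assumes the usual Linux `ps -ef` layout (PID in column 2, command starting at column 8), and `pids_for` is a hypothetical helper name, not a standard tool.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints the PID
# (column 2) of each process whose line contains the given text.
# Assumes the Linux `ps -ef` layout, where the command starts at column 8.
pids_for() {
  awk -v pat="$1" '
    NR == 1 { next }                    # skip the header row
    $8 ~ /(^|\/)(grep|awk)$/ { next }   # skip the search tools themselves
    index($0, pat) { print $2 }         # plain substring match, not a regex
  '
}

# Usage:
#   PID=$(ps -ef | pids_for tomcat)
```

If more than one PID is printed, several processes mention "tomcat" and you will need to pick the right JVM by hand.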
Once you have the PID, send the process a QUIT signal to make it dump its threads. Don't rush to run the command below yet; wait until you have read 2.3.
【linux command】: kill -3 $PID (or, equivalently: kill -QUIT $PID)
Tomcat prints the thread dump to its console output, which is captured in the log directory:
【linux command】: cd $tomcathome/logs/
Open the catalina.out file and pull out the thread-related content at the end.
The thread dump looks roughly like this:
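Since catalina.out can be large and may contain several dumps, pulling out the last one by hand gets tedious. A minimal sketch of one way to do it, assuming each dump begins with a line starting with "Full thread dump" (as HotSpot and OpenJDK print it); `last_thread_dump` is a hypothetical helper name and the output path in the usage line is arbitrary.

```shell
# Hypothetical helper: prints the most recent thread dump found in a log file.
# Everything from the last "Full thread dump" line to the end of the file is
# kept, so anything logged after the dump is printed too.
last_thread_dump() {
  awk '
    /^Full thread dump/ { buf = ""; keep = 1 }   # newer dump: drop the older one
    keep { buf = buf $0 "\n" }                   # collect lines of the current dump
    END { printf "%s", buf }
  ' "$1"
}

# Usage:
#   last_thread_dump "$tomcathome/logs/catalina.out" > /tmp/threads.txt
```

The timestamp line that precedes each dump is not included, because the match starts at the "Full thread dump" line itself.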
2012-04-13 16:30:41
Full thread dump OpenJDK 64-Bit Server VM (1.6.0-b09 mixed mode):

"TP-Processor12" daemon prio=10 tid=0x00000000045acc00 nid=0x7f19 in Object.wait() [0x00000000483d0000..0x00000000483d0a90]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aaab5bfce70> (a org.apache.tomcat.util.threads.ThreadPool$ControlRunnable)
        at java.lang.Object.wait(Object.java:502)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:662)
        - locked <0x00002aaab5bfce70> (a org.apache.tomcat.util.threads.ThreadPool$ControlRunnable)
        at java.lang.Thread.run(Thread.java:636)

"TP-Processor11" daemon prio=10 tid=0x00000000048e3c00 nid=0x7f18 in Object.wait() [0x00000000482cf000..0x00000000482cfd10]
   java.lang.Thread.State: WAITING (on object monitor)
....

"VM Thread" prio=10 tid=0x00000000042ff400 nid=0x77de runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000429c400 nid=0x77d9 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000000429d800 nid=0x77da runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x000000000429ec00 nid=0x77db runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000042a0000 nid=0x77dc runnable

"VM Periodic Task Thread" prio=10 tid=0x0000000004348400 nid=0x77e5 waiting on condition

JNI global references: 815

Heap
 PSYoungGen      total 320192K, used 178216K [0x00002aaadce00000, 0x00002aaaf1800000, 0x00002aaaf1800000)
  eden space 303744K, 55% used [0x00002aaadce00000,0x00002aaae718e048,0x00002aaaef6a0000)
  from space 16448K, 65% used [0x00002aaaf0690000,0x00002aaaf110c1b0,0x00002aaaf16a0000)
  to   space 16320K, 0% used [0x00002aaaef6a0000,0x00002aaaef6a0000,0x00002aaaf0690000)
 PSOldGen        total 460992K, used 425946K [0x00002aaab3a00000, 0x00002aaacfc30000, 0x00002aaadce00000)
  object space 460992K, 92% used [0x00002aaab3a00000,0x00002aaacd9f6a30,0x00002aaacfc30000)
 PSPermGen       total 56192K, used 55353K [0x00002aaaae600000, 0x00002aaab1ce0000, 0x00002aaab3a00000)
  object space 56192K, 98% used [0x00002aaaae600000,0x00002aaab1c0e520,0x00002aaab1ce0000)
The last section, Heap, shows the JVM's memory usage.
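Each thread header in the dump carries an nid field, which is the native (OS-level) thread id in hexadecimal, whereas `top -H` reports thread ids in decimal. To match a busy thread seen in top against the dump, the id has to be converted first. A minimal sketch of that lookup, assuming the dump has been saved to a file; `find_thread` is a hypothetical helper name and the 10 lines of context are an arbitrary choice.

```shell
# Hypothetical helper: given a saved thread dump and a decimal thread id
# (as shown by `top -H -p $PID`), print the matching thread and its stack.
find_thread() {
  dump_file=$1
  nid=$(printf '0x%x' "$2")            # decimal id -> hex, e.g. 32537 -> 0x7f19
  # The trailing space keeps 0x7f1 from also matching 0x7f19.
  grep -A 10 "nid=$nid " "$dump_file"  # 10 lines of context is an arbitrary choice
}

# Usage:
#   find_thread /tmp/threads.txt 32537
```

With the dump above, thread id 32537 converts to nid=0x7f19, which is the "TP-Processor12" thread.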