- Introduction
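A Java thread dump is a very useful diagnostic tool. From the dumped information you can locate the thread you are interested in, along with its call stack. Combined with the Linux top command, it lets you find the code running in the threads that consume the most CPU, so that you can optimize exactly where it matters.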
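One detail makes the top and thread dump combination work: top -H -p PID lists thread IDs in decimal, while the nid field of a thread dump is the same native thread ID written in hexadecimal. A minimal sketch of the conversion is shown here. The function name tid_to_nid is hypothetical, and the example ID was chosen only because it corresponds to nid=0x7f19, a value that appears in the sample dump in section 2.2.

```shell
#!/usr/bin/env bash
# Convert a decimal thread ID (as listed by `top -H -p <pid>`) into the
# hexadecimal form used by the nid= field of a Java thread dump.
# tid_to_nid is a hypothetical helper name, not part of any tool.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

# Decimal 32537 is hexadecimal 7f19.
tid_to_nid 32537   # prints 0x7f19
```

With the hexadecimal value in hand, searching the dump for nid=0x7f19 leads to the stack of the thread that top reported.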
- Scenario and practice
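2.1. Our back-end system had always run as a black box. Apart from pausing some of the tasks, we had no way to tell which tasks were using too much CPU. We therefore assumed the business code was at fault, but after several rounds of optimization (removing unnecessary logic, merging write operations, and so on) the system load was still high. Traffic was low and the background jobs processed only a few million tasks per day, yet the load average still exceeded 15 on a machine with only 4 CPU cores. With load alerts arriving every day and nothing to go on, we were forced to analyze the threads.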
2.2. The system runs on Java Tomcat, and triggering a Tomcat thread dump is simple. First find the ID of the Tomcat process, which we will call PID:
【Linux command】: ps -ef | grep tomcat
Once you have the PID, send the process a QUIT signal to make it dump its threads. Do not rush to run the commands below; wait until you have read 2.3.
【Linux command】: kill -3 $PID or kill -QUIT $PID
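Tomcat writes the thread dump to its console output, which ends up in catalina.out:
【Linux command】: cd $tomcathome/logs/
Open catalina.out and extract the thread-related content at the end of the file.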
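Extracting the tail of catalina.out by hand gets tedious when the log is large or contains several dumps. The following is a minimal sketch, assuming each dump starts with a line beginning with "Full thread dump" and that nothing important is logged after the most recent dump. The function name last_thread_dump is hypothetical.

```shell
#!/usr/bin/env bash
# Print everything from the LAST "Full thread dump" line to the end of
# the given log file. last_thread_dump is a hypothetical helper name.
# Assumption: the most recent dump is the last thing of interest in the
# file, so any log lines written after it are printed as well.
last_thread_dump() {
  awk '
    /^Full thread dump/ { buf = ""; capture = 1 }  # newer dump: restart buffer
    capture             { buf = buf $0 "\n" }      # collect lines of the dump
    END                 { printf "%s", buf }       # emit only the last one
  ' "$1"
}

# Usage (path taken from the steps above):
#   last_thread_dump "$tomcathome/logs/catalina.out" > threaddump.txt
```

Note that the timestamp line printed just before "Full thread dump" is not captured by this sketch.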
The dump looks roughly like this:
2012-04-13 16:30:41
Full thread dump OpenJDK 64-Bit Server VM (1.6.0-b09 mixed mode):

"TP-Processor12" daemon prio=10 tid=0x00000000045acc00 nid=0x7f19 in Object.wait() [0x00000000483d0000..0x00000000483d0a90]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00002aaab5bfce70> (a org.apache.tomcat.util.threads.ThreadPool$ControlRunnable)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:662)
    - locked <0x00002aaab5bfce70> (a org.apache.tomcat.util.threads.ThreadPool$ControlRunnable)
    at java.lang.Thread.run(Thread.java:636)

"TP-Processor11" daemon prio=10 tid=0x00000000048e3c00 nid=0x7f18 in Object.wait() [0x00000000482cf000..0x00000000482cfd10]
   java.lang.Thread.State: WAITING (on object monitor)
....

"VM Thread" prio=10 tid=0x00000000042ff400 nid=0x77de runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000429c400 nid=0x77d9 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000000429d800 nid=0x77da runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x000000000429ec00 nid=0x77db runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000042a0000 nid=0x77dc runnable

"VM Periodic Task Thread" prio=10 tid=0x0000000004348400 nid=0x77e5 waiting on condition

JNI global references: 815

Heap
 PSYoungGen      total 320192K, used 178216K [0x00002aaadce00000, 0x00002aaaf1800000, 0x00002aaaf1800000)
  eden space 303744K, 55% used [0x00002aaadce00000,0x00002aaae718e048,0x00002aaaef6a0000)
  from space 16448K, 65% used [0x00002aaaf0690000,0x00002aaaf110c1b0,0x00002aaaf16a0000)
  to   space 16320K, 0% used [0x00002aaaef6a0000,0x00002aaaef6a0000,0x00002aaaf0690000)
 PSOldGen        total 460992K, used 425946K [0x00002aaab3a00000, 0x00002aaacfc30000, 0x00002aaadce00000)
  object space 460992K, 92% used [0x00002aaab3a00000,0x00002aaacd9f6a30,0x00002aaacfc30000)
 PSPermGen       total 56192K, used 55353K [0x00002aaaae600000, 0x00002aaab1ce0000, 0x00002aaab3a00000)
  object space 56192K, 98% used [0x00002aaaae600000,0x00002aaab1c0e520,0x00002aaab1ce0000)
The final section (Heap) shows the JVM's heap memory usage.
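A real dump usually contains hundreds of threads, so a quick summary helps before reading individual stacks. The following is a minimal sketch that counts threads per state, assuming the dump has been saved to a file and that each Java thread carries a java.lang.Thread.State line as in the sample above. The function name count_thread_states is hypothetical.

```shell
#!/usr/bin/env bash
# Count how many threads are in each java.lang.Thread.State.
# count_thread_states is a hypothetical helper name.
# Output: one "<count> <STATE>" line per state, most frequent first.
# JVM-internal threads (VM Thread, GC task threads, ...) print no State
# line in the dump, so they are not counted.
count_thread_states() {
  grep -F 'java.lang.Thread.State:' "$1" \
    | awk '{ print $2 }' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# Usage (threaddump.txt is an assumed file name):
#   count_thread_states threaddump.txt
```

A large number of WAITING worker threads like TP-Processor12 is normal for an idle pool; the threads worth a closer look are usually the RUNNABLE ones.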

