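Recently I ran into a problem where Tomcat piled up a large number of CLOSE_WAIT connections, and I still have not found the root cause.
Here is the sequence of steps when a TCP connection is closed: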
   CLIENT                          SERVER
1. ESTABLISHED                     ESTABLISHED
2. (Close)
   FIN-WAIT-1  --> <FIN,ACK> -->   CLOSE-WAIT
3. FIN-WAIT-2  <-- <ACK>     <--   CLOSE-WAIT
4.                                 (Close)
   TIME-WAIT   <-- <FIN,ACK> <--   LAST-ACK
5. TIME-WAIT   --> <ACK>     -->   CLOSED
   (2 MSL)
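CLOSE_WAIT on the server side means the server "passively" received the close notification (the client's FIN). Once the server application actually closes the socket, the server sends its own FIN to the client and moves to LAST_ACK.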
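To see how many sockets are sitting in each state, one option is to tally the first column of `ss -tan`. This is only a rough sketch: it assumes the usual `ss -tan` layout (one header row, state in the first column), and `count_tcp_states` is a helper name made up for this post.

```shell
# Count sockets per TCP state from `ss -tan` output read on stdin.
# Assumes one header row and the state name in column 1.
count_tcp_states() {
  awk 'NR > 1 { count[$1]++ } END { for (s in count) print count[s], s }' |
    sort -k2
}

# Typical use on a live host (not run here):
#   ss -tan | count_tcp_states
```

A CLOSE-WAIT count that keeps growing between runs is the symptom described here; a count that rises and falls is more likely just load.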
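CLOSE_WAIT normally lasts only a very short time. When a large number of CLOSE_WAIT sockets show up, the likely causes are:
1. CPU is under pressure and the requests are queued up waiting to close their connections: this is normal and usually clears up after a while. Consider getting better hardware.
2. The backend code has an infinite loop, so the server cannot respond to the request: this can mostly be ruled out, since code with such an obvious bug usually gets noticed quickly, unless a truly clueless engineer wrote it. Check with tools such as jstack or VisualVM.
3. Multi-threaded backend code has deadlocked, so the server cannot respond to the request: review every place that uses synchronized. Check with tools such as jstack or VisualVM.
4. Backend code that opens TCP connections directly to a remote host does not close them properly: when the server's backend code uses a client class to connect to a remote host and the remote side drops the connection on timeout, the local side is closed passively. If the code does not then close its own socket, the connection stays in CLOSE_WAIT locally.
5. A bug in a particular Tomcat or JVM version prevents the socket from being closed.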
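For causes 2 and 3, a jstack thread dump can be checked from the shell. The sketch below assumes the dump was saved to a file first, for example with `jstack 7112 > dump.txt` (the file name is hypothetical). It relies on two things jstack prints: the line "Found one Java-level deadlock:" when the JVM detects a deadlock, and a "java.lang.Thread.State: ..." line for each thread. The function names are made up for this post.

```shell
# Succeeds (exit 0) if the thread dump on stdin reports a JVM-detected deadlock.
dump_has_deadlock() {
  grep -q 'Java-level deadlock'
}

# Count threads per state (RUNNABLE, BLOCKED, WAITING, ...) in a dump on stdin.
thread_states() {
  sed -n 's/^[[:space:]]*java\.lang\.Thread\.State: \([A-Z_]*\).*/\1/p' |
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Typical use (not run here):
#   dump_has_deadlock < dump.txt && echo "deadlock found"
#   thread_states < dump.txt
```

An infinite loop will not be flagged as a deadlock. It tends to show up as a thread that stays RUNNABLE at the same stack frame across several dumps taken a few seconds apart.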
Command to inspect sockets: ss -tulpn (some fd information only appears when run with sudo)
[Description]:
ss : a utility for investigating sockets
-t : show TCP sockets
-u : show UDP sockets
-l : show only listening sockets
-p : show the process that owns each socket
-n : show addresses and ports numerically instead of resolving names
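Because -l limits the output to listening sockets, `ss -tulpn` is good for finding the Tomcat pid but will not list the CLOSE_WAIT connections themselves. Those appear in `ss -tan` (or `ss -tanp` with sudo to see the owning process). The sketch below groups CLOSE-WAIT rows by peer address, assuming the `ss -tan` column order State, Recv-Q, Send-Q, Local, Peer. `close_wait_by_peer` is a made-up helper name.

```shell
# Group CLOSE-WAIT sockets by peer address:port, busiest peer first.
# Reads `ss -tan` output on stdin; column 1 is the state, column 5 the peer.
close_wait_by_peer() {
  awk '$1 == "CLOSE-WAIT" { count[$5]++ }
       END { for (p in count) print count[p], p }' |
    sort -rn
}

# Typical use on a live host (not run here):
#   ss -tan | close_wait_by_peer
```

If most of the entries point at one backend address and port (a database or another service), that hints at cause 4. Many different client addresses with ephemeral ports suggest the inbound Tomcat connections are the ones not being closed.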
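(once you have the pid)
List the socket fds held by the process (each one shows up as socket:[inode]):
root@hostname:/proc/7112/fd# ls -al | grep socket
Another way, which also shows the local port of each socket:
root@hostname:/proc/7112/fd# lsof -i -a -p 7112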
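The socket:[inode] numbers from the ls output can be matched against /proc/net/tcp, where each row carries a hex local address:port, a state code, and the socket inode. The sketch below prints the local port and inode of every row in state 08, which is CLOSE_WAIT in the Linux kernel's TCP state numbering. It assumes the standard /proc/net/tcp column order, and `close_wait_ports` is a made-up name. A JVM often uses IPv6 sockets, in which case the rows are in /proc/net/tcp6, which has the same columns.

```shell
# Print "local_port inode" for every CLOSE_WAIT row of /proc/net/tcp (or tcp6)
# read on stdin. Column 2 is local addr:port in hex, column 4 the state,
# column 10 the socket inode.
close_wait_ports() {
  awk '$4 == "08" { n = split($2, a, ":"); print a[n], $10 }' |
    while read -r hexport inode; do
      # Convert the hex port to decimal.
      printf '%d %s\n' "0x$hexport" "$inode"
    done
}

# Typical use on a live host (not run here):
#   close_wait_ports < /proc/net/tcp
#   close_wait_ports < /proc/net/tcp6
```

An inode printed here that also appears as socket:[inode] under /proc/7112/fd is a CLOSE_WAIT socket still held open by that process.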
Reference: http://www.dark-hamster.com/operating-system/linux/ubuntu/show-list-of-listening-services-in-linux-using-ss/
—-
Possible causes of the problem:
http://jschu.blog.51cto.com/5594807/1732414
http://m.myexception.cn/open-source/921974.html
http://serverfault.com/questions/160558/how-to-not-get-so-many-apache-close-wait-connections
http://ahuaxuan.iteye.com/blog/657511
Brute-force ways to kill the connections:
https://github.com/rghose/kill-close-wait-connections
https://www.experts-exchange.com/questions/20568402/How-to-clear-CLOSE-WAIT-state-of-a-TCP-connection.html
http://www.shellhacks.com/en/HowTo-Kill-TCP-Connections-in-CLOSEWAIT-State