NGINX stuck in D state, blocking

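CentOS 7, 4 CPUs, 8 GB RAM. Every so often nginx shows up in top in the D state, which I looked up and is uninterruptible sleep. When it happens, network traffic in iftop drops to the floor, I/O spikes, and nginx neither accepts nor processes new requests, so the service is completely blocked. It recovers after about 1 second. CPU and memory are normal during the stall, and IOPS also looks low, only a few hundred (according to Alibaba Cloud's monitoring).

nginx has caching enabled. After I commented out everything related to proxy_cache to turn the cache off, the D state no longer appears. What exactly is the problem here? The server is an Alibaba Cloud ECS instance, CPU and memory are definitely sufficient, and I/O is not high either. Below is the state when the problem occurs: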

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.61    0.00   12.12   25.25    0.00   52.02

Device:   rrqm/s  wrqm/s     r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda        51.00  217.00  163.00  1123.00   880.00   6212.00    11.03    11.63    9.05    2.46   10.00   0.41  53.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.56    0.00   13.57   39.20    0.00   34.67

Device:   rrqm/s  wrqm/s     r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda         0.00  257.00  168.00   930.00   716.00   5100.00    10.59    38.18   34.77    5.36   40.08   0.51  56.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          16.58    0.00   16.58   63.32    0.00    3.52

Device:   rrqm/s  wrqm/s     r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda        29.00  291.00  184.00  1635.00  1040.00  37052.00    41.88    56.54   31.08    7.70   33.71   0.47  86.00
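
For comparison with the "few hundred" IOPS shown in the cloud console, the request rate the guest itself sees can be worked out from the three iostat samples above by adding r/s and w/s. A small sketch of that arithmetic; the numbers are copied from the samples, and the names `samples` and `summarize` are made up for this illustration:

```python
# (r/s, w/s, wkB/s, w_await in ms) copied from the three iostat samples above
samples = [
    (163.0, 1123.0, 6212.0, 10.00),
    (168.0, 930.0, 5100.0, 40.08),
    (184.0, 1635.0, 37052.0, 33.71),
]


def summarize(sample):
    """Return (total requests per second, average write size in kB)."""
    r_s, w_s, wkb_s, _w_await = sample
    return r_s + w_s, wkb_s / w_s


for s in samples:
    iops, avg_write_kb = summarize(s)
    print(f"iops={iops:.0f} avg_write={avg_write_kb:.1f} kB w_await={s[3]} ms")
```

The totals come to 1286, 1098 and 1819 requests per second, noticeably more than a few hundred. One possible reason for the mismatch is that the console averages over a longer interval than a 1-second stall, but that is a guess and not something confirmed here.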

iotop:

Total DISK READ :     625.15 K/s | Total DISK WRITE :     680.54 K/s
Actual DISK READ:     625.15 K/s | Actual DISK WRITE:      31.36 M/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN      IO  COMMAND
 3174  be/4  nginx   320.49 K/s   328.40 K/s  0.00 %  42.88 %  nginx: worker process
 4012  be/4  root      0.00 B/s     0.00 B/s  0.00 %  39.93 %  [kworker/u4:3+flush-253:0]
 3173  be/4  nginx   304.66 K/s   352.14 K/s  0.00 %  36.35 %  nginx: worker process

Total DISK READ :    1148.47 K/s | Total DISK WRITE :    1089.07 K/s
Actual DISK READ:    1148.47 K/s | Actual DISK WRITE:       3.77 M/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN      IO  COMMAND
 4012  be/4  root      0.00 B/s     0.00 B/s  0.00 %  41.17 %  [kworker/u4:3-events_unbound]
 3173  be/4  nginx   570.28 K/s   574.24 K/s  0.00 %  37.87 %  nginx: worker process
 3174  be/4  nginx   578.20 K/s   514.83 K/s  0.00 %  37.18 %  nginx: worker process
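
One way to see what the workers are actually blocked on during a stall is to sample /proc while it is happening and print the kernel wait channel of every process in state D. A minimal sketch, assuming the standard Linux /proc layout (`/proc/<pid>/stat` and `/proc/<pid>/wchan`); the function names `parse_stat` and `d_state_processes` are hypothetical and exist only for this illustration:

```python
import os


def parse_stat(stat_text):
    """Extract (comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return comm, state


def d_state_processes(proc_root="/proc"):
    """Yield (pid, comm, wchan) for each process currently in state D."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
            if state != "D":
                continue
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            # The process exited between listing and reading, or the
            # file is not readable by this user; skip it.
            continue
        yield int(entry), comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in d_state_processes():
        print(pid, comm, wchan)
```

Since the stall only lasts about a second, this would need to run in a tight loop (or be triggered when iowait jumps) to catch anything. What the wait channel shows depends on the kernel and filesystem, so the output is only a starting point.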

Here is my NGINX config:

worker_processes 2;
worker_shutdown_timeout 180s;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

worker_rlimit_nofile 65535;

events {
    worker_connections 65535;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" $upstream_cache_status $request_time $upstream_response_time';

    access_log /var/log/nginx/access.log main;
    sendfile on;
    tcp_nopush on;
    gzip on;
    gzip_types application/json text/plain application/x-javascript application/javascript text/javascript text/css application/xml text/xml;
    gzip_min_length 1k;
    proxy_buffering on;
    proxy_buffer_size 32k;
    proxy_buffers 8 512k;
    server_tokens off;
    underscores_in_headers on;
    proxy_temp_path /tmp/cache_tmp;
    proxy_cache_path /tmp/cache levels=1:2 keys_zone=cache1:100m inactive=7d max_size=20g;

    upstream backend {
        server xxx:xx max_fails=3 weight=3 fail_timeout=60s;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;

            proxy_set_header Connection "";
            proxy_cache cache1;
            proxy_cache_convert_head off;
            proxy_cache_key $uri;
            proxy_cache_revalidate on;
            proxy_cache_methods GET HEAD POST;
        }
    }
}

9 replies • 2020-08-03 11:23:54 +08:00