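I have some web API servers running under uwsgi with gevent, 16 processes on each 8-core machine. After running for a few days, some of the processes sit at 100% CPU.
Here are the main uwsgi options: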
master = true
processes = 16
enable-threads=false
min-worker-lifetime = 0
gevent-early-monkey-patch=true
gevent-monkey-patch=true
module = server:app
gevent = 60
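So I used pyflame to profile one of the affected processes. The result is confusing: 77% of the CPU time is reported as idle, and the rest is in gevent.hub.Hub.run [line: loop.run()].
Then I learned that time spent in system calls is also counted as idle. So I ran strace on the PID and got this output, repeating over and over: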
epoll_wait(20, {{EPOLLIN, {u32=36, u64=30017526431780}}}, 64, 748) = 1
clock_gettime(CLOCK_MONOTONIC, {40805477, 200961138}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 200981886}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201003573}) = 0
epoll_wait(20, {{EPOLLIN, {u32=36, u64=30017526431780}}}, 64, 748) = 1
clock_gettime(CLOCK_MONOTONIC, {40805477, 201073893}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201093352}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201123285}) = 0
epoll_wait(20, {{EPOLLIN, {u32=36, u64=30017526431780}}}, 64, 748) = 1
clock_gettime(CLOCK_MONOTONIC, {40805477, 201174492}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201192491}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201207349}) = 0
epoll_wait(20, {{EPOLLIN, {u32=36, u64=30017526431780}}}, 64, 748) = 1
clock_gettime(CLOCK_MONOTONIC, {40805477, 201259254}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201279465}) = 0
clock_gettime(CLOCK_MONOTONIC, {40805477, 201301453}) = 0
I found that the handle with u32=36 is a UDP connection to the DNS server. I ran netstat and got:
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
udp 472 0 local_ip:56273 remote_ip:domain ESTABLISHED 26223/./uwsgi
Recv-Q stays at 472, which looks strange.
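For anyone who wants to watch that counter without re-running netstat by hand, here is a rough sketch that reads it from /proc instead. It assumes Linux and an IPv4 UDP socket (so the row lives in /proc/net/udp), and that the rx_queue column there is the figure netstat shows as Recv-Q. The helper names and the SAMPLE row are made up for illustration; only the port 56273 and the value 472 are taken from the netstat output above.

```python
import os
import re


def socket_inode(pid, fd):
    """Return the socket inode behind /proc/<pid>/fd/<fd>, or None if it is not a socket."""
    target = os.readlink(f"/proc/{pid}/fd/{fd}")  # looks like "socket:[123456]"
    match = re.fullmatch(r"socket:\[(\d+)\]", target)
    return int(match.group(1)) if match else None


def rx_queue_for_inode(inode, table_text):
    """Find the row with this inode in /proc/net/udp text and return its rx_queue."""
    for line in table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10 or int(fields[9]) != inode:
            continue
        _tx_hex, rx_hex = fields[4].split(":")  # column is "tx_queue:rx_queue" in hex
        return int(rx_hex, 16)
    return None


# Made-up sample in /proc/net/udp layout: local port 0xDBD1 = 56273,
# remote port 0x0035 = 53, rx_queue 0x1D8 = 472. The inode 123456 is hypothetical.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when "
    "retrnsmt   uid  timeout inode ref pointer drops\n"
    "  344: 0100007F:DBD1 0100007F:0035 01 00000000:000001D8 00:00000000 "
    "00000000  1000        0 123456 2 0000000000000000 0\n"
)

if __name__ == "__main__":
    print(rx_queue_for_inode(123456, SAMPLE))
```

Against the live process this would be called as rx_queue_for_inode(socket_inode(26223, 36), open("/proc/net/udp").read()), run as the same user or as root so that /proc/26223/fd is readable.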
It seems the process is calling epoll_wait in a loop, but the data is never taken from the kernel buffer.
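That would match how level-triggered epoll behaves: as long as a datagram sits unread in the socket's receive buffer, every epoll_wait call returns immediately with EPOLLIN for that fd. Below is a small standalone sketch of that behaviour using only the Python standard library on Linux. It is not gevent or uwsgi code, and count_ready_polls is a made-up helper name.

```python
import select
import socket


def count_ready_polls(drain, rounds=5, timeout=0.2):
    """Send one UDP datagram, then count how many of `rounds` polls report EPOLLIN."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    ep = select.epoll()
    try:
        # Default registration is level-triggered (no EPOLLET flag).
        ep.register(receiver.fileno(), select.EPOLLIN)
        sender.sendto(b"x" * 100, receiver.getsockname())
        ready = 0
        for _ in range(rounds):
            if ep.poll(timeout):
                ready += 1
                if drain:
                    # Reading the datagram empties the buffer, so later polls time out.
                    receiver.recvfrom(4096)
        return ready
    finally:
        ep.close()
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print("never reading:", count_ready_polls(drain=False))
    print("reading once: ", count_ready_polls(drain=True))
```

When nothing reads the socket, every poll reports it as ready without waiting, which is the same pattern as the epoll_wait lines in the strace output. When the datagram is read after the first event, the remaining polls wait for the full timeout.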
So, what should I check next? Any comments or suggestions would be greatly appreciated.
Hmm, that does look strange. Which version of uwsgi are you running?