I've run into an annoying problem on a production machine. My Celery worker runs in a Docker container configured as follows:
worker:
  build: .
  env_file:
    - .env
  command: celery -A my_app worker --loglevel=info --concurrency 1 -E
  deploy:
    restart_policy:
      condition: on-failure
      delay: 5s
      max_attempts: 3
      window: 120s
  depends_on:
    - api
My problem is that this worker unfortunately runs out of memory quite often (despite setting worker_max_tasks_per_child = 1000) and throws this error:
[2024-03-19 17:23:24,533: CRITICAL/MainProcess] Unrecoverable error: MemoryError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/celery/worker/worker.py", line 203, in start
  File "/usr/local/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
  File "/usr/local/lib/python3.10/site-packages/celery/bootsteps.py", line 365, in start
  File "/usr/local/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 332, in start
  File "/usr/local/lib/python3.10/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/usr/local/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 628, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python3.10/site-packages/celery/worker/loops.py", line 97, in asynloop
    next(loop)
  File "/usr/local/lib/python3.10/site-packages/kombu/asynchronous/hub.py", line 362, in create_loop
  File "/usr/local/lib/python3.10/site-packages/kombu/transport/redis.py", line 1326, in on_readable
  File "/usr/local/lib/python3.10/site-packages/kombu/transport/redis.py", line 562, in on_readable
  File "/usr/local/lib/python3.10/site-packages/kombu/transport/redis.py", line 955, in _brpop_read
  File "/usr/local/lib/python3.10/site-packages/redis/client.py", line 1275, in parse_response
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 865, in read_response
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 346, in read_response
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 356, in _read_response
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 259, in readline
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 209, in _read_from_socket
MemoryError
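The really big problem for me now is that the Docker container never restarts after the crash, and I have no idea why! My Django container also hits memory errors and seems to restart just fine (with exactly the same restart_policy), but not this one...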
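I tried setting the container's mem_limit to an arbitrary value (1/4 of the host's RAM), because I had read that a MemoryError can stop a container without any error being reported... (though, as I said, I already get MemoryErrors on my Django container and it restarts fine), but to no avail.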
This was actually a bug in Celery; upgrading it to version 5.3.6 solved my problem.
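For anyone who wants to check whether their own image already has that release, here is a minimal sketch. The helper names parse_version and has_fix are made up for illustration and are not part of Celery. The only assumption is the one stated above, that 5.3.6 is the first version where the problem went away. It compares plain dotted version numbers only and stops at the first non-numeric part, so treat it as a rough check.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption taken from the post above: 5.3.6 is the release that resolved the issue.
FIXED_IN = (5, 3, 6)


def parse_version(text):
    """Turn '5.3.6' into (5, 3, 6); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(installed, fixed_in=FIXED_IN):
    """Return True if the installed version is at least the fixed version."""
    return parse_version(installed) >= fixed_in


if __name__ == "__main__":
    try:
        installed = version("celery")
    except PackageNotFoundError:
        print("celery is not installed in this environment")
    else:
        print(f"celery {installed}: includes fix = {has_fix(installed)}")
```

Running this inside the worker container (for example through docker compose exec) reports the Celery version that the image actually ships, which may differ from what is installed on the host.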