Celery eats up memory.
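We use Celery in a Django REST project, with Redis as the broker. Celery sends callbacks and retries them when sending fails (the retry policy was to resend the callback with a delay between attempts that grows exponentially; it has since been removed).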
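For reference, here is a minimal sketch of what "a delay that grows exponentially" could look like. The function name, base delay, factor, and cap are all assumptions for illustration and are not taken from the removed retry code. In a Celery task, a value like this would typically be passed as the countdown argument of self.retry.

```python
from typing import List


def backoff_delays(max_retries: int, base: float = 2.0,
                   factor: float = 2.0, cap: float = 600.0) -> List[float]:
    """Return the wait in seconds before each retry attempt.

    All defaults are hypothetical: 2 s before the first retry, doubling
    each time, and never waiting longer than `cap` seconds.
    """
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


if __name__ == "__main__":
    # Delays for five retries with the assumed defaults.
    print(backoff_delays(5))
```

With a capped schedule like this, the longest wait stays bounded, which matters because delayed tasks are held by the worker until they are due.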
Roughly every 1m 40s, RAM usage grows by 48 MB, and at the same time the logs are flooded for a few seconds with lines like these:
celery_worker-1 | [2024-06-06 19:17:32,442: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[6135c471-7f7e-471c-9f05-ba126418c002] received
celery_worker-1 | [2024-06-06 19:17:32,444: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[6135c471-7f7e-471c-9f05-ba126418c002] received
celery_worker-1 | [2024-06-06 19:17:32,445: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[6135c471-7f7e-471c-9f05-ba126418c002] received
... more 14 times
celery_worker-1 | [2024-06-06 19:17:32,468: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[f4aaa14f-8bda-442a-8856-af40f1d68e6d] received
celery_worker-1 | [2024-06-06 19:17:32,469: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[f4aaa14f-8bda-442a-8856-af40f1d68e6d] received
celery_worker-1 | [2024-06-06 19:17:32,471: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[f4aaa14f-8bda-442a-8856-af40f1d68e6d] received
celery_worker-1 | [2024-06-06 19:17:32,472: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[f4aaa14f-8bda-442a-8856-af40f1d68e6d] received
... more 55 times and many more with different IDs
After a while, we even got this in the logs:
celery_worker-1 | [2024-06-07 17:32:56,200: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[bb4b48c5-92d1-4226-aadf-bfdc387d4baf] received
celery_worker-1 | [2024-06-07 17:32:56,200: WARNING/MainProcess] QoS: Disabled: prefetch_count exceeds 65535
New tasks are executed immediately, with logs like the following (and no prefetch_count warning):
celery_worker-1 | [2024-06-07 16:31:28,595: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[7cdda9a2-ad10-434e-82c2-5232e64dc3b1] received
celery_worker-1 | [2024-06-07 16:31:28,736: INFO/ForkPoolWorker-72] core.requests.celery_tasks.send_callback_task[7cdda9a2-ad10-434e-82c2-5232e64dc3b1]: {'msg': 'Callback sent!'}
So my guess is that the repeated tasks are never executed, because the logs contain no "sent" or "failed" messages for those task IDs.
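One way to check that guess instead of eyeballing the logs is a small script like the one below. It is only a sketch: it assumes the log line format shown in this post (a "received" line from MainProcess and a result line from ForkPoolWorker-N with the task ID in square brackets), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Task ID in "Task <name>[<uuid>] received" lines (format assumed from the logs above).
RECEIVED_RE = re.compile(r"Task \S+\[([0-9a-f-]{36})\] received")
# Task ID in lines written by a prefork pool process, e.g. the "Callback sent!" line.
RESULT_RE = re.compile(r"ForkPoolWorker-\d+\] \S+\[([0-9a-f-]{36})\]:")


def summarize(lines):
    """Count 'received' lines per task ID and collect IDs that logged a result."""
    received = Counter()
    finished = set()
    for line in lines:
        match = RECEIVED_RE.search(line)
        if match:
            received[match.group(1)] += 1
            continue
        match = RESULT_RE.search(line)
        if match:
            finished.add(match.group(1))
    return received, finished


def never_finished(lines):
    """Map each received-but-never-logged task ID to its 'received' count."""
    received, finished = summarize(lines)
    return {task_id: count for task_id, count in received.items()
            if task_id not in finished}
```

Feeding it the worker log (for example, the lines of a saved log file) returns the task IDs that were received, possibly many times, but never produced a "sent" or "failed" line.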
Logs after restarting the server (the 22 here is from the test server; the number may be higher elsewhere):
[2024-06-07 14:36:56,163: WARNING/MainProcess] Restoring 22 unacknowledged message(s)
[2024-06-07 14:41:37,392: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[6b56c3b0-0828-46d6-86aa-34d0747ac30b] received
[2024-06-07 14:41:37,392: DEBUG/MainProcess] basic.qos: prefetch_count->9
[2024-06-07 14:41:37,394: INFO/MainProcess] Task core.requests.celery_tasks.send_callback_task[6b56c3b0-0828-46d6-86aa-34d0747ac30b] received
[2024-06-07 14:41:37,394: DEBUG/MainProcess] basic.qos: prefetch_count->10
...
To fix this, we have already tried:
The send_callback_task code:
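import logging

import requests
from celery import shared_task
from requests.exceptions import RequestException

logger = logging.getLogger(__name__)


@shared_task(bind=True)
def send_callback_task(self, url: str, data):
    # POST the callback payload; give up after 5 seconds.
    response = requests.post(url, json=data, timeout=5)
    log_msg = {
        "msg": "Callback sent!",
    }
    logger.info(msg=log_msg)
    # Treat any status other than 200/201/202 as a failed callback.
    if response.status_code not in (200, 201, 202):
        log_msg = {
            "msg": "Callback failed!"
        }
        logger.info(msg=log_msg)
        raise RequestException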
New tasks are triggered with apply_async.
The Celery worker is started like this:
celery -A project worker -l info
The machine has 16 cores.
CELERY_WORKER_MAX_TASKS_PER_CHILD = 100
I am happy to provide more information if needed.
My guess is that the prefork pool does not cope well with I/O-bound work (here, each Celery task is a POST request).
Switching from prefork to eventlet seems to have solved the problem:
pip install eventlet
celery -A project worker --loglevel=info -P eventlet