Celery Closes Unexpectedly After Longer Inactivity

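I am using RabbitMQ and Celery to build a simple RPC architecture. I have a RabbitMQ message broker and a remote worker that runs a Celery daemon.

There is also a third server that exposes a thin RESTful API.

When it receives an HTTP request, it sends a task to the remote worker, waits for the response, and returns it.

This works most of the time. However, I have noticed that after a longer period of inactivity (say, 5 minutes with no incoming requests), the Celery worker behaves strangely. The first 3 tasks received after such a period return this error: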

exchange.declare: connection closed unexpectedly

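After those three failed tasks, it works again. If there are no tasks for a longer time, the same thing happens again. Any ideas?

My init script for the Celery worker: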

# description "Celery worker using sync broker"

console log

start on runlevel [2345]
stop on runlevel [!2345]

setuid richard
setgid richard

script
chdir /usr/local/myproject/myproject
exec /usr/local/myproject/venv/bin/celery worker -n celery_worker_deamon.%h -A proj.sync_celery -Q sync_queue -l info --autoscale=10,3 --autoreload --purge
end script

respawn

My Celery configuration:

# Synchronous blocking tasks
BROKER_URL_SYNC = 'amqp://guest:guest@localhost:5672//'
# Asynchronous non blocking tasks
BROKER_URL_ASYNC = 'amqp://guest:guest@localhost:5672//'

#: Only add pickle to this list if your broker is secured
#: from unwanted access (see userguide/security.html)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
CELERY_ENABLE_UTC = True
CELERY_BACKEND = 'amqp'

# http://docs.celeryproject.org/en/latest/userguide/tasks.html#disable-rate-limits-if-they-re-not-used
CELERY_DISABLE_RATE_LIMITS = True

# http://docs.celeryproject.org/en/latest/userguide/routing.html
CELERY_DEFAULT_QUEUE = 'sync_queue'
CELERY_DEFAULT_EXCHANGE = "tasks"
CELERY_DEFAULT_EXCHANGE_TYPE = "topic"
CELERY_DEFAULT_ROUTING_KEY = "sync_task.default"
CELERY_QUEUES = {
    'sync_queue': {
        'binding_key':'sync_task.#',
    },
    'async_queue': {
        'binding_key':'async_task.#',
    },
}

Any ideas?

Edit:

OK, it now seems to happen at random. I noticed this in the RabbitMQ log:

=WARNING REPORT==== 6-Jan-2014::17:31:54 ===
closing AMQP connection <0.295.0> (some_ip_address:36842 -> some_ip_address:5672):
connection_closed_abruptly


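Is your RabbitMQ server or your Celery worker behind a load balancer? If so, the load balancer will close the TCP connection after some period of inactivity. In that case you will have to enable heartbeats from the client (worker) side. If you do, I do not recommend using the pure-Python amqp library for this; replace it with librabbitmq instead.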
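
If an idle connection really is being dropped somewhere between the API server and the broker, the first calls after a quiet period fail until the client reconnects. As a stopgap while heartbeats are being set up, the API server could retry the failed call a few times. The following is only a minimal sketch under that assumption: `call_with_retry` is a hypothetical helper, not part of Celery, and the exception type to catch depends on what your AMQP client actually raises (`OSError` is used here as a placeholder).

```python
import time


def call_with_retry(func, retries=3, delay=0.5, exceptions=(OSError,)):
    """Call func(); if it raises one of `exceptions`, wait and try again.

    Hypothetical helper: after `retries` extra attempts the last
    exception is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return func()
        except exceptions:
            if attempt >= retries:
                raise  # retries exhausted, give up
            attempt += 1
            time.sleep(delay)  # short pause before the next attempt
```

Here `func` would be a zero-argument callable that sends the task and waits for its result. Retrying only hides the symptom; the dropped connection itself still needs to be addressed.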


connection_closed_abruptly is caused when a client disconnects without following the proper AMQP shutdown protocol:

channel.close(...)

Request a channel close.

This method indicates that the sender wants to close the channel.
This may be due to internal conditions (e.g. a forced shut-down) or due to
an error handling a specific method, i.e. an exception.
When a close is due to an exception, the sender provides the class and method id of
the method which caused the exception.

After sending this method, any received methods except Close and Close-OK MUST be discarded. The response to receiving a Close after sending Close must be to send Close-Ok.

channel.close-ok():

Confirm a channel close.

This method confirms a Channel.Close method and tells the recipient
that it is safe to release resources for the channel.

A peer that detects a socket closure without having received a
Channel.Close-Ok handshake method SHOULD log the error.

This is an issue.

Could you set a custom configuration for BROKER_HEARTBEAT and BROKER_HEARTBEAT_CHECKRATE and check again, e.g.:

BROKER_HEARTBEAT = 10
BROKER_HEARTBEAT_CHECKRATE = 2.0
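
As far as I understand the Celery documentation for these two settings, the worker checks for missed heartbeats at an interval equal to the heartbeat value divided by the check rate. The small sketch below just does that arithmetic for the values above; `heartbeat_check_interval` is a hypothetical helper name, not a Celery function.

```python
def heartbeat_check_interval(heartbeat, checkrate=2.0):
    """Return the seconds between heartbeat checks (heartbeat / checkrate)."""
    if heartbeat <= 0 or checkrate <= 0:
        raise ValueError("heartbeat and checkrate must be positive")
    return heartbeat / float(checkrate)


# With BROKER_HEARTBEAT = 10 and BROKER_HEARTBEAT_CHECKRATE = 2.0
print(heartbeat_check_interval(10, 2.0))  # 5.0
```

So with these values the connection would be checked every 5 seconds, which is well inside the 5-minute idle window described in the question.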