Thread polling SQS and adding messages to a Python queue for processing dies
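I have multithreaded code: 3 threads poll SQS for data and add it to a Python queue, and 5 threads take messages from the Python queue, process them, and send them to a backend system.
Here is the code: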
    import Queue  # Python 2 module name, as used in the original code
    import threading
    import time

    from urllib3 import PoolManager  # assumed source of PoolManager

    # sqs_queue (a boto SQS queue handle) and backend_endpoint are assumed
    # to be defined elsewhere; they are not shown in the question.

    python_queue = Queue.Queue()


    class GetDataFromSQS(threading.Thread):
        """Poll SQS and place each message on the Python queue."""

        def __init__(self, python_queue):
            threading.Thread.__init__(self)
            self.python_queue = python_queue

        def run(self):
            while True:
                time.sleep(0.5)  # sleep half a second before querying again
                try:
                    msgs = sqs_queue.get_messages(10)
                    if not msgs:
                        print("sqs is empty now!")
                        continue
                    for msg in msgs:
                        # place each message block from sqs into python queue for processing
                        self.python_queue.put(msg)
                        print("Adding a new message to Queue. Queue size is now %d" % self.python_queue.qsize())
                        # delete from sqs
                        sqs_queue.delete_message(msg)
                except Exception as e:
                    # %s formatting: '"..." + e' raises TypeError inside the handler
                    print("Exception in GetDataFromSQS :: %s" % e)


    class ProcessSQSMsgs(threading.Thread):
        """Take messages from the Python queue and send them to the backend."""

        def __init__(self, python_queue):
            threading.Thread.__init__(self)
            self.python_queue = python_queue
            self.pool_manager = PoolManager(num_pools=6)

        def run(self):
            while True:
                # grabs the message to be parsed from the python queue
                python_queue_msg = self.python_queue.get()
                try:
                    processMsgAndSendToBackend(python_queue_msg, self.pool_manager)
                except Exception as e:
                    print("Error parsing:: %s" % e)
                finally:
                    self.python_queue.task_done()


    def processMsgAndSendToBackend(msg, pool_manager):
        if msg != "":
            ###### All the code related to processing the msg
            for individualValue in processedMsg:
                try:
                    response = pool_manager.urlopen('POST', backend_endpoint, body=individualValue)
                    if response is None:
                        print("Error")
                    else:
                        response.release_conn()
                except Exception as e:
                    print("Exception! Post data to backend: %s" % e)


    def startMyPython():
        # spawn a pool of threads, and pass them queue instance
        for i in range(3):
            sqsThread = GetDataFromSQS(python_queue)
            sqsThread.start()
        for j in range(5):
            parseThread = ProcessSQSMsgs(python_queue)
            # parseThread.setDaemon(True)
            parseThread.start()
        # wait on the queue until everything has been processed
        python_queue.join()
        # python_queue.close() -- should i do this?


    startMyPython()
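A side note on logging exceptions inside worker threads: in Python, concatenating a str with an exception object raises TypeError, so a handler that builds its message that way fails inside the except block and the thread exits. The sketch below illustrates the effect next to a safer formatting. It is written for Python 3, and the names `handler_concat`, `handler_format`, and `results` are hypothetical, not part of the code in the question.

```python
import threading

results = {}


def handler_concat():
    try:
        raise ValueError("boom")
    except Exception as e:
        # str + exception raises TypeError, which escapes the handler
        # and terminates the thread before the assignment happens.
        results["concat"] = "Exception: " + e


def handler_format():
    try:
        raise ValueError("boom")
    except Exception as e:
        # %s formatting calls str(e), so the handler completes normally.
        results["format"] = "Exception: %s" % e


for target in (handler_concat, handler_format):
    t = threading.Thread(target=target)
    t.start()
    t.join()

print(results)
```

Only the `format` entry is recorded; the thread running `handler_concat` ends with a traceback on stderr while the main program keeps going, which is how a polling thread can disappear without stopping the process.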
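Problem:
The 3 Python workers die at random every few days (monitored using top -p -H), and if I kill the process and start the script again, everything works fine. I suspect that the workers that disappear are the 3 GetDataFromSQS threads. Because GetDataFromSQS has died, the other 5 workers, although still running, are always sleeping since there is no data in the Python queue. I am not sure what I am doing wrong here, as I am quite new to Python and followed this tutorial to create the queuing logic and threads: http://www.ibm.com/developerworks/aix/library/au-threadingpython/
Thanks in advance for your help. I hope I have explained my problem clearly.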
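One way to check the suspicion that the GetDataFromSQS threads are the ones dying, rather than inferring it from top, is to give each thread a name and periodically report which ones are no longer alive. A minimal sketch in Python 3, assuming a stand-in `poller` function in place of the real SQS polling loop; `dead_threads`, `poller`, and the thread names are hypothetical.

```python
import threading
import time


def dead_threads(threads):
    """Return the names of the threads that are not alive."""
    return [t.name for t in threads if not t.is_alive()]


def poller(stop_event, fail):
    # Stand-in for GetDataFromSQS.run(); `fail` simulates an unhandled error.
    if fail:
        raise RuntimeError("simulated failure")
    stop_event.wait()


stop = threading.Event()
threads = [
    threading.Thread(target=poller, args=(stop, i == 1), name="sqs-poller-%d" % i)
    for i in range(3)
]
for t in threads:
    t.start()

time.sleep(0.5)  # give the failing thread time to exit
crashed = dead_threads(threads)
print("dead threads:", crashed)

stop.set()  # let the healthy threads finish
for t in threads:
    t.join()
```

In a long-running script the same check could run in the main thread on a timer, logging the names it finds so the failing component is known before the process is restarted.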
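The hanging threads turned out to be related to obtaining a handle to the SQS queue. I use IAM to manage credentials and the boto SDK to connect to SQS.
The root cause was that the boto package reads the auth metadata from AWS, and that read occasionally failed.
The fix was to edit the boto config to increase the number of attempts boto makes for the auth call to AWS.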
[Boto]
metadata_service_num_attempts = 5
(https://groups.google.com/forum/#!topic/boto-users/1yX24WG3g1E)
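The setting above makes boto try the metadata lookup several times before giving up. The general idea, retrying a flaky call a bounded number of times with a pause between attempts, can be sketched in plain Python 3. This is not boto's implementation; `call_with_retries` and `flaky_fetch_credentials` are hypothetical names used only for illustration.

```python
import time


def call_with_retries(func, num_attempts=5, base_delay=0.01):
    """Call func(), making at most num_attempts calls with exponential backoff."""
    for attempt in range(1, num_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == num_attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_fetch_credentials():
    # Hypothetical stand-in for the metadata lookup: fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("metadata service timed out")
    return {"access_key": "example"}


creds = call_with_retries(flaky_fetch_credentials)
print(creds, "after", calls["count"], "attempts")
```

With a single attempt, the first timeout would surface as an error in the polling thread; allowing several attempts lets an occasional failure pass unnoticed.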