scikit - random forest regressor - AttributeError: 'Thread' object has no attribute '_children'
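When I set the n_jobs parameter of RandomForestRegressor to a value greater than 1, I get the following error. With n_jobs set to 1, everything works fine.
AttributeError: 'Thread' object has no attribute '_children'
I'm running this code inside a Flask service. Interestingly, the error does not occur when the same code runs outside the Flask service. I have only reproduced it on a freshly installed Ubuntu box; on my Mac it works fine.
There is a thread discussing this, but it does not seem to have arrived at a workaround: 'Thread' object has no attribute '_children' - django + scikit-learn
Any ideas?
Thanks, everyone!
Here is my test code: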
```python
@test.route('/testfun')
def testfun():
    from sklearn.ensemble import RandomForestRegressor
    import numpy as np

    train_data = np.array([[1,2,3], [2,1,3]])
    target_data = np.array([1,1])

    model = RandomForestRegressor(n_jobs=2)
    model.fit(train_data, target_data)
    return "yey"
```
Stacktrace:
```
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vagrant/flask.global-relevance-engine/global_relevance_engine/routes/test.py", line 47, in testfun
    model.fit(train_data, target_data)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/forest.py", line 273, in fit
    for i, t in enumerate(trees))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 574, in __call__
    self._pool = ThreadPool(n_jobs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 685, in __init__
    Pool.__init__(self, processes, initializer, initargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 136, in __init__
    self._repopulate_pool()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 199, in _repopulate_pool
    w.start()
  File "/usr/lib/python2.7/multiprocessing/dummy/__init__.py", line 73, in start
    self._parent._children[self] = None
```
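The problem
This is probably due to a bug in multiprocessing.dummy that existed prior to Python 2.7.5 and 3.3.2.
Solution A - Upgrade Python
See the comments for confirmation that a newer version worked for the OP.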
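Before deciding between upgrading and patching, it may help to check which interpreter the service actually runs on. The following is a minimal sketch: the helper name `predates_children_fix` is made up for illustration, and the 2.7.5 / 3.3.2 thresholds are simply the versions estimated in this answer, not independently verified (a distribution may also backport the fix to an older version number).

```python
import sys


def predates_children_fix(version_info=None):
    """Return True if the version is older than the releases named above.

    Hypothetical helper: the thresholds (2.7.5 for Python 2, 3.3.2 for
    Python 3) are taken from this answer and are an assumption.
    """
    if version_info is None:
        version_info = sys.version_info
    version = tuple(version_info[:3])
    if version[0] == 2:
        return version < (2, 7, 5)
    return version < (3, 3, 2)


# Report on the interpreter that is running this code.
print("Python %d.%d.%d" % tuple(sys.version_info[:3]))
print("Predates the fix: %s" % predates_children_fix())
```

Run it from inside the Flask process (for example in a temporary route), since the service may use a different interpreter than your shell.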
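Solution B - Modify multiprocessing/dummy/__init__.py
If you cannot upgrade but can edit multiprocessing/dummy/__init__.py (/usr/lib/python2.7/multiprocessing/dummy/__init__.py in the traceback above), change the start method of DummyProcess (line 73 in the traceback) as follows:
```python
if hasattr(self._parent, '_children'):  # add this line
    self._parent._children[self] = None  # indent this existing line
```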
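Solution C - Monkey patch
DummyProcess is reached through the following chain:
- sklearn.ensemble.RandomForestRegressor
- Inherits: ForestRegressor
- Inherits: BaseForest
- Created in: sklearn.ensemble.forest
- Which imports: Parallel from sklearn.externals.joblib
- Which imports: ThreadPool from multiprocessing.pool
- Which imports and stores: Process from multiprocessing.dummy
- Which is assigned to: DummyProcess, also in multiprocessing.dummy
Because DummyProcess exists in that chain, it can be patched once RandomForestRegressor has been imported: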
```python
# Let's make it available in our namespace:
from sklearn.ensemble import RandomForestRegressor
from multiprocessing import dummy as __mp_dummy

# Now we can define a replacement and patch DummyProcess:
def __DummyProcess_start_patch(self):  # pulled from an updated version of Python
    assert self._parent is __mp_dummy.current_process()  # modified to avoid further imports
    self._start_called = True
    if hasattr(self._parent, '_children'):
        self._parent._children[self] = None
    __mp_dummy.threading.Thread.start(self)  # modified to avoid further imports
__mp_dummy.DummyProcess.start = __DummyProcess_start_patch
```
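Unless I've missed something, every DummyProcess instance created from this point on will be patched, so the error will not occur.
For anyone who uses sklearn more widely, I think you can do this the other way around so that it applies to all of sklearn rather than focusing on one module: import multiprocessing.dummy and apply the patch before doing any sklearn imports.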
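Original answer:
As I was writing a comment, I realized I may have found your problem: I think your Flask environment is using an older version of Python.
The reason is that in recent versions of Python's multiprocessing, the line that raises the error is guarded by a condition: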
```python
if hasattr(self._parent, '_children'):
    self._parent._children[self] = None
```
It looks like this bug was fixed during the 2.7 series (I think in 2.7.5). Maybe your Flask is running on an older 2.7, or on 2.6?
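The traceback suggests the failure depends on which thread creates the pool: `self._parent` is the calling thread, and here it is a plain `Thread` object without `_children`, presumably a request thread started by the web server. Under that assumption, the problem can be probed without Flask or sklearn by creating a `ThreadPool` from inside an ordinary `threading.Thread`. The helper name `thread_pool_works_in_plain_thread` is hypothetical; this is a diagnostic sketch, not a confirmed reproduction of the OP's setup.

```python
import threading
from multiprocessing.pool import ThreadPool


def thread_pool_works_in_plain_thread():
    """Try to build a ThreadPool from a non-main thread.

    Returns (ok, outcome), where outcome holds either the mapped
    result or the AttributeError that was raised.
    """
    outcome = {}

    def worker():
        try:
            pool = ThreadPool(2)
            try:
                outcome['result'] = pool.map(abs, [-1, -2])
            finally:
                pool.close()
                pool.join()
        except AttributeError as exc:
            # Expected on interpreters that lack the hasattr guard.
            outcome['error'] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return 'error' not in outcome, outcome


ok, outcome = thread_pool_works_in_plain_thread()
print("ThreadPool usable from a plain thread: %s" % ok)
```

If this prints False with the same AttributeError, the interpreter is affected regardless of Flask or sklearn.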
Can you check your environment? If you can't update the interpreter, maybe we can find a way to monkey-patch multiprocessing to keep it from crashing.