Chunkize warning while installing gensim
I installed gensim in Python (via pip). After the installation finished, I got the following warning:
C:\Python27\lib\site-packages\gensim\utils.py:855: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
How can I fix this?
Because of this warning, I cannot import word2vec from gensim.models.
I have the following configuration: Python 2.7, gensim-0.13.4.1, numpy-1.11.3, scipy-0.18.1, pattern-2.6.
You can suppress the message by running this code before importing gensim:
    import warnings
    warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

    import gensim
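If you would rather not change the warning filters for the whole program, the suppression can be limited to a single block. The following is only a minimal sketch using the standard `warnings` module: `noisy_setup` is a hypothetical stand-in for a library that warns when it is loaded, and `call_quietly` is an illustrative helper, not part of gensim.

```python
import warnings


def noisy_setup():
    # Hypothetical stand-in for a library that emits a UserWarning on load.
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial",
                  UserWarning)
    return "ready"


def call_quietly(func):
    """Call func with UserWarning ignored, then restore the previous filters."""
    with warnings.catch_warnings():
        # This filter only lives until the end of the with-block.
        warnings.simplefilter("ignore", category=UserWarning)
        return func()


result = call_quietly(noisy_setup)
```

Outside the `with` block the original filters are back in place, so other UserWarnings in your program are still reported.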
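I don't think this is a big problem. Gensim is just letting you know that it will alias chunkize to a different function because of the operating system you are using.
Take a look at this code from gensim.utils: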
    if os.name == 'nt':
        logger.info("detected Windows; aliasing chunkize to chunkize_serial")

        def chunkize(corpus, chunksize, maxsize=0, as_numpy=False):
            for chunk in chunkize_serial(corpus, chunksize, as_numpy=as_numpy):
                yield chunk
    else:
        def chunkize(corpus, chunksize, maxsize=0, as_numpy=False):
            """
            Split a stream of values into smaller chunks.
            Each chunk is of length `chunksize`, except the last one which may be smaller.
            A once-only input stream (`corpus` from a generator) is ok, chunking is done
            efficiently via itertools.

            If `maxsize > 1`, don't wait idly in between successive chunk `yields`, but
            rather keep filling a short queue (of size at most `maxsize`) with forthcoming
            chunks in advance. This is realized by starting a separate process, and is
            meant to reduce I/O delays, which can be significant when `corpus` comes from
            a slow medium (like harddisk).

            If `maxsize==0`, don't fool around with parallelism and simply yield the chunksize
            via `chunkize_serial()` (no I/O optimizations).

            >>> for chunk in chunkize(range(10), 4): print(chunk)
            [0, 1, 2, 3]
            [4, 5, 6, 7]
            [8, 9]

            """
            assert chunksize > 0

            if maxsize > 0:
                q = multiprocessing.Queue(maxsize=maxsize)
                worker = InputQueue(q, corpus, chunksize, maxsize=maxsize, as_numpy=as_numpy)
                worker.daemon = True
                worker.start()
                while True:
                    chunk = [q.get(block=True)]
                    if chunk[0] is None:
                        break
                    yield chunk.pop()
            else:
                for chunk in chunkize_serial(corpus, chunksize, as_numpy=as_numpy):
                    yield chunk
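To see why the serial fallback gives the same chunks, here is a rough sketch of serial chunking with `itertools.islice`. The name `chunk_serially` is made up for illustration; this is not gensim's actual `chunkize_serial`, only an approximation of the behaviour described in the docstring above.

```python
import itertools


def chunk_serially(iterable, chunksize):
    """Yield successive lists of at most `chunksize` items from `iterable`."""
    assert chunksize > 0
    it = iter(iterable)  # works for one-pass generators as well
    while True:
        chunk = list(itertools.islice(it, chunksize))
        if not chunk:
            # Input exhausted: stop without yielding an empty chunk.
            break
        yield chunk


chunks = list(chunk_serially(range(10), 4))
# chunks == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The only thing given up on Windows is the background process that prefetches chunks into a queue; the chunks themselves are the same.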