Find nearest value in numpy array
Is there a NumPy way (for example, a function) to find the nearest value in an array?

Example:

```python
np.find_nearest(array, value)
```
```python
import numpy as np

def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

value = 0.5

print(find_nearest(array, value))
# 0.568743859261
```
If your array is sorted and very large, this is a much faster solution:

```python
import math
import numpy as np

def find_nearest(array, value):
    # `array` must already be sorted in ascending order
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]
```

This scales to very large arrays. You can easily modify the method above to sort inside the function if you cannot assume that the array is already sorted. It is overkill for small arrays, but once they get large it is much faster.
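As a rough illustration of the "sort inside the method" modification mentioned above, here is a minimal sketch. The name `find_nearest_unsorted` is made up for this example, and it assumes a non-empty 1-D input.

```python
import numpy as np

def find_nearest_unsorted(array, value):
    """Return the element of a non-empty 1-D `array` closest to `value`.

    Hypothetical helper: sorts a copy first, so the input need not be sorted.
    """
    sorted_array = np.sort(np.asarray(array))  # O(n log n) copy, input untouched
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx == len(sorted_array):
        # value is larger than every element
        return sorted_array[-1]
    if idx > 0 and abs(value - sorted_array[idx - 1]) < abs(value - sorted_array[idx]):
        return sorted_array[idx - 1]
    return sorted_array[idx]

data = np.array([0.9, 0.1, 0.6, 0.3])
print(find_nearest_unsorted(data, 0.5))
```

Sorting on every call costs more than a single linear scan, so this variant is only likely to pay off when the sorted copy is reused for many lookups.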
With a slight modification, the answer above works with arrays of arbitrary dimension (1D, 2D, 3D, ...):

```python
def find_nearest(a, a0):
    "Element in nd array `a` closest to the scalar value `a0`"
    idx = np.abs(a - a0).argmin()
    return a.flat[idx]
```

Or, written as a single line:

```python
a.flat[np.abs(a - a0).argmin()]
```
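To make the flat-index step concrete, here is a small sketch on a 2-D array. The sample values are arbitrary, and the use of `np.unravel_index` to recover the row and column is an addition for illustration.

```python
import numpy as np

a = np.array([[1.0, 5.0],
              [9.0, 12.0]])
a0 = 8.0

flat_idx = np.abs(a - a0).argmin()             # index into the flattened array
row_col = np.unravel_index(flat_idx, a.shape)  # same position as (row, column)

print(a.flat[flat_idx])  # nearest element
print(row_col)           # where it sits in the 2-D array
```

`argmin` without an `axis` argument always works on the flattened array, which is why `a.flat` is the matching way to index the result.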
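Summary of answer: if one has a sorted array, then the bisection code given below performs the fastest.

First you should clarify what you mean by nearest value. Often one wants the interval in an abscissa, e.g. for array = [0, 0.7, 2.1] and value = 1.95 the answer would be idx = 1. This is the case I suspect you need (otherwise the following can be modified very easily with a follow-up conditional statement once you find the interval). I will note that the optimal way to do this is with bisection, which I provide first; it does not require numpy at all and is faster than using numpy functions because they perform redundant operations. I then provide a timing comparison against the methods presented here by other users.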
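The difference between the interval index and the nearest index can be seen on the example values from the paragraph above. This is only a sketch; it uses `np.searchsorted` for the interval rather than a hand-written bisection.

```python
import numpy as np

array = np.array([0.0, 0.7, 2.1])
value = 1.95

# Interval index j such that array[j] <= value < array[j + 1]
interval_idx = np.searchsorted(array, value, side="right") - 1

# Index of the element with the smallest absolute difference
nearest_idx = np.abs(array - value).argmin()

print(interval_idx, nearest_idx)
```

Here the interval index is 1 (the value lies between 0.7 and 2.1), while the nearest element is 2.1 at index 2.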
Bisection:

```python
def bisection(array, value):
    '''Given an ``array`` and a ``value``, return an index j such that
    ``value`` is between array[j] and array[j+1]. ``array`` must be
    monotonic increasing. j=-1 or j=len(array) is returned to indicate
    that ``value`` is out of range below and above respectively.'''
    n = len(array)
    if value < array[0]:
        return -1
    elif value > array[n-1]:
        return n
    jl = 0      # Initialize lower
    ju = n-1    # and upper limits.
    while ju-jl > 1:            # If we are not yet done,
        jm = (ju+jl) >> 1       # compute a midpoint with a bitshift
        if value >= array[jm]:
            jl = jm             # and replace either the lower limit
        else:
            ju = jm             # or the upper limit, as appropriate.
        # Repeat until the test condition is satisfied.
    if value == array[0]:       # edge cases at bottom
        return 0
    elif value == array[n-1]:   # and top
        return n-1
    else:
        return jl
```
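If the nearest element is wanted rather than the interval, the follow-up conditional mentioned earlier might look like the sketch below. It uses the standard-library `bisect` module instead of the hand-written function, and the name `nearest_index_sorted` is invented for this example. Ties go to the lower index.

```python
import bisect

def nearest_index_sorted(array, value):
    """Index of the element of a sorted, non-empty sequence closest to `value`."""
    j = bisect.bisect_right(array, value) - 1  # interval index: array[j] <= value
    if j < 0:
        return 0                               # value lies below the first element
    if j >= len(array) - 1:
        return len(array) - 1                  # value lies at or above the last element
    # Follow-up conditional: choose the closer end of the interval
    if value - array[j] <= array[j + 1] - value:
        return j
    return j + 1

print(nearest_index_sorted([0, 0.7, 2.1], 1.95))
```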
Now I'll define the code from the other answers; each of them returns an index:

```python
import math
import numpy as np

def find_nearest1(array, value):
    idx, val = min(enumerate(array), key=lambda x: abs(x[1]-value))
    return idx

def find_nearest2(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return indices

def find_nearest3(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.int64(np.subtract.outer(array, values))).argmin(0)
    out = array[indices]
    return indices

def find_nearest4(array, value):
    idx = (np.abs(array-value)).argmin()
    return idx

def find_nearest5(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

def find_nearest6(array, value):
    xi = np.argmin(np.abs(np.ceil(array[None].T - value)), axis=0)
    return xi
```
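Now I'll time the codes. Note that methods 1, 2, 4 and 5 do not give the interval. Methods 1, 2 and 4 round to the nearest point in the array (e.g. 1.5 and above -> 2); method 5 was reported to always round up, although as defined above it also selects the nearer neighbour. Only methods 3 and 6, and of course bisection, give the interval properly.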
```python
# Run in IPython (%timeit is an IPython magic)
array = np.arange(100000)
val = array[50000]+0.55
print( bisection(array,val))
%timeit bisection(array,val)
print( find_nearest1(array,val))
%timeit find_nearest1(array,val)
print( find_nearest2(array,val))
%timeit find_nearest2(array,val)
print( find_nearest3(array,val))
%timeit find_nearest3(array,val)
print( find_nearest4(array,val))
%timeit find_nearest4(array,val)
print( find_nearest5(array,val))
%timeit find_nearest5(array,val)
print( find_nearest6(array,val))
%timeit find_nearest6(array,val)
```

```
(50000, 50000)
100000 loops, best of 3: 4.4 μs per loop
50001
1 loop, best of 3: 180 ms per loop
50001
1000 loops, best of 3: 267 μs per loop
[50000]
1000 loops, best of 3: 390 μs per loop
50001
1000 loops, best of 3: 259 μs per loop
50001
1000 loops, best of 3: 1.21 ms per loop
[50000]
1000 loops, best of 3: 746 μs per loop
```
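For large arrays bisection takes about 4 μs, compared with 259 μs for the next fastest method and 1.21 ms for the slowest NumPy-based one (roughly 60 to 300 times faster); the pure-Python method 1 takes 180 ms. For smaller arrays it is about 2 to 100 times faster.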
Here is an extension to find the nearest vector in an array of vectors.

```python
import numpy as np

def find_nearest_vector(array, value):
    # Euclidean distance from every row of `array` to `value`
    idx = np.linalg.norm(array - value, axis=1).argmin()
    return array[idx]

A = np.random.random((10,2))*100
""" A = array([[ 34.19762933,  43.14534123],
   [ 48.79558706,  47.79243283],
   [ 38.42774411,  84.87155478],
   [ 63.64371943,  50.7722317 ],
   [ 73.56362857,  27.87895698],
   [ 96.67790593,  77.76150486],
   [ 68.86202147,  21.38735169],
   [  5.21796467,  59.17051276],
   [ 82.92389467,  99.90387851],
   [  6.76626539,  30.50661753]])"""

pt = [6, 30]
print(find_nearest_vector(A, pt))
# [ 6.76626539 30.50661753]
```
If you don't want to use numpy, this will do it:

```python
def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]
```
Here is a version that will handle a non-scalar "values" array:

```python
import numpy as np

def find_nearest(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return array[indices]
```
Or a version that returns a numeric type (e.g. int, float) if the input is scalar:

```python
def find_nearest(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    out = array[indices]
    return out if len(out) > 1 else out[0]
```
Here is a version with SciPy for @Ari Onasafari, answering "to find the nearest vector in an array of vectors":

```
In [1]: from scipy import spatial

In [2]: import numpy as np

In [3]: A = np.random.random((10,2))*100

In [4]: A
Out[4]:
array([[ 68.83402637,  38.07632221],
       [ 76.84704074,  24.9395109 ],
       [ 16.26715795,  98.52763827],
       [ 70.99411985,  67.31740151],
       [ 71.72452181,  24.13516764],
       [ 17.22707611,  20.65425362],
       [ 43.85122458,  21.50624882],
       [ 76.71987125,  44.95031274],
       [ 63.77341073,  78.87417774],
       [  8.45828909,  30.18426696]])

In [5]: pt = [6, 30]  # <-- the point to find

In [6]: A[spatial.KDTree(A).query(pt)[1]]  # <-- the nearest point
Out[6]: array([  8.45828909,  30.18426696])

#how it works!
In [7]: distance,index = spatial.KDTree(A).query(pt)

In [8]: distance  # <-- The distances to the nearest neighbors
Out[8]: 2.4651855048258393

In [9]: index  # <-- The locations of the neighbors
Out[9]: 9

#then
In [10]: A[index]
Out[10]: array([  8.45828909,  30.18426696])
```
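For large arrays, the (excellent) answer given by @Demitri is far faster than the answer currently marked as best. I have adapted his exact algorithm in the following two ways:

The function below works whether or not the input array is sorted.

The function below returns the index of the input array corresponding to the closest value, which is somewhat more general.

Note that the function below also handles a specific edge case that would lead to a bug in the original function written by @Demitri. Otherwise, my algorithm is identical to his.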
```python
def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest
```
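The note above does not say which edge case is meant. One candidate, assumed here purely for illustration, is a value larger than every element: `np.searchsorted` then returns an insert position equal to the array length, which is not a valid index. The explicit `idx >= len(array)` branch guards against that.

```python
import numpy as np

sorted_array = np.array([1.0, 2.0, 3.0])
value = 10.0  # larger than every element

idx = np.searchsorted(sorted_array, value, side="left")
print(idx == len(sorted_array))  # idx is one past the last valid index

# Indexing sorted_array[idx] here would raise IndexError, so clamp first.
safe_idx = min(idx, len(sorted_array) - 1)
print(sorted_array[safe_idx])
```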
Here is a vectorized version of unutbu's answer:

```python
import numpy as np
import matplotlib.pyplot as plt

def find_nearest(array, values):
    array = np.asarray(array)

    # the last dim must be 1 to broadcast in (array - values) below.
    values = np.expand_dims(values, axis=-1)

    indices = np.abs(array - values).argmin(axis=-1)

    return array[indices]

image = plt.imread('example_3_band_image.jpg')

print(image.shape)  # should be (nrows, ncols, 3)

# Signed dtype: subtracting two uint8 arrays inside find_nearest would wrap around
quantiles = np.linspace(0, 255, num=2 ** 2, dtype=np.int16)

quantiled_image = find_nearest(quantiles, image)

print(quantiled_image.shape)  # should be (nrows, ncols, 3)
```
Here is a fast vectorized version of @Dimitri's solution for when you have many values to search for:

```python
import numpy as np

# `array` must be sorted in ascending order (np.searchsorted requires it)
def get_closest(array, values):
    # make sure array is a numpy array
    array = np.array(array)

    # get insert positions
    idxs = np.searchsorted(array, values, side="left")

    # find indexes where previous index is closer
    prev_idx_is_less = ((idxs == len(array)) |
                        (np.fabs(values - array[np.maximum(idxs-1, 0)]) <
                         np.fabs(values - array[np.minimum(idxs, len(array)-1)])))
    idxs[prev_idx_is_less] -= 1

    return array[idxs]
```
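For comparison, the same idea can be sketched by clipping the insert positions so that both neighbours always exist. The name `get_closest_clipped` is hypothetical, and the sketch assumes a sorted 1-D `array` with at least two elements. It is a variant written for this illustration, not the answer's own code.

```python
import numpy as np

def get_closest_clipped(array, values):
    """Nearest elements of a sorted 1-D `array` for every entry of `values`."""
    array = np.asarray(array)
    values = np.asarray(values)
    # Insert positions, clipped so that both idxs - 1 and idxs are valid indices
    idxs = np.clip(np.searchsorted(array, values, side="left"), 1, len(array) - 1)
    left = array[idxs - 1]
    right = array[idxs]
    # Pick the left neighbour where it is strictly closer, otherwise the right one
    return np.where(np.abs(values - left) < np.abs(values - right), left, right)

grid = np.array([1.0, 4.0, 9.0])
queries = np.array([[0.0, 2.0],
                    [6.0, 20.0]])
print(get_closest_clipped(grid, queries))
```

The result has the same shape as `queries`, so multi-dimensional inputs need no reshaping.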
Benchmarks

More than 100 times faster than calling the scalar version in a loop:

```
>>> %timeit ar=get_closest(np.linspace(1, 1000, 100), np.random.randint(0, 1050, (1000, 1000)))
139 ms ± 4.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit ar=[find_nearest(np.linspace(1, 1000, 100), value) for value in np.random.randint(0, 1050, 1000*1000)]
took 21.4 seconds
```
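All the answers are useful for gathering the information needed to write efficient code. However, I have written a small Python script to optimize for various cases. The best case is when the provided array is sorted. If one searches for the index of the point nearest to a single specified value, then the bisect module is the fastest, as the timings below show:

```python
import numpy as np
import bisect

xarr = np.random.rand(int(1e7))

srt_ind = xarr.argsort()
xar = xarr.copy()[srt_ind]
xlist = xar.tolist()
bisect.bisect_left(xlist, 0.3)
```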
```
In [63]: %time bisect.bisect_left(xlist, 0.3)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.2 µs
```

```python
np.searchsorted(xar, 0.3, side="left")
```

```
In [64]: %time np.searchsorted(xar, 0.3, side="left")
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 98.9 µs
```

```python
randpts = np.random.rand(1000)
np.searchsorted(xar, randpts, side="left")
```

```
%time np.searchsorted(xar, randpts, side="left")
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.2 ms
```
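If the cost simply multiplied with the number of queries, 1000 scalar calls at roughly 100 µs each would take about 100 ms. The single vectorized call takes 1.2 ms, which is about 83 times faster.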
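The estimate above can be reproduced with a few lines of arithmetic. The sketch uses the measured 98.9 µs rather than the rounded 100 µs, so the ratio comes out marginally lower than 83; the variable names are illustrative only.

```python
single_call_us = 98.9  # wall time of one scalar np.searchsorted call, in microseconds
n_queries = 1000
vectorized_ms = 1.2    # wall time of the single vectorized call, in milliseconds

looped_ms = single_call_us * n_queries / 1000.0  # estimated cost of 1000 scalar calls
speedup = looped_ms / vectorized_ms

print(f"{looped_ms:.1f} ms estimated, speedup about {speedup:.1f}x")
```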
I think the most pythonic way would be:

```python
import numpy as np

num = 65  # Input number
array = np.random.random((10))*100  # Given array
nearest_idx = np.where(abs(array-num)==abs(array-num).min())[0]  # If you want the index of the element of array (array) nearest to the given number (num)
nearest_val = array[abs(array-num)==abs(array-num).min()]  # If you directly want the element of array (array) nearest to the given number (num)
```
This is the basic code. You can wrap it in a function if you want.
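Wrapped as a function, the same approach might look like the sketch below. The name `find_all_nearest` is hypothetical. Unlike `argmin`, the comparison against the minimum distance returns every element that ties for nearest.

```python
import numpy as np

def find_all_nearest(array, num):
    """Return (indices, values) of every element of 1-D `array` tied for closest to `num`."""
    array = np.asarray(array)
    dist = np.abs(array - num)  # compute the distances only once
    idx = np.where(dist == dist.min())[0]
    return idx, array[idx]

idx, vals = find_all_nearest([60, 70, 10], 65)
print(idx, vals)  # 60 and 70 are equally close to 65
```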
This may be helpful for ndarrays:

```python
def find_nearest(X, value):
    return X[np.unravel_index(np.argmin(np.abs(X - value)), X.shape)]
```

```python
import numpy as np

def find_nearest(array, value):
    # Works for 2-D arrays only: returns the first element with the smallest distance
    array = np.array(array)
    z = np.abs(array-value)
    y = np.where(z == z.min())
    m = np.array(y)
    x = m[0,0]
    y = m[1,0]
    near_value = array[x,y]

    return near_value

array = np.array([[60,200,30],[3,30,50],[20,1,-50],[20,-500,11]])
print(array)
value = 0
print(find_nearest(array, value))
```