Python: Find nearest value in numpy array

Is there a numpy-thonic way (for example, a function) to find the nearest value in an array?

Example:

np.find_nearest( array, value )

import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

value = 0.5

print(find_nearest(array, value))
# 0.568743859261
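One caveat with the `argmin` approach: if the array contains NaN, `argmin` returns the position of the first NaN rather than the closest finite value. A minimal sketch of a NaN-tolerant variant, assuming `np.nanargmin` is acceptable for your data (the name `find_nearest_ignoring_nan` is hypothetical, not part of the answer above):

```python
import numpy as np

def find_nearest_ignoring_nan(array, value):
    """Return the element of `array` closest to `value`, skipping NaN entries."""
    array = np.asarray(array, dtype=float)
    # np.nanargmin ignores NaNs; it raises ValueError if every entry is NaN.
    idx = np.nanargmin(np.abs(array - value))
    return array[idx]

data = np.array([0.2, np.nan, 0.57, 0.9])
print(find_nearest_ignoring_nan(data, 0.5))   # 0.57
```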


If your array is sorted and very large, this is a much faster solution:

import math
import numpy as np

def find_nearest(array, value):
    # array must be sorted in ascending order
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]

This scales to very large arrays. You can easily modify the above to sort inside the method if you can't assume that the array is already sorted. It is overkill for small arrays, but once they get large this is much faster.


With a slight modification, the answer above works with arrays of arbitrary dimension (1D, 2D, 3D, ...):

def find_nearest(a, a0):
    "Element in nd array `a` closest to the scalar value `a0`"
    idx = np.abs(a - a0).argmin()
    return a.flat[idx]

Or, written as a single line:

a.flat[np.abs(a - a0).argmin()]


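Summary of the answer: if you have a sorted array, then the bisection code given below performs the fastest: roughly 100 to 1000 times faster for large arrays and roughly 2 to 100 times faster for small arrays. It does not require numpy either. If you have an unsorted array and it is large, consider first using an O(n log n) sort and then bisection; if the array is small, method 2 seems the fastest.

First you should clarify what you mean by nearest value. Often one wants the interval on an abscissa, e.g. for array=[0, 0.7, 2.1] and value=1.95 the answer would be idx=1. This is the case I suspect you need (otherwise the following can be modified very easily with a follow-up conditional statement once you find the interval). I will note that the optimal way to do this is with bisection, which I will provide first; note that it does not require numpy at all and is faster than using numpy functions because they perform redundant operations. Then I will provide a timing comparison against the methods presented here by other users.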
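To make the distinction concrete, here is a small sketch using the numbers from the paragraph above. It contrasts the interval index with the index of the nearest point; using `np.searchsorted` for the interval is an assumption of this sketch, not the bisection routine that the answer itself provides.

```python
import numpy as np

grid = np.array([0.0, 0.7, 2.1])   # sorted abscissa from the example above
value = 1.95

# Interval index j such that grid[j] <= value < grid[j+1].
interval_idx = np.searchsorted(grid, value, side="right") - 1

# Index of the closest grid point.
nearest_idx = np.abs(grid - value).argmin()

print(interval_idx, nearest_idx)   # 1 2
```

The value 1.95 lies in the interval that starts at index 1, but the closest grid point is 2.1 at index 2.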

Bisection:

def bisection(array, value):
    '''Given an ``array`` and a ``value``, returns an index j such that ``value`` is between array[j]
    and array[j+1]. ``array`` must be monotonic increasing. j=-1 or j=len(array) is returned
    to indicate that ``value`` is out of range below and above respectively.'''

    n = len(array)
    if value < array[0]:
        return -1
    elif value > array[n-1]:
        return n
    jl = 0      # Initialize lower
    ju = n-1    # and upper limits.
    while ju-jl > 1:            # If we are not yet done,
        jm = (ju+jl) >> 1       # compute a midpoint with a bitshift
        if value >= array[jm]:
            jl = jm             # and replace either the lower limit
        else:
            ju = jm             # or the upper limit, as appropriate.
        # Repeat until the test condition is satisfied.
    if value == array[0]:       # edge cases at bottom
        return 0
    elif value == array[n-1]:   # and top
        return n-1
    else:
        return jl
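If you want the nearest element rather than the interval, the follow-up conditional mentioned earlier could look roughly like the sketch below. It uses the standard-library `bisect` module in place of the hand-written routine so that it stands alone; the function name `nearest_index_sorted` and the tie-breaking rule (ties go to the lower index) are assumptions of this sketch.

```python
import bisect

def nearest_index_sorted(seq, value):
    """Index of the element of the ascending sequence `seq` closest to `value`."""
    if len(seq) == 0:
        raise ValueError("seq must not be empty")
    # Insertion point: every element left of `pos` is strictly less than value.
    pos = bisect.bisect_left(seq, value)
    if pos == 0:
        return 0                   # value is at or below the first element
    if pos == len(seq):
        return len(seq) - 1        # value is above the last element
    # Follow-up conditional: pick the closer of the two neighbours
    # (ties go to the lower index).
    if value - seq[pos - 1] <= seq[pos] - value:
        return pos - 1
    return pos

print(nearest_index_sorted([0.0, 0.7, 2.1], 1.95))   # 2
```

Out-of-range values are clamped to the first or last index here, which differs from the -1 and len(array) sentinels that the bisection routine returns.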

Now I'll define the code from the other answers; each of them returns an index:

import math
import numpy as np

def find_nearest1(array,value):
    idx,val = min(enumerate(array), key=lambda x: abs(x[1]-value))
    return idx

def find_nearest2(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return indices

def find_nearest3(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.int64(np.subtract.outer(array, values))).argmin(0)
    out = array[indices]
    return indices

def find_nearest4(array,value):
    idx = (np.abs(array-value)).argmin()
    return idx


def find_nearest5(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

def find_nearest6(array,value):
    xi = np.argmin(np.abs(np.ceil(array[None].T - value)),axis=0)
    return xi

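Now I'll time the codes. Note that methods 1, 2, 4 and 5 do not give the interval: as written above, they return the index of the nearest point in the array (e.g. 1.55 maps to 2). Only methods 3 and 6, and of course bisection, give the interval properly.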

array = np.arange(100000)
val = array[50000]+0.55
print( bisection(array,val))
%timeit bisection(array,val)
print( find_nearest1(array,val))
%timeit find_nearest1(array,val)
print( find_nearest2(array,val))
%timeit find_nearest2(array,val)
print( find_nearest3(array,val))
%timeit find_nearest3(array,val)
print( find_nearest4(array,val))
%timeit find_nearest4(array,val)
print( find_nearest5(array,val))
%timeit find_nearest5(array,val)
print( find_nearest6(array,val))
%timeit find_nearest6(array,val)

(50000, 50000)
100000 loops, best of 3: 4.4 μs per loop
50001
1 loop, best of 3: 180 ms per loop
50001
1000 loops, best of 3: 267 μs per loop
[50000]
1000 loops, best of 3: 390 μs per loop
50001
1000 loops, best of 3: 259 μs per loop
50001
1000 loops, best of 3: 1.21 ms per loop
[50000]
1000 loops, best of 3: 746 μs per loop

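For large arrays, bisection takes about 4 μs, compared with about 260 μs for the next best method and 1.21 ms for the slowest numpy-based one (method 1, a pure-Python loop, takes 180 ms), so it is roughly 100 to 1000 times faster. For small arrays it is about 2 to 100 times faster.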
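`%timeit` is an IPython magic, so the listing above only runs in IPython or Jupyter. A rough sketch of a comparable measurement in a plain Python script, using the standard `timeit` module, is shown below. The two helper names are hypothetical, only two of the methods are compared, and absolute timings will differ from machine to machine.

```python
import timeit
import numpy as np

array = np.arange(100000)          # sorted input, as in the benchmark above
val = array[50000] + 0.55

def nearest_argmin(a, v):
    # Linear scan: works on unsorted input as well.
    return int(np.abs(a - v).argmin())

def nearest_searchsorted(a, v):
    # Binary search: requires `a` to be sorted in ascending order.
    i = int(np.searchsorted(a, v, side="left"))
    if i > 0 and (i == len(a) or abs(v - a[i - 1]) < abs(v - a[i])):
        return i - 1
    return i

for fn in (nearest_argmin, nearest_searchsorted):
    # Best of 3 repeats of 100 calls each, reported per call in microseconds.
    best = min(timeit.repeat(lambda: fn(array, val), repeat=3, number=100)) / 100
    print(f"{fn.__name__}: index {fn(array, val)}, {best * 1e6:.1f} us per call")
```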


Here is an extension to find the nearest vector in an array of vectors.

import numpy as np

def find_nearest_vector(array, value):
    # Euclidean distance from each row of `array` to `value`
    idx = np.linalg.norm(np.asarray(array) - np.asarray(value), axis=1).argmin()
    return array[idx]

A = np.random.random((10,2))*100
""" A = array([[ 34.19762933,  43.14534123],
   [ 48.79558706,  47.79243283],
   [ 38.42774411,  84.87155478],
   [ 63.64371943,  50.7722317 ],
   [ 73.56362857,  27.87895698],
   [ 96.67790593,  77.76150486],
   [ 68.86202147,  21.38735169],
   [  5.21796467,  59.17051276],
   [ 82.92389467,  99.90387851],
   [  6.76626539,  30.50661753]])"""

pt = [6, 30]
print(find_nearest_vector(A, pt))
# array([  6.76626539,  30.50661753])
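If there are several query points, the same Euclidean-distance idea can be applied to all of them at once with broadcasting. The following is a sketch under that assumption; `find_nearest_vectors` is a hypothetical name, and the intermediate array has shape (n_queries, n_points, n_dims), so it only suits moderately sized inputs.

```python
import numpy as np

def find_nearest_vectors(points, queries):
    """For each row of `queries`, return the closest row of `points` (Euclidean)."""
    points = np.asarray(points, dtype=float)
    queries = np.atleast_2d(np.asarray(queries, dtype=float))
    # diff has shape (n_queries, n_points, n_dims)
    diff = queries[:, None, :] - points[None, :, :]
    # Distance of every query to every point, then the best point per query.
    idx = np.linalg.norm(diff, axis=2).argmin(axis=1)
    return points[idx]

A = np.array([[34.2, 43.1], [5.2, 59.2], [6.8, 30.5]])
result = find_nearest_vectors(A, [[6, 30], [30, 40]])
print(result.tolist())   # [[6.8, 30.5], [34.2, 43.1]]
```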


If you don't want to use numpy, this will do it:

def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]


Here's a version that will handle a non-scalar "values" array:

import numpy as np

def find_nearest(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return array[indices]

Or a version that returns a numeric type (e.g. int, float) if the input is scalar:

def find_nearest(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    out = array[indices]
    return out if len(out) > 1 else out[0]
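Note that `np.subtract.outer(array, values)` builds a temporary matrix with len(array) x len(values) entries, which can become large. One possible workaround, sketched here under the assumption that both inputs are 1-D, is to process `values` in blocks; the function name and the `chunk` parameter are hypothetical.

```python
import numpy as np

def find_nearest_chunked(array, values, chunk=1024):
    """Nearest element of 1-D `array` for each entry of 1-D `values`, in blocks."""
    array = np.asarray(array)
    values = np.atleast_1d(values)
    out = np.empty(values.shape, dtype=array.dtype)
    for start in range(0, len(values), chunk):
        block = values[start:start + chunk]
        # The temporary matrix is only len(array) x len(block).
        idx = np.abs(np.subtract.outer(array, block)).argmin(axis=0)
        out[start:start + chunk] = array[idx]
    return out

print(find_nearest_chunked([1, 5, 10], [0, 4, 7, 11, 6], chunk=2).tolist())
# [1, 5, 5, 10, 5]
```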


Here is a version with SciPy for @Ari Onasafari's answer, "to find the nearest vector in an array of vectors":

In [1]: from scipy import spatial

In [2]: import numpy as np

In [3]: A = np.random.random((10,2))*100

In [4]: A
Out[4]:
array([[ 68.83402637,  38.07632221],
       [ 76.84704074,  24.9395109 ],
       [ 16.26715795,  98.52763827],
       [ 70.99411985,  67.31740151],
       [ 71.72452181,  24.13516764],
       [ 17.22707611,  20.65425362],
       [ 43.85122458,  21.50624882],
       [ 76.71987125,  44.95031274],
       [ 63.77341073,  78.87417774],
       [  8.45828909,  30.18426696]])

In [5]: pt = [6, 30]  # <-- the point to find

In [6]: A[spatial.KDTree(A).query(pt)[1]] # <-- the nearest point
Out[6]: array([  8.45828909,  30.18426696])

#how it works!
In [7]: distance,index = spatial.KDTree(A).query(pt)

In [8]: distance # <-- The distances to the nearest neighbors
Out[8]: 2.4651855048258393

In [9]: index # <-- The locations of the neighbors
Out[9]: 9

#then
In [10]: A[index]
Out[10]: array([  8.45828909,  30.18426696])


For large arrays, the (excellent) answer given by @Demitri is far faster than the answer currently marked as best. I've adapted his exact algorithm in the following ways:

  • The function below works whether or not the input array is sorted.

  • The function below returns the index of the input array corresponding to the closest value, which is somewhat more general.

  • Note that the function below also handles a specific edge case that would lead to a bug in the original function written by @Demitri. Otherwise, my algorithm is identical to his.

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest
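When many values have to be looked up in the same unsorted array, it may be worth sorting once and reusing the result. The sketch below applies the same argsort-plus-searchsorted idea to a whole array of query values and maps the answers back to indices in the original array. The name `find_idx_nearest_vals` is hypothetical, and ties are resolved toward the larger neighbour, as in the scalar function above.

```python
import numpy as np

def find_idx_nearest_vals(array, values):
    """Indices into the (possibly unsorted) 1-D `array` nearest to each of `values`."""
    array = np.asarray(array)
    values = np.asarray(values)
    order = np.argsort(array)                  # sort once
    sorted_array = array[order]
    pos = np.searchsorted(sorted_array, values, side="left")
    # Clamp to valid neighbours on both sides of the insertion point.
    right = np.clip(pos, 0, len(array) - 1)
    left = np.clip(pos - 1, 0, len(array) - 1)
    use_left = np.abs(values - sorted_array[left]) < np.abs(values - sorted_array[right])
    # Map positions in the sorted copy back to indices in the original array.
    return order[np.where(use_left, left, right)]

data = np.array([10.0, 2.0, 7.5, 4.0])
print(find_idx_nearest_vals(data, [3.1, 8.0, 100.0, -1.0]).tolist())   # [3, 2, 0, 1]
```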


Here is a vectorized version of unutbu's answer:

import numpy as np
import matplotlib.pyplot as plt

def find_nearest(array, values):
    array = np.asarray(array)

    # the last dim must be 1 to broadcast in (array - values) below.
    values = np.expand_dims(values, axis=-1)

    indices = np.abs(array - values).argmin(axis=-1)

    return array[indices]


image = plt.imread('example_3_band_image.jpg')

print(image.shape) # should be (nrows, ncols, 3)

quantiles = np.linspace(0, 255, num=2 ** 2, dtype=np.uint8)

quantiled_image = find_nearest(quantiles, image)

print(quantiled_image.shape) # should be (nrows, ncols, 3)

Here is a fast vectorized version of @Demitri's solution for the case where you have many values to search for (values can be a multi-dimensional array):

import numpy as np

# `array` must be sorted in ascending order; `values` may be a multi-dimensional array
def get_closest(array, values):
    # make sure array is a numpy array
    array = np.array(array)

    # get insert positions
    idxs = np.searchsorted(array, values, side="left")

    # find indexes where previous index is closer
    prev_idx_is_less = ((idxs == len(array)) | (np.fabs(values - array[np.maximum(idxs-1, 0)]) < np.fabs(values - array[np.minimum(idxs, len(array)-1)])))
    idxs[prev_idx_is_less] -= 1

    return array[idxs]

Benchmarks

More than 100 times faster than using a for loop with @Demitri's solution:

>>> %timeit ar=get_closest(np.linspace(1, 1000, 100), np.random.randint(0, 1050, (1000, 1000)))
139 ms ± 4.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit ar=[find_nearest(np.linspace(1, 1000, 100), value) for value in np.random.randint(0, 1050, 1000*1000)]
took 21.4 seconds


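All the answers are helpful for gathering the information needed to write efficient code. However, I have written a small Python script to optimize for various cases. The best case is when the provided array is sorted. If you are searching for the index of the point nearest to a single specified value, the bisect module is the most time efficient. When you are searching for the indices corresponding to an array of values, numpy searchsorted is the most efficient.

import numpy as np
import bisect
xarr = np.random.rand(int(1e7))

srt_ind = xarr.argsort()
xar = xarr.copy()[srt_ind]
xlist = xar.tolist()
bisect.bisect_left(xlist, 0.3)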

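In [63]: %time bisect.bisect_left(xlist, 0.3)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.2 μs

np.searchsorted(xar, 0.3, side="left")

In [64]: %time np.searchsorted(xar, 0.3, side="left")
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 98.9 μs

randpts = np.random.rand(1000)
np.searchsorted(xar, randpts, side="left")

%time np.searchsorted(xar, randpts, side="left")
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.2 ms

If we follow the multiplicative rule, 1000 separate numpy lookups should take about 100 ms, which implies that the single vectorized call is about 83 times faster.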


I think the most pythonic way would be:

import numpy as np
num = 65  # Input number
array = np.random.random(10) * 100  # Given array
# Index (or indices, on ties) of the element of array nearest to num
nearest_idx = np.where(abs(array - num) == abs(array - num).min())[0]
# The element(s) of array nearest to num
nearest_val = array[abs(array - num) == abs(array - num).min()]

This is the basic code. You can wrap it in a function if you want.


This may be helpful for ndarrays:

def find_nearest(X, value):
    return X[np.unravel_index(np.argmin(np.abs(X - value)), X.shape)]
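If you also need to know where the nearest element sits in a multi-dimensional array, the same `np.unravel_index` call can return its position. This is a minimal sketch with a hypothetical function name; on ties, `argmin` keeps the first match in row-major order.

```python
import numpy as np

def find_nearest_nd(X, value):
    """Return (index_tuple, element) of the entry of `X` closest to `value`."""
    X = np.asarray(X)
    flat_idx = np.abs(X - value).argmin()        # position in the flattened array
    idx = np.unravel_index(flat_idx, X.shape)    # e.g. (row, col) for 2-D input
    idx = tuple(int(i) for i in idx)             # plain ints for readable output
    return idx, X[idx]

grid = np.array([[60, 200, 30], [3, 30, 50], [20, 1, -50], [20, -500, 11]])
idx, nearest = find_nearest_nd(grid, 0)
print(idx, nearest)   # (2, 1) 1
```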


import numpy as np

def find_nearest(array, value):
    # Works for 2-D arrays: returns the first element closest to value
    array = np.array(array)
    z = np.abs(array - value)
    y = np.where(z == z.min())
    m = np.array(y)
    x = m[0, 0]
    y = m[1, 0]
    near_value = array[x, y]

    return near_value

array = np.array([[60, 200, 30], [3, 30, 50], [20, 1, -50], [20, -500, 11]])
print(array)
value = 0
print(find_nearest(array, value))