
Digitizing value to "floor" bin in Python

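I need to digitize some values so that the returned index is that of the "floor" or "ceiling" bin, whichever is closer.

For example, for bins = numpy.array([0.0, 0.5, 1.0, 1.5, 2.0]) and a value of 0.2 I expect the index to be 0, for a value of 0.26 the returned index should be 1, and so on.

I have the following ugly-looking function that does what I want: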

import numpy

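def get_bin_index(value, bins):
    # Assumes evenly spaced bins; bin_diff is the common spacing.
    bin_diff = bins[1]-bins[0]
    # digitize returns i such that bins[i-1] <= value < bins[i].
    index = numpy.digitize(value, bins)
    # Step back one bin when value is closer to the lower neighbour.
    # Note: bins[index] raises IndexError when value >= bins[-1].
    if bins[index] - value > bin_diff/2.0:
        index -= 1
    return index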

Is there a neater (better, more efficient, or more readable) way to do this?


Edit: including timings (just to satisfy my curiosity!)

In [1]: def get_bin_index(value, bins):
    ...:     bin_diff = bins[1]-bins[0]
    ...:     index = numpy.digitize(value, bins)
    ...:     if bins[index] - value > bin_diff/2.0:
    ...:         index -= 1
    ...:     return index
    ...:

In [2]: def get_bin_index_c(value, bins):
    ...:     return numpy.rint((value-bins[0])/(bins[1]-bins[0]))
    ...:

In [3]: def get_bin_index_mid_digitized(value, bins):
    ...:     return numpy.digitize(value, (bins[1:] + bins[:-1])/2.0)
    ...:

In [4]: bin_halfs = numpy.array([0.0, 0.5, 1.0, 1.5, 2.0])

In [5]: %timeit get_bin_index(0.9, bin_halfs)
The slowest run took 5.71 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 4.93 µs per loop

In [6]: %timeit get_bin_index_c(0.9, bin_halfs)
The slowest run took 14.60 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.34 µs per loop

In [7]: %timeit get_bin_index_mid_digitized(0.9, bin_halfs)
The slowest run took 4.09 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.37 µs per loop

You could simply take the midpoints of the bins and pass those to np.digitize:

np.digitize(value, (bins[1:] + bins[:-1])/2.0)
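To see why this works, here is a small sketch using the bins from the question. The helper name `nearest_bin_index` and the sample values are only illustrative and are not part of the original answer. Digitizing against the midpoints turns "which interval am I in" into "which bin value am I closest to", and it should also work for unevenly spaced bins as long as they are sorted in increasing order.

```python
import numpy as np

bins = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

def nearest_bin_index(value, bins):
    """Index of the bin value closest to `value` (hypothetical helper name)."""
    # Midpoints between neighbouring bin values: [0.25, 0.75, 1.25, 1.75]
    midpoints = (bins[1:] + bins[:-1]) / 2.0
    # digitize returns i such that midpoints[i-1] <= value < midpoints[i],
    # which is the index of the nearest entry in `bins`.
    return np.digitize(value, midpoints)

values = np.array([0.2, 0.26, 0.9, 2.0])
indices = nearest_bin_index(values, bins)
print(indices)  # [0 1 2 4]
```

A value that falls exactly on a midpoint (0.25 here) goes to the upper bin, and values outside the bin range map to the first or last index instead of raising an error.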

If the bin widths are all the same, you can do this in constant time with:

def get_bin_index2(value, bins):
    return numpy.rint((value - bins[0])/(bins[1]-bins[0]))
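Note that numpy.rint returns a float, so the result needs a cast before it can be used as an index. A possible variant that casts to an integer and clamps out-of-range values is sketched below. The name `uniform_bin_index` and the clamping behaviour are my own assumptions, not something the question asked for.

```python
import numpy as np

bins = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

def uniform_bin_index(value, bins):
    """Nearest-bin index for evenly spaced bins (hypothetical helper name)."""
    step = bins[1] - bins[0]
    # Round the fractional bin position to the nearest integer index.
    index = np.rint((np.asarray(value) - bins[0]) / step).astype(int)
    # Assumption: out-of-range values are clamped to the first or last bin.
    return np.clip(index, 0, len(bins) - 1)

print(uniform_bin_index(np.array([0.2, 0.26, 0.9, 2.0, 5.0]), bins))  # [0 1 2 4 4]
```

One difference from the midpoint approach: numpy.rint rounds exact halves to the nearest even integer, so a value sitting exactly on a midpoint may land in the lower bin here, whereas np.digitize puts it in the upper one.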
