Finding values that are "close" to each other in a numpy array and combining them

Question

I have a numpy array with the following values: [10620.5, 11899., 11879.5, 13017., 11610.5]

import numpy as np
array = np.array([10620.5, 11899, 11879.5, 13017, 11610.5])

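I would like to take the values that are "close" (in this case 11899 and 11879.5), average them, and replace them with a single instance of the new number, so the result is roughly: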

[10620.5, 11889, 13017, 11610.5]

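What counts as "close" should be configurable. Say, a difference of less than 50.

The purpose is to create Spans on a Bokeh plot, where some of the lines end up too close together.

I am super new to Python in general (a few weeks of intense development).

I figure I could sort the values, then somehow look at the neighbour on the left and on the right of each value, do some math on them, and replace a match with the average. But at the moment I have no concrete idea how to do it.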

Answer 1

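Try something like this. I added a few extra steps just to show the flow. The idea is to sort the data, split it into groups of adjacent values, and then decide for each group, based on its spread, whether to average it.

So, as you described, you could group the sorted data into sets of 3 numbers. If the difference between the largest and smallest number in a group is less than 50, take their average; otherwise leave them unchanged. (The example below uses a threshold of 5 on its own sample data.)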

import numpy as np

arr = np.ravel([1, 24, 5.3, 12, 8, 45, 14, 18, 33, 15, 19, 22])
arr.sort()

def reshape_arr(a, n):
    # n is the number of consecutive adjacent items to compare for averaging
    hold = len(a) % n
    if hold != 0:
        container = a[-hold:]  # trailing numbers that do not fill a row are excluded from averaging
        a = a[:-hold].reshape(-1, n)
    else:
        a = a.reshape(-1, n)
        container = None
    return a, container

def get_mean(a, close):
    # close = how close adjacent numbers need to be in order to be averaged together
    my_list = []
    for row in a:
        if row.max() - row.min() > close:
            my_list.extend(row.tolist())  # spread too large: keep the values as they are
        else:
            my_list.append(row.mean())
    return my_list

def final_list(a, c):
    # add any elements held in the container to the final list
    if c is not None:
        a.extend(c.tolist())
    return a

arr, container = reshape_arr(arr, 3)
arr = get_mean(arr, 5)
print(final_list(arr, container))
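One limitation of fixed-size chunks is that two close values can land in different chunks and never be compared. As a possible alternative (not part of the answer above), here is a minimal sketch that groups by gaps instead: sort the values, start a new group wherever the gap to the previous value is at least `close`, and average each group. The function name `merge_close` is made up for this illustration, and the result comes back sorted rather than in the original order.

```python
import numpy as np

def merge_close(values, close):
    """Average runs of sorted values whose neighbouring gaps are below `close`.

    Hypothetical helper for illustration; returns the merged values in sorted order.
    """
    ordered = np.sort(np.asarray(values, dtype=float))
    if ordered.size == 0:
        return ordered
    # A new group starts wherever the gap to the previous value is >= close.
    breaks = np.flatnonzero(np.diff(ordered) >= close) + 1
    groups = np.split(ordered, breaks)
    return np.array([group.mean() for group in groups])

array = np.array([10620.5, 11899, 11879.5, 13017, 11610.5])
merged = merge_close(array, 50)
print(merged)
```

With the array from the question and a threshold of 50, only 11879.5 and 11899 are merged (into 11889.25). Note that this rule chains: a long run of values each within `close` of its neighbour ends up in one group, even if the ends of the run are further apart than `close`.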

Answer 2

You could use Fuzzywuzzy here to rate the closeness ratio between the 2 sets of data.

See the details here: http://jonathansoma.com/lede/algorithms-2017/classes/fuzziness-matplotlib/fuzzing-matching-in-pandas-with-fuzzywuzzy/

Answer 3

Taking Gustavo's answer and adapting it to my needs:

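def reshape_arr(a, close):
    # a is a pandas Series; it is modified in place and also returned
    flag = True
    while flag:  # repeat until a full pass makes no replacement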
        array = a.sort_values().unique()
        l = len(array)
        flag = False
        for i in range(l):
            previous_item = next_item = None
            if i > 0:
                previous_item = array[i - 1]
            if i < (l - 1):
                next_item = array[i + 1]
            if previous_item is not None:
                if abs(array[i] - previous_item) < close:
                    average = (array[i] + previous_item) / 2
                    flag = True
                    #find matching values in a, and replace with the average
                    a.replace(previous_item, value=average, inplace=True)
                    a.replace(array[i], value=average, inplace=True)

            if next_item is not None:
                if abs(next_item - array[i]) < close:
                    flag = True
                    average = (array[i] + next_item) / 2
                    # find matching values in a, and replace with the average
                    a.replace(array[i], value=average, inplace=True)
                    a.replace(next_item, value=average, inplace=True)
    return a

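This does the job if I call it like this:

 candlesticks['support'] = reshape_arr(supres_df['support'], 150)

where candlesticks is the main DataFrame I am using, and supres_df is another DataFrame that I massage before applying it to the main one.

It works, but it is very slow. I am trying to optimize it.

I added the while loop because, after averaging, the averages themselves can end up close enough to be averaged again, so I loop again until no more averaging is needed. This is all newbie work, so if you see something silly, please comment.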
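On the speed issue, one direction that might help is to avoid the repeated `replace` calls and compute every group mean in a single vectorized pass. The sketch below is an assumption-laden alternative, not a drop-in equivalent of the function above: it groups sorted values by gaps (a new group starts where the gap to the previous value is at least `close`), weights duplicates in the mean, and assumes the input contains no NaN. The name `snap_to_group_mean` is invented for this example.

```python
import numpy as np

def snap_to_group_mean(values, close):
    """Replace each value by the mean of its gap-based group, keeping length and order.

    Hypothetical helper for illustration; assumes the input has no NaN values.
    """
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return values
    order = np.argsort(values, kind="stable")
    ordered = values[order]
    # Group id goes up by one wherever the gap to the previous sorted value is >= close.
    group_ids = np.concatenate(([0], np.cumsum(np.diff(ordered) >= close)))
    means = np.bincount(group_ids, weights=ordered) / np.bincount(group_ids)
    result = np.empty_like(values)
    result[order] = means[group_ids]  # scatter the means back to the original positions
    return result

levels = np.array([100.0, 260.0, 105.0, 400.0, 250.0])
snapped = snap_to_group_mean(levels, 50)
print(snapped)
```

Under this grouping rule no second pass is needed: neighbouring groups are separated by a gap of at least `close`, and each group's mean lies between its own smallest and largest value, so two group means can never be closer than `close`. With a pandas Series, something like `snap_to_group_mean(supres_df['support'].to_numpy(), 150)` could be assigned back to the column, though the results can differ from the iterative pairwise averaging above.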

Answer 1
1 2019-07-12 17:54:23

Answer 2
0 2019-07-12 16:22:00

Answer 3
0 Accepted 2019-08-05 19:20:29
