How to turn a list of tuples into a “histogram” where the bins contain the tuples?
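Suppose I have a list of coordinates (tuples of length n), where n is determined at run time. I essentially want to build an n-dimensional histogram, except that each bin holds not just a count but all the coordinate tuples that fall into it.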
An example of what I want:
Input:
list: [(-0.308, 0.414), (-0.058, -0.279), (0.860, 0.118), (-0.543, -0.093)]
bin_width: 1
Output:
[[[(-0.058, -0.279), (-0.543, -0.093)], [(-0.308, 0.414)]], [[], [(0.860, 0.118)]]]
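One way to read the nesting in this output: the outer list is indexed by the x bin and the inner list by the y bin. Here is a minimal sketch of that mapping, assuming bins start at the floor of each dimension's minimum. The helper name `bin_index` is hypothetical and not part of the question.

```python
from math import floor

def bin_index(point, origin, bin_width):
    # One integer index per dimension, counted from the lower edge `origin`.
    return tuple(int((x - o) // bin_width) for x, o in zip(point, origin))

points = [(-0.308, 0.414), (-0.058, -0.279), (0.860, 0.118), (-0.543, -0.093)]
origin = [floor(min(axis)) for axis in zip(*points)]  # lower edge per dimension
for p in points:
    print(p, bin_index(p, origin, 1))
```

With these inputs, the first tuple maps to index (0, 1), which is why it sits in the second inner list of the first outer list.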
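Update: I now have a solution (see the answer below). If you have a better idea, though, please share it. In particular, it would be nice to turn this approach into a generator rather than a list. The example here is short, but in my intended use the input list may be very large, and I only need to consume the output once.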
I hope I got this right.
Functions:
from math import ceil, floor


def minmax(coordinate_list):
    # Return the (minimum, maximum) value occurring in each dimension.
    return [(min(x), max(x)) for x in zip(*coordinate_list)]


def find_range(min_max_list):
    # For each dimension, find the integer-aligned extent the nested list must cover.
    return [ceil(hi) - floor(lo) for lo, hi in min_max_list]


def find_bin_range(ranges, bin_width):
    # Convert the extents from coordinate units into numbers of bins (at least one).
    return [max(ceil(r / bin_width), 1) for r in ranges]


def build_bins(bin_ranges):
    # Recursively build the nested list structure; the innermost empty lists are the bins.
    if not bin_ranges:
        return []
    return [build_bins(bin_ranges[1:]) for _ in range(bin_ranges[0])]


def access_bin(coordinates, key, bins, bin_width, min_max_list):
    # Descend one dimension per call and append the tuple to the bin it falls into.
    if not key:
        bins.append(coordinates)
        return
    minimum, _ = min_max_list[0]
    i = int((key[0] - floor(minimum)) / bin_width)
    i = min(i, len(bins) - 1)  # a value exactly on the upper edge goes into the last bin
    access_bin(coordinates, key[1:], bins[i], bin_width, min_max_list[1:])


def fill_bins(coordinate_list, bins, bin_width, min_max_list):
    # Fill each bin with the coordinates that belong to it.
    for coordinates in coordinate_list:
        access_bin(coordinates, coordinates, bins, bin_width, min_max_list)
    return bins


def coordinate_list_to_bins(coordinate_list, bin_width):
    # The complete procedure.
    min_max_list = minmax(coordinate_list)
    ranges = find_range(min_max_list)
    bin_ranges = find_bin_range(ranges, bin_width)
    bins = build_bins(bin_ranges)
    return fill_bins(coordinate_list, bins, bin_width, min_max_list)
Usage:
import random
coordinate_list = [(random.uniform(-1, 1), random.uniform(-.5, .5)) for _ in range(4)]
bin_width = 1
print(coordinate_list)
print(coordinate_list_to_bins(coordinate_list, bin_width))
Output (values rounded; the input is random, so each run differs):
[(0.197, 0.278), (0.333, -0.030), (0.363, -0.298), (0.553, -0.286)]
[[[(0.333, -0.030), (0.363, -0.298), (0.553, -0.286)], [(0.197, 0.278)]]]
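Regarding the generator requested in the question's update, here is one possible sketch rather than a drop-in replacement. It assumes a sparse result is acceptable: it yields `(bin_index, tuples)` pairs for the non-empty bins only, and uses absolute indices `floor(x / bin_width)` so the minimum and maximum never have to be computed. The name `iter_bins` is hypothetical.

```python
from collections import defaultdict
from math import floor

def iter_bins(coordinates, bin_width):
    # Group tuples by their integer bin index in a single pass over the input,
    # which may itself be a generator.
    groups = defaultdict(list)
    for point in coordinates:
        index = tuple(floor(x / bin_width) for x in point)
        groups[index].append(point)
    # Yield the non-empty bins one at a time, in index order.
    for index in sorted(groups):
        yield index, groups[index]

data = [(-0.308, 0.414), (-0.058, -0.279), (0.860, 0.118), (-0.543, -0.093)]
for index, members in iter_bins(data, 1):
    print(index, members)
```

The grouping step still holds every tuple in memory once, since a bin is only complete after the whole input has been seen. What this avoids is allocating the full nested list of mostly empty bins, which grows quickly with the number of dimensions.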