How to efficiently bin values into overlapping bins using Pandas?

Question

I would like to bin all the values from a column of type float into overlapping bins. The resulting column could be a series of 1-D boolean vectors, one vector for each value in the original column. Each vector contains True for every bin the value falls into and False for the other bins.

For example, if I have four bins [(0, 10), (7, 20), (15, 30), (30, 60)] and the original value is 9.5, the resulting vector should be [True, True, False, False].

I know how to iterate through all the ranges with a custom function using 'apply', but is there a way to perform this binning more efficiently and concisely?

Answer 1

Would a simple list comprehension meet your needs?

Bins = [(0, 10), (7, 20), (15, 30), (30, 60)]
# True for every bin whose closed range [low, high] contains 9.5
Result = [(low <= 9.5 <= high) for low, high in Bins]

If your data is stored in the column data of a pandas DataFrame (df), then you can define the function:

def in_ranges(x, bins):
    # One boolean per bin: True if x lies in the closed range [low, high]
    return [(low <= x <= high) for low, high in bins]

and apply it to the column:

# The column name must be quoted; pandas is assumed to be imported as pd
df['data'].apply(lambda x: pd.Series(in_ranges(x, Bins), index=Bins))
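
Since the question asks for something faster than 'apply', here is a minimal sketch of a vectorized alternative using NumPy broadcasting. It is not part of the original answer. The sample values in df, the lower/upper arrays, and the string column labels are assumptions chosen for illustration; the bins are treated as closed on both ends, matching in_ranges above.

```python
import numpy as np
import pandas as pd

bins = [(0, 10), (7, 20), (15, 30), (30, 60)]
# Hypothetical sample data; only the column name 'data' comes from the answer.
df = pd.DataFrame({"data": [9.5, 18.0, 30.0, 75.0]})

# Split the bin edges into two 1-D arrays of length n_bins.
lower = np.array([lo for lo, _ in bins])
upper = np.array([hi for _, hi in bins])

# Reshape the values to (n_rows, 1) so comparisons broadcast to (n_rows, n_bins).
values = df["data"].to_numpy()[:, None]
mask = (values >= lower) & (values <= upper)

# One boolean column per bin, aligned with the original index.
membership = pd.DataFrame(
    mask,
    index=df.index,
    columns=[f"{lo}-{hi}" for lo, hi in bins],
)
print(membership)
```

Each row of membership is the boolean vector for the corresponding value, so the row for 9.5 is True for the first two bins only. This does all comparisons in one array operation instead of calling a Python function per row, at the cost of materializing an n_rows by n_bins array.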

Answer 1
2 Accepted 2017-05-16 15:34:24
