Aggregate contents of a column based on the range of values in another column in Pandas
I am working on aggregating the contents of a dataframe based on the range of values in a given column. My df looks like this:
min max names
1 5 ['a','b']
0 5 ['d']
6 8 ['a','c']
3 4 ['e','a']
The expected output is:
For min=0 and max=5, get the aggregated value, so the names value will be ['a','b','d','e','a']
For min=5 and max=10, get the aggregated value, so the names value will be ['a','d']
Any help is appreciated.
The most intuitive approach would be to filter and then aggregate. To solve your specific problem, I would do this:
>>> import pandas as pd
>>> df = pd.DataFrame({"min": [1, 0, 6, 3],
...                    "max": [5, 5, 8, 4],
...                    "value": [['a','b'], ['d'], ['a','c'], ['e','a']]})
>>> print(df)
   min  max   value
0    1    5  [a, b]
1    0    5     [d]
2    6    8  [a, c]
3    3    4  [e, a]
>>> # Keep rows whose [min, max] interval lies inside [0, 5]; summing lists concatenates them
>>> sum_filtered_values = df[(df["max"]<=5) & (df["min"]>=0)].value.sum()
>>> print(sum_filtered_values)
['a', 'b', 'd', 'e', 'a']
>>> # Same filter for the range [5, 10]
>>> sum_filtered_values = df[(df["max"]<=10) & (df["min"]>=5)].value.sum()
>>> print(sum_filtered_values)
['a', 'c']
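If you need this for several ranges, the filter-then-aggregate step could be wrapped in a small helper. The sketch below is only one way to do it: the function name names_in_range and its parameters are made up for illustration, and it assumes the same column names ("min", "max", "value") as the example above. It uses itertools.chain instead of .sum() so that an empty selection returns an empty list.

```python
from itertools import chain

import pandas as pd


def names_in_range(df, lo, hi, column="value"):
    """Hypothetical helper: concatenate the lists in `column` for rows
    whose [min, max] interval lies entirely inside [lo, hi]."""
    mask = (df["min"] >= lo) & (df["max"] <= hi)
    # chain.from_iterable flattens the selected lists in row order
    return list(chain.from_iterable(df.loc[mask, column]))


df = pd.DataFrame({"min": [1, 0, 6, 3],
                   "max": [5, 5, 8, 4],
                   "value": [['a', 'b'], ['d'], ['a', 'c'], ['e', 'a']]})

result_0_5 = names_in_range(df, 0, 5)
result_5_10 = names_in_range(df, 5, 10)
print(result_0_5)
print(result_5_10)
```

Note that this keeps only rows fully contained in the requested range, which is the same rule as the filter above. If rows that merely overlap the range should count as well, the mask would need to change.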