Get combinations of values in a DataFrame and apply a function
I want to take each pairwise combination of the groups in a column and apply a function to every pair. What is the easiest way to do this?
Example Data
| name | value |
|------|-------|
| 6A | 1 |
| 6A | 1 |
| 6A | 1 |
| 6B | 3 |
| 6B | 3 |
| 6B | 3 |
| 6C | 7 |
| 6C | 5 |
| 6C | 4 |
The Result I Want
I used sum as the function in this example:
| pair | result |
|-------|--------|
| 6A_6B | 4 |
| 6A_6B | 4 |
| 6A_6B | 4 |
| 6A_6C | 8 |
| 6A_6C | 6 |
| 6A_6C | 5 |
| 6B_6C | 10 |
| 6B_6C | 8 |
| 6B_6C | 7 |
Note
My function takes two pandas.Series as parameters.
For example, with
x = the series of values for "6A"
and
y = the series of values for "6B"
the result for the pair is 6A_6B = sum(x, y), that is, the element-wise sum of the two series.
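To make the note concrete, here is a minimal sketch of calling such a function on one pair. The name `pair_func` is hypothetical and stands in for the real function. It assumes every group has the same number of rows. One thing to watch: pandas aligns Series on their index labels, so the slices need a fresh index before they are combined.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["6A"] * 3 + ["6B"] * 3 + ["6C"] * 3,
    "value": [1, 1, 1, 3, 3, 3, 7, 5, 4],
})

def pair_func(x, y):
    # Hypothetical stand-in for the real function: element-wise sum of two Series
    return x + y

# Slices taken straight from df keep their original row labels (0-2 and 6-8),
# so combining them aligns on labels that never match and yields only NaN.
misaligned = pair_func(df.loc[df["name"] == "6A", "value"],
                       df.loc[df["name"] == "6C", "value"])

# Resetting the index makes both Series share the labels 0, 1, 2.
x = df.loc[df["name"] == "6A", "value"].reset_index(drop=True)
y = df.loc[df["name"] == "6C", "value"].reset_index(drop=True)
result = pair_func(x, y)
print(result.tolist())
```

With the sample data this gives the three 6A_6C values from the table above.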
I find it more straightforward to reshape the data first; after that, it is a simple addition over all pairwise combinations of columns.
import pandas as pd
from itertools import combinations

# Number the rows within each name, then pivot so each name becomes a column
u = (df.assign(idx=df.groupby('name').cumcount() + 1)
       .pivot(index='idx', columns='name', values='value'))
#name  6A  6B  6C
#idx
#1      1   3   7
#2      1   3   5
#3      1   3   4

# Sum each pair of columns row by row and label the result with the pair name
l = []
for items in combinations(u.columns, 2):
    l.append(u.loc[:, list(items)].sum(axis=1).to_frame('result').assign(pair='_'.join(items)))
df = pd.concat(l)
     result   pair
idx
1         4  6A_6B
2         4  6A_6B
3         4  6A_6B
1         8  6A_6C
2         6  6A_6C
3         5  6A_6C
1        10  6B_6C
2         8  6B_6C
3         7  6B_6C
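The same reshape can be wrapped up so that any function of two Series is applied to each pair, which is closer to what the question's note asks for. This is only a sketch under a few assumptions: `apply_pairwise` and `func` are names made up for illustration, every group has the same number of rows, and `func` returns a Series of that same length.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    "name": ["6A"] * 3 + ["6B"] * 3 + ["6C"] * 3,
    "value": [1, 1, 1, 3, 3, 3, 7, 5, 4],
})

def apply_pairwise(frame, func):
    """Apply func(x, y) to every pair of 'name' groups (hypothetical helper)."""
    # One column per name, one row per position within the group
    wide = (frame.assign(idx=frame.groupby("name").cumcount() + 1)
                 .pivot(index="idx", columns="name", values="value"))
    pieces = []
    for left, right in combinations(wide.columns, 2):
        # Both columns share the 'idx' index, so they align row by row
        result = func(wide[left], wide[right])
        pieces.append(pd.DataFrame({"pair": f"{left}_{right}", "result": result}))
    return pd.concat(pieces, ignore_index=True)

out = apply_pairwise(df, lambda x, y: x + y)
print(out)
```

Because the pivoted columns share one index, the alignment issue that raw slices of the original frame would have does not come up here.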
itertools.combinations
Off the top of my head:
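import pandas as pd
from itertools import combinations

# Map each name to its sub-frame
g = dict(tuple(df.groupby('name')))

# For every pair of names, add the values position by position
pd.DataFrame([
    (f'{x}_{y}', a + b)
    for x, y in combinations(g, 2)
    for a, b in zip(g[x]['value'], g[y]['value'])
], columns=df.columns)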
    name  value
0  6A_6B      4
1  6A_6B      4
2  6A_6B      4
3  6A_6C      8
4  6A_6C      6
5  6A_6C      5
6  6B_6C     10
7  6B_6C      8
8  6B_6C      7
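The comprehension above hands scalars to `a + b`. If the function really needs whole Series, as the question's note says, a variant along these lines might work. It is a sketch, not a tested drop-in: `func` is a hypothetical placeholder, groups are assumed to be the same length, and `func` is assumed to return a Series.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    "name": ["6A"] * 3 + ["6B"] * 3 + ["6C"] * 3,
    "value": [1, 1, 1, 3, 3, 3, 7, 5, 4],
})

def func(x, y):
    # Hypothetical placeholder for a function that takes two Series
    return x + y

# Map each name to its values, re-indexed from 0 so the Series align
groups = {name: grp["value"].reset_index(drop=True)
          for name, grp in df.groupby("name")}

# Concatenating a dict of Series uses the keys as an outer index level
out = (pd.concat({f"{x}_{y}": func(groups[x], groups[y])
                  for x, y in combinations(groups, 2)},
                 names=["pair", "row"])
         .rename("result")
         .reset_index(level="pair")
         .reset_index(drop=True))
print(out)
```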