Pandas: two dataframes, want the average of the element-wise divisions between each pair of groups
I have two dataframes like this:
import pandas as pd

dataA = [["A1", "t1", 5], ["A1", "t2", 8], ["A1", "t3", 7],
         ["A1", "t4", 4], ["A1", "t5", 2], ["A1", "t6", 2],
         ["A2", "t1", 15], ["A2", "t2", 6], ["A2", "t3", 1],
         ["A2", "t4", 11], ["A2", "t5", 12], ["A2", "t6", 7],
         ["A3", "t1", 12], ["A3", "t2", 8], ["A3", "t3", 3],
         ["A3", "t4", 7], ["A3", "t5", 15], ["A3", "t6", 14]]
dataB = [["B1", "t1", 2], ["B1", "t2", 9], ["B1", "t3", 17],
         ["B1", "t4", 14], ["B1", "t5", 32], ["B1", "t6", 3],
         ["B2", "t1", 44], ["B2", "t2", 36], ["B2", "t3", 51],
         ["B2", "t4", 81], ["B2", "t5", 82]]
data1 = pd.DataFrame(data=dataA, columns=["An", "colA", "Val"])
data2 = pd.DataFrame(data=dataB, columns=["Bm", "colA", "Val"])
How can I get this result:
| GroupA | GroupB | result |
|--------|--------|--------|
| A1     | B1     | val_11 |
| A1     | B2     | val_12 |
| A2     | B1     | val_21 |
| A2     | B2     | val_22 |
| A3     | B1     | val_31 |
| A3     | B2     | val_32 |
| ...    | ...    | ...    |
| An     | Bm     | val_nm |
val_nm is calculated as follows. For val_11, divide the values of A1 by the values of B1 element by element, matching on colA. Wherever a quotient is greater than 1, take its reciprocal. Then take the mean of the resulting values. Because of the reciprocal step, the result is the same whether A1 is divided by B1 or B1 is divided by A1.

It may be necessary to define a function to calculate val. Every Val is greater than 0, so there is no division by zero.

Taking val_11 as an example:
A1 = [5, 8, 7, 4, 2, 2]
B1 = [2, 9, 17, 14, 32, 3]
val_11 = avg(A1/B1)
       = avg(2/5, 8/9, 7/17, 4/14, 2/32, 2/3)    (5/2 > 1, so 2/5 is used instead)
       = 0.4525
So whether it is A1/B1 or B1/A1, the result is the same.
Please help me calculate the result.
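As a quick check of the arithmetic above, here is a minimal NumPy sketch for the A1/B1 pair only. The variable names (`a1`, `b1`, `ratios`) are illustrative and not part of the original post. It uses `np.minimum(x/y, y/x)`, which is equivalent to taking the reciprocal whenever the quotient exceeds 1 when all values are positive.

```python
import numpy as np

# Values copied from the question for groups A1 and B1 (t1..t6).
a1 = np.array([5, 8, 7, 4, 2, 2], dtype=float)
b1 = np.array([2, 9, 17, 14, 32, 3], dtype=float)

# The smaller of x/y and y/x is always <= 1, which is the same as
# "take the reciprocal when the quotient is greater than 1".
ratios = np.minimum(a1 / b1, b1 / a1)
val_11 = ratios.mean()

# Swapping the operands gives the same value, so the measure is symmetric.
val_11_swapped = np.minimum(b1 / a1, a1 / b1).mean()

print(round(val_11, 4), round(val_11_swapped, 4))
```

Both printed numbers should agree with the 0.4525 worked out by hand.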
Taking the straight definition of what you want to calculate:
- pivot() to create the tables
- merge() on a synthetic column foo to do a Cartesian product between the tables

import numpy as np
import pandas as pd

def meanofdiv(dfa):
    # after the merge, value columns from data1 end in "_A" and those from data2 in "_B"
    a = dfa.loc[:, [c for c in dfa.columns if "_A" in c]].values
    b = dfa.loc[:, [c for c in dfa.columns if "_B" in c]].values
    # element-wise ratio, inverted where it exceeds 1, then averaged per row
    return np.where((a / b) > 1, b / a, a / b).mean(axis=1)

# pivot key/val pair data to tables
# Cartesian product of tables
# simple calculation of columns from A and columns from B
dfr = pd.merge(
    data1.pivot(index="An", columns="colA", values="Val").reset_index().assign(foo=1),
    data2.pivot(index="Bm", columns="colA", values="Val").reset_index().assign(foo=1),
    on="foo",
    suffixes=("_A", "_B")
).assign(resname=lambda dfa: dfa["An"] + dfa["Bm"],
         res=meanofdiv)
dfr.loc[:, ["An", "Bm", "res"]]
| | An | Bm | res |
|---|---|---|---|
| 0 | A1 | B1 | 0.452589 |
| 1 | A1 | B2 | 0.202259 |
| 2 | A2 | B1 | 0.408018 |
| 3 | A2 | B2 | 0.206316 |
| 4 | A3 | B1 | 0.40251 |
| 5 | A3 | B2 | 0.172901 |
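The merge on a constant column is what pairs every A group with every B group. A small sketch with hypothetical toy frames (`left`, `right` and `pairs` are names invented for this illustration) shows the effect in isolation:

```python
import pandas as pd

# Hypothetical toy frames, used only to show the shape of the join.
left = pd.DataFrame({"An": ["A1", "A2", "A3"]})
right = pd.DataFrame({"Bm": ["B1", "B2"]})

# Every row carries the same key value, so merging on it matches each
# left row with each right row: 3 x 2 = 6 combinations.
pairs = pd.merge(left.assign(foo=1), right.assign(foo=1), on="foo").drop(columns="foo")

print(pairs)
```

pandas 1.2 and later also accept `how="cross"` in `merge`, which produces the same Cartesian product without the helper column.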
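If some groups have no value for some times, the pivoted table contains NaN and the vectorised mean above returns NaN for those pairs. With the sample data exactly as posted, B2 has no t6 value, so this applies to every B2 pair (the B2 figures in the output above appear to come from data that still had a t6 value for B2). In that case, use apply(axis=1) and drop the missing positions row by row: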
def meanofdiv(dfa):
    # apply(axis=1) passes one row as a Series; turn it back into a one-row frame
    dfa = dfa.to_frame().T
    a = dfa.loc[:, [c for c in dfa.columns if "_A" in c]].astype(float).values[0]
    b = dfa.loc[:, [c for c in dfa.columns if "_B" in c]].astype(float).values[0]
    # keep only the positions where the B value is present
    a = a[~np.isnan(b)]
    b = b[~np.isnan(b)]
    return np.where((a / b) > 1, b / a, a / b).mean()

# pivot key/val pair data to tables
# Cartesian product of tables
# simple calculation of columns from A and columns from B
dfr = pd.merge(
    data1.pivot(index="An", columns="colA", values="Val").reset_index().assign(foo=1),
    data2.pivot(index="Bm", columns="colA", values="Val").reset_index().assign(foo=1),
    on="foo",
    suffixes=("_A", "_B")
).assign(resname=lambda dfa: dfa["An"] + dfa["Bm"],
         res=lambda dfa: dfa.apply(meanofdiv, axis=1))
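A possible alternative, not taken from the answer above, is to skip the pivot and join the two long tables directly on colA, then group by the pair of labels. This is only a sketch under the assumption that a time missing from either group should simply be left out of the mean; the helper `to_long` and the dictionaries `a_vals` and `b_vals` are hypothetical names used to rebuild the sample data compactly.

```python
import numpy as np
import pandas as pd

# Sample values from the question, per group in t1..t6 order (B2 has no t6).
a_vals = {"A1": [5, 8, 7, 4, 2, 2], "A2": [15, 6, 1, 11, 12, 7], "A3": [12, 8, 3, 7, 15, 14]}
b_vals = {"B1": [2, 9, 17, 14, 32, 3], "B2": [44, 36, 51, 81, 82]}

def to_long(groups, name):
    """Hypothetical helper: rebuild the long (group, colA, Val) layout."""
    rows = [(g, f"t{i + 1}", v) for g, vals in groups.items() for i, v in enumerate(vals)]
    return pd.DataFrame(rows, columns=[name, "colA", "Val"])

data1 = to_long(a_vals, "An")
data2 = to_long(b_vals, "Bm")

# Inner join on the time key: each A row meets each B row with the same colA.
# Times present in only one group produce no row, so no NaN handling is needed.
pairs = data1.merge(data2, on="colA", suffixes=("_A", "_B"))

# Symmetric ratio: the smaller of a/b and b/a.
ratio = pairs["Val_A"] / pairs["Val_B"]
pairs["ratio"] = np.minimum(ratio, 1 / ratio)

res = (
    pairs.groupby(["An", "Bm"], as_index=False)["ratio"]
    .mean()
    .rename(columns={"ratio": "res"})
)
print(res)
```

For B2 pairs this averages over the five times B2 actually has, which is the same treatment as dropping the NaN positions in the apply(axis=1) version.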