
Calculate the mean value in pandas based on various features (columns)

Goal

I am writing a script to analyze a card game. For convenience, the data is stored in an Excel sheet: users type in the details of each game, and a Python script analyzes the returns. Each game involves three rivals (four players in total), and I want to analyze the overall return against a particular player. For example, I want to know how much my dad has won when playing cards with Tom.

Data

The Excel sheet has several feature columns ("date, start_time, end_time, duration, location, Pal1, Pal2, Pal3") and a target column "Return", where positive numbers are gains and negative numbers are losses. The data is read with pandas.

Problem

I have not figured out how to select a particular pal, since he or she may appear in any one of the "pal#" columns. I need to calculate the mean return over the games in which a given pal is involved.

Excel sheet (demo)

[image: demo Excel sheet]

Code

import numpy as np
import pandas as pd

path = 'excel.xlsx'
data_df = pd.read_excel(path)

def people_estimation(raw_data, name):
    data = raw_data
    # Mean return per player in each pal column ('牌友N' = pal N, '收益' = return)
    df1 = data.pivot_table(columns=['牌友1'], values='收益', aggfunc=np.mean)
    df2 = data.pivot_table(columns=['牌友2'], values='收益', aggfunc=np.mean)
    df3 = data.pivot_table(columns=['牌友3'], values='收益', aggfunc=np.mean)
    # Average the three per-column means for the chosen player
    interest = (df1[name] + df2[name] + df3[name])/3
    print("The gain with", name, "is :", interest)

Note

The code above does what I want, but I think there is a better way to do it. Can anyone help? Thanks in advance.

>>> a
   a  b  c
0  2  2  1
1  3  1  2
2  4  1  3
>>> mask = (a['a']==2) | (a['c']==2)
>>> mask
0     True
1     True
2    False
dtype: bool
>>> a[mask]
   a  b  c
0  2  2  1
1  3  1  2
>>> a[mask]['c']
0    1
1    2
Name: c, dtype: int64
>>> a[mask]['c'].mean()
1.5

I think the problem in your code is that each condition of the mask has to be wrapped in parentheses.
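The parentheses matter because of operator precedence: in Python, `|` binds more tightly than `==`, so an unparenthesized `a['a'] == 2 | a['c'] == 2` is parsed as the chained comparison `a['a'] == (2 | a['c']) == 2`. A minimal sketch of the difference, rebuilding the toy frame `a` from the session above (the variable names `mean_c` and `unparenthesized_ok` are only for illustration):

```python
import pandas as pd

a = pd.DataFrame({"a": [2, 3, 4], "b": [2, 1, 1], "c": [1, 2, 3]})

# Each comparison is wrapped in parentheses, so `|` combines two boolean Series.
mask = (a["a"] == 2) | (a["c"] == 2)
mean_c = a.loc[mask, "c"].mean()

# Without parentheses, `|` is evaluated before `==`. The result is a chained
# comparison that asks for the truth value of a Series, which pandas refuses
# (typically with "ValueError: The truth value of a Series is ambiguous").
try:
    a["a"] == 2 | a["c"] == 2
    unparenthesized_ok = True
except (TypeError, ValueError):
    unparenthesized_ok = False
```

Here `a.loc[mask, "c"]` selects the rows and the column in one step, which gives the same values as `a[mask]['c']`.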

data[(data['牌友1'] == 'Tom') | (data['牌友2'] == 'Tom') | (data['牌友3'] == 'Tom')]['收益'].mean()
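The same filter can probably be written without repeating the comparison for every column. A minimal sketch, assuming the pal columns are named `牌友1`, `牌友2`, `牌友3` and the return column `收益` as in the question; the helper name `mean_return_with` and the sample rows are made up for illustration:

```python
import pandas as pd

PAL_COLUMNS = ["牌友1", "牌友2", "牌友3"]  # column names taken from the question


def mean_return_with(data, name, pal_columns=PAL_COLUMNS, return_column="收益"):
    """Mean return over all games in which `name` appears in any pal column."""
    # eq() compares every pal cell with `name`; any(axis=1) flags rows with a match.
    involved = data[pal_columns].eq(name).any(axis=1)
    # The mean of an empty selection is NaN, so an unknown name does not raise.
    return data.loc[involved, return_column].mean()


# Made-up rows, only to show the call.
sample = pd.DataFrame({
    "牌友1": ["Tom", "Ann", "Bob", "Ann"],
    "牌友2": ["Ann", "Tom", "Cat", "Bob"],
    "牌友3": ["Bob", "Cat", "Tom", "Cat"],
    "收益": [10, -4, 6, 8],
})
tom_mean = mean_return_with(sample, "Tom")
```

This averages over games. Averaging the three per-column pivot means instead weights each column equally, however many games fall in it, so the two results can differ when a player appears in the columns unevenly.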
