Add a new column to a Pandas dataframe with values from a function

Question

I know this is similar to other questions, but I can't find a solution that I can make work.

I have a dataframe that contains grades and looks similar to this:

  subj1 subj2 subj3 subj4
0   A     B     A     B
1   B     B     C     B
2   C     C     B     A

I want to append a GPA score in a new column so that the result is this:

  subj1 subj2 subj3 subj4 GPA
0   A     B     A     B   3.5
1   B     B     C     B   2.8
2   C     C     B     A   2.8

The function I use to calculate the GPA is this:

def calcgpa():
    for row in df.itertuples(index=False):
        tot = 0
        c = 0
        GPA = 0
        for i in range(len(row)):
            if row[i] == "A":
                tot = tot + 4
                c += 1
            elif row[i] == "B":
                tot = tot + 3
                c += 1
            elif row[i] == "C":
                tot = tot + 2
                c += 1
            elif row[i] == "D":
                tot = tot + 1
                c += 1
            else:
                c += 1
        GPA = tot / c
        return GPA

I thought that df["GPA"] = pd.Series(calcgpa()) would work, but it only adds a value to the first row; all the others are NaN. Trying to use pd.apply or pd.assign just gave me an AssertionError.

Is the problem with how the function returns the GPA, or is there a proper syntax I need to use to add the new column?

Answer 1

Assuming you only have the grades A to E (if you have anything else, make sure you replace it with zero first), you can do:

df['GPA'] = df.replace({'A':4,'B':3,'C':2, 'D':1, 'E':0}).mean(1)

df 
  subj1 subj2 subj3 subj4   GPA
0     A     B     A     B  3.50
1     B     B     C     B  2.75
2     C     C     B     A  2.75
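The same mapping idea can also be written with Series.map, which makes the "replace anything else with zero" step explicit. This is only a minimal sketch, assuming the four subject columns from the question; the names GRADE_POINTS, subjects and points are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "subj1": ["A", "B", "C"],
    "subj2": ["B", "B", "C"],
    "subj3": ["A", "C", "B"],
    "subj4": ["B", "B", "A"],
})

# Hypothetical lookup table: grade letter -> grade points.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}

# Assumed list of grade columns, so a later "GPA" column is never averaged in.
subjects = ["subj1", "subj2", "subj3", "subj4"]

# Map every column through the lookup; grades missing from the table
# become NaN, and fillna(0) then counts them as zero points.
points = df[subjects].apply(lambda col: col.map(GRADE_POINTS)).fillna(0)

# Row-wise mean of the grade points.
df["GPA"] = points.mean(axis=1)
print(df)
```

Selecting the subject columns by name means the line can be re-run after the GPA column already exists without the old GPA values leaking into the average.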

Answer 2

If you look at the output of calcgpa(), it is a single float (3.5), not a list of GPAs, which is why your column gets only one value followed by NaN.
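The NaN values come from index alignment: a Series built from one scalar has a single entry labelled 0, and column assignment matches on index labels. A small sketch of that behaviour, using a cut-down two-column dataframe and a variable named single purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({"subj1": ["A", "B", "C"], "subj2": ["B", "B", "C"]})

# A Series built from one scalar holds a single entry with index label 0.
single = pd.Series(3.5)

# Assigning a Series aligns on index labels, so only row 0 receives a
# value; rows 1 and 2 have no matching label and are filled with NaN.
df["GPA"] = single
print(df)
```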

I would suggest storing each GPA value in a list and assigning that list as the column instead. This requires some small changes to your code:

Replace GPA = 0 with GPA = [] to turn it into a list, and move it to the top of the function, outside both for loops. Then change GPA = tot / c to GPA.append(tot / c) so that each row's GPA is appended to the list that will become the new GPA column. Finally, dedent return GPA so that it runs once after the outer loop has finished, rather than inside it on the first row.

Full code:

def calcgpa():
    GPA = []
    for row in df.itertuples(index=False):
        tot = 0
        c = 0
        for i in range(len(row)):
            if row[i] == "A":
                tot = tot + 4
                c += 1
            elif row[i] == "B":
                tot = tot + 3
                c += 1
            elif row[i] == "C":
                tot = tot + 2
                c += 1
            elif row[i] == "D":
                tot = tot + 1
                c += 1
            else:
                c += 1
        GPA.append(tot / c)
    return GPA

You can then assign this to the GPA column like this:

df["GPA"] = calcgpa()

Output:

  subj1 subj2 subj3 subj4   GPA
0     A     B     A     B  3.50
1     B     B     C     B  2.75
2     C     C     B     A  2.75

As posted in the other answer, there are more efficient ways to achieve this, but as your code was close I thought I would amend it to achieve the result.
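Since the question also mentions trying apply, here is one way the per-row calculation might be expressed with DataFrame.apply and axis=1. It is a sketch rather than the approach either answer used; GRADE_POINTS and row_gpa are hypothetical names, and unknown grades score zero to mirror the else branch of the original loop.

```python
import pandas as pd

df = pd.DataFrame({
    "subj1": ["A", "B", "C"],
    "subj2": ["B", "B", "C"],
    "subj3": ["A", "C", "B"],
    "subj4": ["B", "B", "A"],
})

# Hypothetical lookup table: grade letter -> grade points.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}


def row_gpa(row):
    """Return the mean grade points for one row; unknown grades count as 0."""
    return sum(GRADE_POINTS.get(grade, 0) for grade in row) / len(row)


# axis=1 calls row_gpa once per row and collects the results in a Series
# that shares the dataframe's index, so every row receives its own value.
df["GPA"] = df.apply(row_gpa, axis=1)
print(df)
```

Because the function receives one row and returns one number, it no longer depends on a global df or has to build the list of results itself.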
