Python - Add a column to a dataframe containing a value from another row based on a condition

Question

My dataframe looks like this:

+-----+-------+----------+-------+
| No  | Group | refGroup | Value |
+-----+-------+----------+-------+
| 123 | A1    | A1       |   5.0 |
| 123 | B1    | A1       |   7.3 |
| 123 | B2    | A1       |   8.9 |
| 123 | B3    | B1       |   7.9 |
| 465 | A1    | A1       |   1.4 |
| 465 | B1    | A1       |   4.5 |
| 465 | B2    | B1       |   7.3 |
+-----+-------+----------+-------+
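
To make the example reproducible, the table above could be built with pandas roughly like this. This is only a sketch: the values are copied from the table, and the variable name df is the one the answers below use.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "No": [123, 123, 123, 123, 465, 465, 465],
    "Group": ["A1", "B1", "B2", "B3", "A1", "B1", "B2"],
    "refGroup": ["A1", "A1", "A1", "B1", "A1", "A1", "B1"],
    "Value": [5.0, 7.3, 8.9, 7.9, 1.4, 4.5, 7.3],
})

print(df)
```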

Now I need to add another column that contains the difference between Value in the current row and Value in the row that has the same number ( No ) and whose group ( Group ) is the one written in refGroup .

Exception: if refGroup equals Group , then refValue is simply the same as Value .

So the result should be:

+-----+-------+----------+-------+----------+
| No  | Group | refGroup | Value | refValue |
+-----+-------+----------+-------+----------+
| 123 | A1    | A1       |   5.0 |      5.0 |
| 123 | B1    | A1       |   7.3 |      2.3 |
| 123 | B2    | A1       |   8.9 |      3.9 |
| 123 | B3    | B1       |   7.9 |      0.6 |
| 465 | A1    | A1       |   1.4 |      1.4 |
| 465 | B1    | A1       |   4.5 |      3.1 |
| 465 | B2    | B1       |   7.3 |      2.8 |
+-----+-------+----------+-------+----------+

Explanation for the first two rows:

First row: refGroup equals Group -> refValue = Value

Second row: find the row with the same No (123) whose Group matches this row's refGroup (A1), then calculate Value of the current row minus Value of the referenced row (7.3 - 5.0 = 2.3).

I thought I might need to use groupby() and apply(), but how?

I hope my example is detailed enough. If you need any further information, please ask :)

Answer 1

One way is to use a SQL-like technique: a self-join with merge . You merge the dataframe with itself, using left_on and right_on to line up 'refGroup' with 'Group' within the same 'No', then subtract the referenced value from each row's own value:

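import numpy as np

# Self-join: pair each row with the row (same No) whose Group equals its refGroup.
# how='left' keeps every row of df in its original order, so df_out lines up with df.
df_out = df.merge(df,
                  left_on=['No', 'refGroup'],
                  right_on=['No', 'Group'],
                  how='left',
                  suffixes=('', '_ref'))

# Rows that reference themselves keep their own Value; all others get the difference.
df['refValue'] = np.where(df_out['Group'] == df_out['refGroup'],
                          df_out['Value'],
                          df_out['Value'] - df_out['Value_ref'])

df

Output:

    No Group refGroup  Value  refValue
0  123    A1       A1    5.0       5.0
1  123    B1       A1    7.3       2.3
2  123    B2       A1    8.9       3.9
3  123    B3       B1    7.9       0.6
4  465    A1       A1    1.4       1.4
5  465    B1       A1    4.5       3.1
6  465    B2       B1    7.3       2.8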
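
If you want the self-join idea as a reusable piece, it could be wrapped up roughly as below. This is a sketch under assumptions, not part of the original answer: the helper name add_ref_value and the intermediate column Value_ref are made up for illustration, and it assumes each (No, Group) pair occurs at most once. Rows whose refGroup has no matching Group would end up with NaN.

```python
import numpy as np
import pandas as pd


def add_ref_value(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with a refValue column."""
    # Lookup table: one row per (No, Group), renamed so it joins on refGroup.
    ref = df[["No", "Group", "Value"]].rename(
        columns={"Group": "refGroup", "Value": "Value_ref"}
    )
    # how="left" keeps every original row in order; validate raises a
    # MergeError if a (No, Group) pair is duplicated in the lookup table.
    merged = df.merge(ref, on=["No", "refGroup"], how="left", validate="many_to_one")
    out = df.copy()
    # Self-referencing rows keep their Value; others get Value minus Value_ref.
    out["refValue"] = np.where(
        merged["Group"] == merged["refGroup"],
        merged["Value"],
        merged["Value"] - merged["Value_ref"],
    )
    return out


# Sample data from the question.
df = pd.DataFrame({
    "No": [123, 123, 123, 123, 465, 465, 465],
    "Group": ["A1", "B1", "B2", "B3", "A1", "B1", "B2"],
    "refGroup": ["A1", "A1", "A1", "B1", "A1", "A1", "B1"],
    "Value": [5.0, 7.3, 8.9, 7.9, 1.4, 4.5, 7.3],
})
result = add_ref_value(df)
print(result)
```

Building the lookup table with only the three needed columns avoids the extra Group_ref and refGroup_ref columns that a full self-merge produces.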

Answer 2

Using a list comprehension, you can do the following:

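# For each row, look up the Value of the row with the same No whose Group
# equals this row's refGroup; .item() expects exactly one match.
df['refValue'] = [
    row['Value'] - df.loc[(df['No'] == row['No']) & (df['Group'] == row['refGroup']), 'Value'].item()
    if row['refGroup'] != row['Group'] else row['Value']
    for index, row in df.iterrows()
]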
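
Since the question mentions groupby(), here is one possible per-group variant. It is a sketch, not something from the original answers: it loops over the groups of No instead of calling apply(), builds a Group -> Value lookup inside each group, and assumes Group is unique within each No. The names pieces, lookup and ref are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "No": [123, 123, 123, 123, 465, 465, 465],
    "Group": ["A1", "B1", "B2", "B3", "A1", "B1", "B2"],
    "refGroup": ["A1", "A1", "A1", "B1", "A1", "A1", "B1"],
    "Value": [5.0, 7.3, 8.9, 7.9, 1.4, 4.5, 7.3],
})

pieces = []
for _, g in df.groupby("No"):
    # Within one No, map each Group name to its Value.
    lookup = g.set_index("Group")["Value"]
    # Value of the referenced row for every row in this group.
    ref = g["refGroup"].map(lookup)
    # Keep Value where the row references itself, otherwise take the difference.
    pieces.append(g["Value"].where(g["Group"] == g["refGroup"], g["Value"] - ref))

# The pieces keep the original index labels, so assignment aligns row by row.
df["refValue"] = pd.concat(pieces)
print(df)
```

Compared with the row-by-row list comprehension, this does one vectorized lookup per No rather than one boolean filter per row.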

Answer 1: 3, accepted, 2018-06-14 16:16:02

Answer 2: 1, 2018-06-14 16:25:30
