Pandas groupby all apply

Question

I've got an involved situation. Let's say I have the following example dataframe of loans:

import pandas as pd

test_df = pd.DataFrame({'name': ['Jack','Jill','John','Jack','Jill'],
                   'date': ['2016-08-08','2016-08-08','2016-08-07','2016-08-08','2016-08-08'],
                   'amount': [1000.0,1500.0,2000.0,2000.0,3000.0],
                   'return_amount': [5000.0,2000.0,3000.0,0.0,0.0],
                   'return_date': ['2017-08-08','2017-08-08','2017-08-07','','2017-08-08']})

test_df.head()

    amount  date        name    return_amount   return_date
0   1000.0  2016-08-08  Jack    5000.0          2017-08-08
1   1500.0  2016-08-08  Jill    2000.0          2017-08-08
2   2000.0  2016-08-07  John    3000.0          2017-08-07
3   2000.0  2016-08-08  Jack    0.0
4   3000.0  2016-08-08  Jill    0.0             2017-08-08

There are a few operations I need to perform after grouping this dataframe by name (grouping loans by person):

1) return_amount needs to be allocated across a person's loans in proportion to each loan's share of that person's total amount.

2) If return_date is missing for ANY loan for a given person, then all of that person's return_dates should be converted to empty strings ''.

I already have a function that I use to allocate the proportional return amount:

def allocate_return_amount(group):
    loan_amount = group['amount']
    return_amount = group['return_amount']
    sum_amount = loan_amount.sum()
    sum_return_amount = return_amount.sum()
    group['allocated_return_amount'] = (loan_amount/sum_amount) * sum_return_amount
    return group

And I use grouped_test_df = grouped_test_df.apply(allocate_return_amount) to apply it.
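For comparison, the same proportional allocation can be sketched without a custom function by using groupby(...).transform('sum'), which broadcasts each person's totals back onto that person's rows. This is only a minimal sketch on the example data, not necessarily how the original poster's pipeline is set up; the variable names sum_amount and sum_return_amount simply mirror the function above.

```python
import pandas as pd

test_df = pd.DataFrame({
    'name': ['Jack', 'Jill', 'John', 'Jack', 'Jill'],
    'amount': [1000.0, 1500.0, 2000.0, 2000.0, 3000.0],
    'return_amount': [5000.0, 2000.0, 3000.0, 0.0, 0.0],
})

# Per-person totals, repeated on every row that belongs to that person.
grouped = test_df.groupby('name')
sum_amount = grouped['amount'].transform('sum')
sum_return_amount = grouped['return_amount'].transform('sum')

# Each loan's share of the person's total amount, times the person's total return.
test_df['allocated_return_amount'] = test_df['amount'] / sum_amount * sum_return_amount
```

Because transform returns a result aligned with the original index, the new column can be assigned directly without reassembling the groups.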

What I am struggling with is the second operation I need to perform: checking whether any of the loans to a person are missing a return_date, and if so, changing all return_dates for that person to ''.

I've found GroupBy.all in the pandas documentation, but I haven't figured out how to use it yet. Does anyone have experience with this?

Since this example might be a bit hard to follow, here's my ideal output for this example:

ideal_test_df.head()

    amount  date        name    return_amount   return_date
0   1000.0  2016-08-08  Jack    0.0             ''
1   1500.0  2016-08-08  Jill    666.66          2017-08-08
2   2000.0  2016-08-07  John    3000.0          2017-08-07
3   2000.0  2016-08-08  Jack    0.0             ''
4   3000.0  2016-08-08  Jill    1333.33         2017-08-08

Hopefully this makes sense, and thank you in advance to any pandas expert who takes the time to help me out!

Answer 1

You can do it by iterating through the groups, testing the condition using any, then setting back to the original dataframe using loc:

test_df = pd.DataFrame({'name': ['Jack','Jill','John','Jack','Jill'],
                   'date': ['2016-08-08','2016-08-08','2016-08-07','2016-08-08','2016-08-08'],
                   'amount': [1000.0,1500.0,2000.0,2000.0,3000.0],
                   'return_amount': [5000.0,2000.0,3000.0,0.0,0.0],
                   'return_date': ['2017-08-08','2017-08-08','2017-08-07','','2017-08-08']})

grouped = test_df.groupby('name')

for name, group in grouped:
    # If any loan in this person's group has no return date, blank them all.
    if (group['return_date'] == '').any():
        test_df.loc[group.index, 'return_date'] = ''
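As an alternative to the explicit loop, the same check can be sketched in vectorized form with transform('any'), which marks every row of a person who has at least one missing date. This is a sketch on the example data only; the names is_missing and person_has_missing are introduced here for illustration.

```python
import pandas as pd

test_df = pd.DataFrame({
    'name': ['Jack', 'Jill', 'John', 'Jack', 'Jill'],
    'return_date': ['2017-08-08', '2017-08-08', '2017-08-07', '', '2017-08-08'],
})

# True on rows whose own return_date is missing.
is_missing = test_df['return_date'] == ''

# True on every row of a person who has at least one missing return_date.
person_has_missing = is_missing.groupby(test_df['name']).transform('any')

# Blank out all return dates for those people.
test_df.loc[person_has_missing, 'return_date'] = ''
```

Here only Jack has a loan without a return date, so both of Jack's rows end up with an empty return_date and the other rows are untouched.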

And if you want to reset return_amount as well, and don't mind the additional overhead, add this line inside the same if block, right after the return_date assignment:

        test_df.loc[group.index, 'return_amount'] = 0
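Since the question asks about GroupBy.all specifically, here is one way it might be used, followed by the proportional allocation, to arrive at the ideal output from the question. Treat it as a sketch under the assumption that a missing date is always the empty string ''; the names has_all_dates, incomplete and share are illustrative, not part of the original code.

```python
import pandas as pd

test_df = pd.DataFrame({
    'name': ['Jack', 'Jill', 'John', 'Jack', 'Jill'],
    'date': ['2016-08-08', '2016-08-08', '2016-08-07', '2016-08-08', '2016-08-08'],
    'amount': [1000.0, 1500.0, 2000.0, 2000.0, 3000.0],
    'return_amount': [5000.0, 2000.0, 3000.0, 0.0, 0.0],
    'return_date': ['2017-08-08', '2017-08-08', '2017-08-07', '', '2017-08-08'],
})

# GroupBy.all gives one boolean per person: True only if every loan has a return_date.
has_all_dates = (test_df['return_date'] != '').groupby(test_df['name']).all()

# Map the per-person flag back onto the rows and reset the incomplete people.
incomplete = ~test_df['name'].map(has_all_dates)
test_df.loc[incomplete, 'return_date'] = ''
test_df.loc[incomplete, 'return_amount'] = 0.0

# Allocate each person's total return in proportion to the loan amounts.
grouped = test_df.groupby('name')
share = test_df['amount'] / grouped['amount'].transform('sum')
test_df['return_amount'] = share * grouped['return_amount'].transform('sum')
```

GroupBy.all reduces each group to a single value, which is why the result has to be mapped back onto the rows by name; transform('all') would skip that step by returning a row-aligned result directly.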

Answer 1: 2 votes, accepted, 2016-08-17 17:32:38
