简体   繁体   中英

Python add / merge rows of a dataframe together based on multiple conditions

Good afternoon, I hope that you are well.

I have an xlsx file in the following format that which is output from a Python function I have been using to parse data:-

在此处输入图像描述

I have loaded this xlsx file into a pandas df in an attempt to achieve the following output:-

在此处输入图像描述

The requirements that I am trying to satisfy are:- For each row in the dataframe, if the the "Application ID" and "Test Phase" column values match, then I would like to add the row values for those column values together and replace the original matched rows with one row containing the summed values.

Where there is match in the column values, the original row should remain in place.

If there any pointers on how to achieve this, it would be much appreciated. I have attempted to achieve this code in the function prior to writing the values to the source xlsx output file however I assumed it would be easier to achieve by working with pandas / numpy.

Many thanks in advance Jimmy

Use groupby_sum :

out = df.groupby(['Application ID', 'Test Phase'], as_index=False).sum()
print(out)

# Output
   Application ID Test Phase  Total Tests   A
0               9        SIT           36  36
1              11        UAT            5   5

Setup:

data = {'Application ID': [9, 9, 11],
        'Test Phase': ['SIT', 'SIT', 'UAT'],
        'Total Tests': [9, 27, 5],
        'A': [9, 27, 5]}
df = pd.DataFrame(data)
print(df)

# Output
   Application ID Test Phase  Total Tests   A
0               9        SIT            9   9
1               9        SIT           27  27
2              11        UAT            5   5

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM