
Python - How to change groups of values in one column of pandas dataframe depending on a value in another column?

I have been searching everywhere on stack for this question and answer but I can't seem to find it anywhere.

I have a pandas dataframe which looks like the example below:

product purchase price
credit '
toy cash ' £20
electronics cash ' £50
groceries cash ' £80
gaming cash ' £30
cash '
toy credit ' £20
electronics credit ' £50
groceries credit ' £80
gaming credit ' £30
transfer '
toy cash ' £20
electronics cash ' £50
groceries cash ' £80
gaming cash ' £30

The dataframe above shows what I mean. Essentially, I want each value in the second column (purchase) to be replaced by the value in the first column (product) of the header row at the top of its group.

Edit: To make it easier to understand, I have added a ' symbol next to the values I want to change. In the first group, credit is taken from the product column and replaces cash in the purchase column for every row until the next header, cash, appears in the product column. That header then changes the purchase values of the next four items from credit to cash, and the same happens for transfer.

So for the first group the header value is credit, but the purchase column for that group's items says "cash". Can I create a function which takes the first value at the top of the group and sets the purchase value of every item in that group to it?

And then the same for the second group, where the first item is cash: I want to replace all the credit values in that group with the group's first item, which in this case is cash.

And so on down the list?

Apologies if this is not very clear but if anyone can help solve this I will be extremely grateful. :)

What I would like to see in the output: :)

product purchase price
credit
toy credit £20
electronics credit £50
groceries credit £80
gaming credit £30
cash
toy cash £20
electronics cash £50
groceries cash £80
gaming cash £30
transfer
toy transfer £20
electronics transfer £50
groceries transfer £80
gaming transfer £30

Thank Youuuuu

You can iterate through the rows using the iterrows() method. From there, you can check if a row's columns are empty and save your group name. When you find full rows below, you can write the group name into the appropriate slot.

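import pandas as pd

temp = None  # name of the group currently being filled
for i, row in df.iterrows():
    if pd.isna(row['purchase']):
        # A row with no purchase value is a group header.
        if pd.notna(row['product']):
            temp = row['product']
    elif temp:
        # .loc writes into df itself; df.iloc[i]['purchase'] would only set a copy.
        df.loc[i, 'purchase'] = temp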

I believe this code should get you what you want.
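
If the frame is large, the same fill can probably be done without a Python-level loop. Here is a minimal sketch of that idea using where() and ffill() instead of iterrows(). It assumes the empty cells are real NaN values (not empty strings or the ' marker), and the small frame below is made-up sample data standing in for yours:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question: header rows have a
# product but no purchase or price.
df = pd.DataFrame({
    "product": ["credit", "toy", "electronics", "cash", "toy", "electronics"],
    "purchase": [np.nan, "cash", "cash", np.nan, "credit", "credit"],
    "price": [np.nan, "£20", "£50", np.nan, "£20", "£50"],
})

# Rows whose purchase is missing are treated as the group headers.
is_header = df["purchase"].isna()

# Keep the product only on header rows, then carry it down each group.
group_name = df["product"].where(is_header).ffill()

# Overwrite purchase on the item rows only; header rows stay empty.
df.loc[~is_header, "purchase"] = group_name[~is_header]
```

The header rows are left untouched, so the result has the same shape as the desired output in the question.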

Basically, you want to use groupby to group the rows by their values, then use head to return the first row of each group.

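df = df.groupby(["product", "purchase"]).head(1)

import pandas as pd

temp = None  # name of the group currently being filled
for i, row in df.iterrows():
    if pd.isna(row['purchase']):
        # A row with no purchase value is a group header.
        if pd.notna(row['product']):
            temp = row['product']
    elif temp:
        # .loc writes into df itself; df.iloc[i]['purchase'] would only set a copy.
        df.loc[i, 'purchase'] = temp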

This method works. Please check that your column is named "purchase", not "Purchase": column names are case-sensitive.
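
One way to rule out a casing mismatch is to look at the column labels and normalize them before running the loop. This is only a sketch: the frame below is hypothetical, with headers that deliberately differ in case and spacing from the names the loop expects.

```python
import pandas as pd

# Hypothetical frame whose headers do not match 'product' / 'purchase' exactly.
df = pd.DataFrame({"Product": ["toy"], " Purchase": ["cash"], "Price": ["£20"]})

# Inspect the actual labels first; stray capitals or spaces show up here.
print(list(df.columns))

# Strip surrounding whitespace and lowercase every label.
df.columns = df.columns.str.strip().str.lower()
```

After this, row['purchase'] and row['product'] refer to the intended columns.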

Thanks Leon
