
Python/Pandas: How do I replace specific values of a Pandas Data Frame based on individual id?

I have a long pandas DataFrame that contains a column called 'id' and another column called 'species', among others. I need to change values in the 'species' column based on specific values of the 'id' column.

For example, if the 'id' is '22222222' (a string), I want the 'species' value to change from its current value 'dove' (also a string) to 'hummingbird'. So far I have been using:

df.loc[df["id"] == '22222222', "species"] = 'hummingbird'

Here is a short sample data frame:

import pandas as pd

# Starting dataset
d = {
    'id': ['11111111', '22222222', '33333333', '44444444',
           '55555555', '66666666', '77777777', '88888888'],
    'species': ['dove', 'dove', 'dove', 'hummingbird',
                'hummingbird', 'dove', 'hummingbird', 'dove'],
}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    dove        #wants to replace
2   33333333    dove        #wants to replace
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    dove        #wants to replace        
     
# Expected outcome
d = {
    'id': ['11111111', '22222222', '33333333', '44444444',
           '55555555', '66666666', '77777777', '88888888'],
    'species': ['dove', 'hummingbird', 'hummingbird', 'hummingbird',
                'hummingbird', 'dove', 'hummingbird', 'hummingbird'],
}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    hummingbird #replaced
2   33333333    hummingbird #replaced
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    hummingbird #replaced

This is fine for a small number of rows, but I have to do it for about 1000 rows, each with its own 'id'. I thought I could write a loop and feed it the list of 'id' values, but I honestly do not know where to start.

Thanks in advance!!

And thanks to Scott Boston for pointing me in the right direction to ask better questions!

Use isin, and pass the ids as strings so they match the dtype of the 'id' column:

humming_ids = ['22222222', '33333333', '88888888']
df.loc[df["id"].isin(humming_ids), "species"] = 'hummingbird'
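
For completeness, here is a minimal end-to-end sketch of the isin approach on the sample data from the question. The second part is an assumption that goes beyond what was asked: if each id needed its own replacement value, a dict passed to Series.map might be a reasonable alternative. The names `corrections` and `remapped`, and the species 'sparrow' and 'finch', are made up for illustration.

```python
import pandas as pd

# Sample data from the question; ids are stored as strings.
df = pd.DataFrame({
    "id": ["11111111", "22222222", "33333333", "44444444",
           "55555555", "66666666", "77777777", "88888888"],
    "species": ["dove", "dove", "dove", "hummingbird",
                "hummingbird", "dove", "hummingbird", "dove"],
})

# The ids to fix. They are strings because the 'id' column holds strings.
humming_ids = ["22222222", "33333333", "88888888"]

# Boolean mask: True for every row whose id is in the list.
mask = df["id"].isin(humming_ids)

# Assign the new value to 'species' only where the mask is True.
df.loc[mask, "species"] = "hummingbird"

# Hypothetical variant (not in the question): a different target per id.
# map() yields NaN for ids missing from the dict, so fillna() restores
# the existing species for those rows.
corrections = {"11111111": "sparrow", "66666666": "finch"}
remapped = df.copy()
remapped["species"] = remapped["id"].map(corrections).fillna(remapped["species"])
```

With about 1000 ids, the list could just as well be read from a file or taken from another DataFrame column; the mask and assignment stay the same, and no explicit loop is needed.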


 