Python/Pandas: How do I replace specific values of a Pandas Data Frame based on individual id?

I have a long Pandas dataset that contains a column called 'id' and another column called 'species', among other columns. I need to change the 'species' column based on specific values of the 'id' column.

For example, if the 'id' is '5555555' (a string), I want the 'species' value to change from its current value 'dove' (also a string) to 'hummingbird'. So far I have been using this method:

df.loc[df["id"] == '5555555', "species"] = 'hummingbird'

Here is a short sample data frame:

import pandas as pd
        
#Starting dataset
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'dove', 'dove', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'dove']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    dove        #wants to replace
2   33333333    dove        #wants to replace
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    dove        #wants to replace        
     
#Expected outcome
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'hummingbird', 'hummingbird', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'hummingbird']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    hummingbird #replaced
2   33333333    hummingbird #replaced
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    hummingbird #replaced

This is fine for a small number of rows, but I have to do it for about 1000 rows, each with its own 'id'. I thought I could write a loop and feed it the list of 'id' values, but I honestly do not know how to even start.

Thanks in advance!!

And thanks to Scott Boston for pointing me in the right direction to ask better questions!

Use isin:

# The 'id' column holds strings, so the lookup list must hold strings too
humming_ids = ['22222222', '33333333', '88888888']
df.loc[df['id'].isin(humming_ids), 'species'] = 'hummingbird'
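As a minimal end-to-end sketch of the isin approach, the snippet below rebuilds the sample data frame from the question and relabels the three rows marked for replacement. The names `mask` and `missing` are illustrative, and the assumption that the id list may arrive as integers (hence the cast to `str`) is mine, not something stated in the question.

```python
import pandas as pd

# Sample data from the question; 'id' is stored as strings.
df = pd.DataFrame({
    'id': ['11111111', '22222222', '33333333', '44444444',
           '55555555', '66666666', '77777777', '88888888'],
    'species': ['dove', 'dove', 'dove', 'hummingbird',
                'hummingbird', 'dove', 'hummingbird', 'dove'],
})

# Hypothetical list of ids to relabel. If it arrives as integers,
# cast to str so the values can match the string 'id' column.
humming_ids = [str(i) for i in [22222222, 33333333, 88888888]]

# Boolean mask: True for every row whose id is in the list.
mask = df['id'].isin(humming_ids)

# Assign the new species only on the matching rows.
df.loc[mask, 'species'] = 'hummingbird'

# Optional sanity check: ids in the list that matched no row.
missing = sorted(set(humming_ids) - set(df['id']))
print(df)
print('ids not found:', missing)
```

With a list of about 1000 ids this is still a single vectorized assignment, so no explicit loop is needed. Checking `missing` is one way to catch typos in the id list, such as an id with a digit dropped.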
