
How can I replace multiple rows simultaneously in a Python dataframe?

I have a dataset with the following unique values in one of its columns.

   df['Gender'].unique()

   array(['Female', 'M', 'Male', 'male', 'm', 'Male-ish', 'maile',
   'Trans-female', 'Cis Female', 'something kinda male?', 'Cis Male',
   'queer/she/they', 'non-binary', 'Make', 'Nah', 'All', 'Enby',
   'fluid', 'Genderqueer', 'Androgyne', 'Agender', 'Guy (-ish) ^_^',
   'male leaning androgynous', 'Male ', 'Man', 'msle', 'Neuter',
   'queer', 'A little about you', 'Malr',
   'ostensibly male, unsure what that really means'])

As you can see, several values are clearly meant to be 'Male' (I am referring to the cases where 'Male' is misspelled, of course). How can I replace these values with 'Male' without calling the replace function ten times? This is the code I have tried:

x=0
while x<=11:
    for i in df['Gender']:
        if i[0:2]=='Ma':
            print('Male')
        elif i[0]=='m':
            print('Male')
    x+=1

However, this only prints "Male" many times; the dataframe itself is not changed.

Edit: I want to convert the following values to 'Male': 'M', 'male', 'm', 'maile', 'Make', 'Man', 'msle', 'Malr', 'Male '

Create a list with all the variants of 'Male':

males_list = ['M', 'male', 'm', 'maile', 'Make', 'Man', 'msle', 'Malr', 'Male ']

And then replace them with:

df.loc[df['Gender'].isin(males_list), 'Gender'] = 'Male'

btw: There is almost always a better solution than looping over the rows in pandas, not just in cases like this.
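For reference, here is a minimal, self-contained sketch of the `isin` approach. The small sample dataframe is an assumption, built from a few of the values listed in the question. It also shows `Series.replace` called once with the whole list, which should give the same result:

```python
import pandas as pd

# Hypothetical sample: a handful of the values from the question.
df = pd.DataFrame(
    {"Gender": ["Female", "M", "male", "Make", "Male ", "queer", "Cis Male"]}
)

males_list = ["M", "male", "m", "maile", "Make", "Man", "msle", "Malr", "Male "]

# Option 1: boolean mask from isin, assigned through .loc.
by_mask = df.copy()
by_mask.loc[by_mask["Gender"].isin(males_list), "Gender"] = "Male"

# Option 2: replace every listed value with a single scalar in one call.
by_replace = df.copy()
by_replace["Gender"] = by_replace["Gender"].replace(males_list, "Male")

print(by_mask["Gender"].tolist())
# ['Female', 'Male', 'Male', 'Male', 'Male', 'queer', 'Cis Male']
```

Both options only touch exact matches, so a value such as 'Cis Male' is left alone unless it is added to the list.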

I would use the map function, as it allows you to apply any custom logic. For instance, based on your code, something like this would do the trick:

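def correct_gender(text):
    # startswith avoids an IndexError on empty strings.
    # Note: a bare 'M' is not matched by either prefix.
    if text.startswith('Ma') or text.startswith('m'):
        return "Male"

    # Leave every other value unchanged.
    return text

df["Gender"] = df["Gender"].map(correct_gender)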
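One thing to keep in mind is that prefix checks like the one above also rewrite values such as 'Male-ish' and do not catch a bare 'M'. As a hedged alternative sketch, the function passed to `map` can compare a stripped, lower-cased copy of each value against an explicit set. The helper name `normalize_gender`, the `MALE_VARIANTS` set, and the sample data are illustrative and not part of the answer:

```python
import pandas as pd

# Illustrative set of lower-cased spellings that should become 'Male'.
MALE_VARIANTS = {"m", "male", "maile", "make", "man", "msle", "malr"}

def normalize_gender(text):
    """Return 'Male' for known variants; leave every other value untouched."""
    # The isinstance check guards against non-string entries such as NaN.
    if isinstance(text, str) and text.strip().lower() in MALE_VARIANTS:
        return "Male"
    return text

# Hypothetical sample drawn from the values in the question.
df = pd.DataFrame({"Gender": ["Female", "M", "Male ", "Male-ish", "msle"]})
df["Gender"] = df["Gender"].map(normalize_gender)

print(df["Gender"].tolist())
# ['Female', 'Male', 'Male', 'Male-ish', 'Male']
```

Stripping and lower-casing first means 'Male ' and 'M' are handled without listing every capitalisation separately.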

If I understand you correctly, you want a more generalized approach. We can use a regex to check whether the value starts with M, or contains the letters ma preceded by a whitespace character, so that we don't catch Female:

  • (?i) : makes the match case-insensitive
  • (?<=\s) : a lookbehind, so ma only matches when it is preceded by a whitespace character

df.loc[df['Gender'].str.contains(r'(?i)^M|(?<=\s)ma'), 'Gender'] = 'Male'

Output:

                Gender
0               Female
1                 Male
2                 Male
3                 Male
4                 Male
5                 Male
6                 Male
7         Trans-female
8           Cis Female
9                 Male
10                Male
11      queer/she/they
12          non-binary
13                Male
14                 Nah
15                 All
16                Enby
17               fluid
18         Genderqueer
19           Androgyne
20             Agender
21      Guy (-ish) ^_^
22                Male
23                Male
24                Male
25                Male
26              Neuter
27               queer
28  A little about you
29                Male
30                Male
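To see which values the pattern selects before overwriting anything, the mask can be inspected on its own. The sketch below uses a small made-up sample drawn from the question's values. Writing the pattern as a raw string and passing `na=False` are additions here, not part of the answer:

```python
import pandas as pd

# Hypothetical sample taken from the values listed in the question.
df = pd.DataFrame(
    {"Gender": ["Female", "Cis Female", "Trans-female", "m", "Cis Male", "Man", "queer"]}
)

# Same pattern as above, as a raw string so \s is not treated as a string escape.
pattern = r"(?i)^M|(?<=\s)ma"

# na=False makes missing values count as "no match" instead of propagating NaN.
mask = df["Gender"].str.contains(pattern, na=False)

# Inspect the rows that would be rewritten.
print(df.loc[mask, "Gender"].tolist())
# ['m', 'Cis Male', 'Man']

df.loc[mask, "Gender"] = "Male"
```

Note that the pattern matches any value that starts with M or m, so it is worth checking the selected rows on the real data before assigning.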

