Change Values in pandas dataframe column

I have a dataframe with several columns. I need to change the values of one column for data normalization, as in the following example:

User_id   
751730951     
751730951
0
163526844
...and so on

I need to replace every value in the column that is not 0 (stored as a string) with something like "is not empty". I have been trying for hours but still cannot change every value that is not 0 into something else. The replace() function does not work well for this. Any good ideas?

EDIT (my solution):

finalResult.loc[finalResult['update_user'] == '0', 'update_user'] = 'empty'
finalResult.loc[finalResult['update_user'] != 'empty', 'update_user'] = 'not empty'

df.loc[df['mycolumn'] != '0', 'mycolumn'] = 'not empty'

or, if the value is an int:

df.loc[df['mycolumn'] != 0, 'mycolumn'] = 'not empty'

df.loc[rows, cols] allows you to get or set a range of values in your DataFrame. The first parameter selects rows; here I'm using a boolean mask to get all rows that don't have a 0 in mycolumn. The second parameter is the column you want to get or set. Since I'm replacing the same column I queried from, it is also mycolumn.

I then simply use the assignment operator to assign the value 'not empty', as you wanted.
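
To make the mask-then-assign step concrete, here is a minimal sketch on data modelled on the question. It assumes the ids are stored as strings (as the question states); the intermediate variable name mask is only for illustration.

```python
import pandas as pd

# Sample data modelled on the question; ids are stored as strings.
df = pd.DataFrame({"mycolumn": ["751730951", "751730951", "0", "163526844"]})

# Boolean mask: True for every row whose value is not the string '0'.
mask = df["mycolumn"] != "0"

# Overwrite only the masked rows, and only in that one column.
df.loc[mask, "mycolumn"] = "not empty"

print(df["mycolumn"].tolist())
# ['not empty', 'not empty', '0', 'not empty']
```

Naming the mask first is optional, but it makes it easy to inspect which rows will be touched before assigning.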

New column containing 'not empty'

If you want a new column to contain the 'not empty' values so that you don't contaminate your original data in mycolumn, you can do:

df.loc[df['mycolumn'] != 0, 'myNewColumnsName'] = 'not empty'
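
One thing worth knowing about the new-column variant: rows that fall outside the mask are never assigned, so they end up as NaN. A small sketch, again assuming string ids as in the question (the 'empty' label used for the fill is my own choice, not part of the answer above):

```python
import pandas as pd

df = pd.DataFrame({"mycolumn": ["751730951", "0", "163526844"]})

# Assigning through .loc to a column that does not exist yet creates it.
df.loc[df["mycolumn"] != "0", "myNewColumnsName"] = "not empty"

# Rows outside the mask were never assigned, so they hold NaN.
print(df["myNewColumnsName"].isna().tolist())
# [False, True, False]

# Optionally give those rows a label too ('empty' is an illustrative choice).
df["myNewColumnsName"] = df["myNewColumnsName"].fillna("empty")
print(df["myNewColumnsName"].tolist())
# ['not empty', 'empty', 'not empty']
```

The original mycolumn is left untouched throughout.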

The simplest option is replace, which rewrites the values equal to '0':

df['User_id'] = df['User_id'].replace('0', 'is not empty')

If 0 is an int:

df['User_id'] = df['User_id'].replace(0, 'is not empty')
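
For reference, a quick sketch of what replace does on the question's sample data, assuming string ids. It matches whole values, so an id that merely contains the character '0' is left alone, and only the entries equal to '0' are rewritten:

```python
import pandas as pd

user_id = pd.Series(["751730951", "751730951", "0", "163526844"], name="User_id")

# replace() rewrites the entries that equal '0'; other ids are untouched,
# even '751730951', which contains a '0' character.
replaced = user_id.replace("0", "is not empty")

print(replaced.tolist())
# ['751730951', '751730951', 'is not empty', '163526844']
```

So replace fits when the values to change are known up front; for "everything except 0" a condition-based approach is the closer match.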

Suppose we have a Series named user_id holding the data from the question. A single line does what you need:

user_id.where(user_id == 0).fillna('is not empty')

I don't like loc very much, since I think it makes the code harder to read.

It might be better than replace because it also handles the opposite case:

user_id.where(user_id != 0).fillna('is empty')
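
As a variation on the same idea, where also accepts the replacement value as a second argument, which avoids the intermediate NaN step, and mask is its mirror image. A minimal sketch, assuming the ids are strings as in the question (with an integer Series, compare against 0 instead of "0"):

```python
import pandas as pd

user_id = pd.Series(["751730951", "751730951", "0", "163526844"], name="User_id")

# where(cond, other): keep values where cond is True, use `other` elsewhere.
not_empty = user_id.where(user_id == "0", "is not empty")

# mask(cond, other) is the mirror image: replace where cond is True.
same = user_id.mask(user_id != "0", "is not empty")

# The opposite case: label the zeros and keep the ids.
empty = user_id.where(user_id != "0", "is empty")

print(not_empty.tolist())
# ['is not empty', 'is not empty', '0', 'is not empty']
print(empty.tolist())
# ['751730951', '751730951', 'is empty', '163526844']
```

Both calls return a new Series, so assign the result back (for example to a DataFrame column) if you want to keep it.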
