Change values in a pandas DataFrame column
I have a dataframe with several columns. I need to change the values of one column for data normalization, as in the following example:
User_id
751730951
751730951
0
163526844
...and so on
I need to replace every value in the column that is not 0 (stored as a string) with something like "is not empty". I have been trying for hours but still cannot change every value that is not 0 into something else. The replace() function does not work well for this. Any good ideas?
EDIT (my solution):
finalResult.loc[finalResult['update_user'] == '0', 'update_user'] = 'empty'
finalResult.loc[finalResult['update_user'] != 'empty', 'update_user'] = 'not empty'
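For comparison, the two assignments above can probably be collapsed into a single pass with numpy.where. This is a sketch of a different technique, not the method used in the edit, and the sample frame is a hypothetical stand-in for finalResult with the IDs stored as strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for finalResult; values are strings, as in the question.
finalResult = pd.DataFrame(
    {"update_user": ["751730951", "751730951", "0", "163526844"]}
)

# One-pass alternative (not the method used above): choose a label per row.
labels = np.where(finalResult["update_user"] == "0", "empty", "not empty")

# The two-step .loc version from the edit, applied to a copy for comparison.
two_step = finalResult.copy()
two_step.loc[two_step["update_user"] == "0", "update_user"] = "empty"
two_step.loc[two_step["update_user"] != "empty", "update_user"] = "not empty"

print(labels.tolist())
print(two_step["update_user"].tolist())
```

Both print statements should show the same labels; assigning labels back to the column would overwrite it in the same way as the two .loc lines.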
df.loc[df['mycolumn'] != '0', 'mycolumn'] = 'not empty'
or, if the value is an int:
df.loc[df['mycolumn'] != 0, 'mycolumn'] = 'not empty'
df.loc[rows, cols] allows you to get or set a range of values in your DataFrame. The first parameter selects the rows; here I'm using a boolean mask to get all rows that don't have a 0 in mycolumn. The second parameter is the column you want to get or set. Since I'm replacing the same column I queried from, it is also mycolumn.
I then simply use the assignment operator to assign the value 'not empty', as you wanted.
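A minimal, self-contained sketch of the mask-and-assign pattern described above may help. The sample data is hypothetical, modelled on the question, and assumes the IDs are stored as strings; compare against the integer 0 instead if the column is numeric.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; IDs are stored as strings.
df = pd.DataFrame({"mycolumn": ["751730951", "751730951", "0", "163526844"]})

# Boolean mask: True for every row whose value is not the string "0".
mask = df["mycolumn"] != "0"

# Rows come from the mask, the column is named explicitly, and the
# assignment overwrites only the selected cells.
df.loc[mask, "mycolumn"] = "not empty"

print(df["mycolumn"].tolist())
```

Naming the mask separately is optional, but it makes the row selection easier to inspect before anything is overwritten.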
If you want a new column to contain 'not empty', so that you're not contaminating your original data in mycolumn, you can do:
df.loc[df['mycolumn'] != 0, 'myNewColumnsName'] = 'not empty'
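As a rough illustration of what that assignment leaves behind, the sketch below uses hypothetical integer data. The rows that do not match the mask are expected to hold a missing value in the new column, while mycolumn itself stays unchanged.

```python
import pandas as pd

# Hypothetical sample data; here the IDs are integers.
df = pd.DataFrame({"mycolumn": [751730951, 751730951, 0, 163526844]})

# Assigning through .loc to a column that does not exist yet creates it.
df.loc[df["mycolumn"] != 0, "myNewColumnsName"] = "not empty"

# Rows excluded by the mask are left as missing values in the new column.
print(df["myNewColumnsName"].isna().tolist())

# The original column is untouched.
print(df["mycolumn"].tolist())
```

If a label is wanted on the remaining rows as well, the missing values can be filled afterwards with fillna.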
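The simplest approach is to use replace. Note that replace acts on the values that match its first argument, so this relabels the '0' entries themselves: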
df['User_id'] = df['User_id'].replace('0', 'is not empty')
If 0 is an int:
df['User_id'] = df['User_id'].replace(0, 'is not empty')
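To check which rows replace actually touches, here is a small sketch on hypothetical data shaped like the question's column, with the IDs assumed to be strings. It suggests that only the entries equal to the first argument are changed, so the non-zero IDs are left as they are.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; IDs are stored as strings.
df = pd.DataFrame({"User_id": ["751730951", "751730951", "0", "163526844"]})

# replace swaps only the values equal to the first argument.
df["User_id"] = df["User_id"].replace("0", "is not empty")

print(df["User_id"].tolist())
```

Labelling everything except '0' therefore needs a condition on the column, such as a boolean mask, rather than a plain value-to-value replace.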
Suppose we have a Series named user_id holding the data from the question. A single line does what you need:
user_id.where(user_id == 0).fillna('is not empty')
I don't like loc very much, since I think it makes the code harder to read.
This approach may be better than replace because it also handles the opposite case:
user_id.where(user_id != 0).fillna('is empty')
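A possible variation, sketched here under the assumption that the IDs are strings as in the question: Series.where accepts the replacement value as its second argument, which skips the intermediate NaN and fillna step, and Series.mask covers the opposite case. The sample Series is hypothetical.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; IDs are stored as strings.
user_id = pd.Series(["751730951", "751730951", "0", "163526844"], name="User_id")

# where keeps values where the condition is True and substitutes elsewhere.
not_empty = user_id.where(user_id == "0", "is not empty")

# mask is the inverse: it substitutes where the condition is True.
empty = user_id.mask(user_id == "0", "is empty")

print(not_empty.tolist())
print(empty.tolist())
```

Neither call modifies user_id; each returns a new Series, so the result has to be assigned back if the column should change.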