Check each value in one column with each value of other column in one dataframe
I have the following dataframe:
import pandas as pd
# named `data` rather than `dict` to avoid shadowing the built-in
data = {'val1': ["3.2", "2.4", "-2.3", "-4.9", "0"],
        'class': ["1", "0", "0", "0", "1"],
        'val2': ["3.2", "2.7", "1.7", "-7.1", "0"]}
df = pd.DataFrame(data)
df
   val1  class  val2
0   3.2      1   3.2
1   2.4      0   2.7
2  -2.3      0   1.7
3  -4.9      0  -7.1
4   0.0      1   0.0
I want to check two things. 1) The sign: if the sign of the value in column val1 differs from the sign of the value in column val2 (for example, the values at index 2 have different signs), change the sign of val2 to match the sign of val1. The desired output is:
   val1  class  val2
0   3.2      1   3.2
1   2.4      0   2.7
2  -2.3      0  -1.7
3  -4.9      0  -7.1
4   0.0      1   0.0
2) The range: check whether the value in column val2 lies within the interval from val1 - 2 to val1 + 2. For example, for the record at index 1, val2 = 2.7 lies in the range [2.4 - 2, 2.4 + 2]. If the condition is true, I want to change class from 0 to 1. The desired output is:
   val1  class  val2
0   3.2      1   3.2
1   2.4      1   2.7
2  -2.3      1  -1.7
3  -4.9      0  -7.1
4   0.0      1   0.0
First convert the values to floats if necessary, then set the sign with numpy.sign, and for the second check use Series.between:
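import numpy as np

df['val1'] = df['val1'].astype(float)
df['val2'] = df['val2'].astype(float)
# multiply by the product of signs: flips val2 where the signs differ
# (the result is 0 wherever val1 or val2 is 0)
df['val2'] *= np.sign(df['val1']) * np.sign(df['val2'])
# 1 if val2 lies in [val1 - 2, val1 + 2] (inclusive), else 0
df['class'] = df['val2'].between(df['val1'] - 2, df['val1'] + 2).astype(int)
# alternative
# df['class'] = np.where(df['val2'].between(df['val1'] - 2, df['val1'] + 2), 1, 0)
print(df)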
   val1  class  val2
0   3.2      1   3.2
1   2.4      1   2.7
2  -2.3      1  -1.7
3  -4.9      0  -7.1
4   0.0      1   0.0
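One edge case of the product-of-signs line above: np.sign(0) is 0, so a nonzero val2 next to a val1 of 0 is set to 0. The sample data has no such row, so the output shown is unaffected. If that case matters, a possible alternative is np.copysign, which keeps the magnitude and copies only the sign. The sketch below uses made-up data (not from the question) to show the difference; treat it as an illustration, not as the answerer's method.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the middle row has val1 == 0 but a nonzero val2.
df = pd.DataFrame({"val1": [-2.3, 0.0, 3.2], "val2": [1.7, 1.5, -3.2]})

# Product-of-signs approach: np.sign(0) is 0, so the middle row becomes 0.
by_product = df["val2"] * np.sign(df["val1"]) * np.sign(df["val2"])

# np.copysign keeps the magnitude of val2 and takes only the sign of val1.
# It treats 0.0 as positive (and -0.0 as negative).
by_copysign = np.copysign(df["val2"], df["val1"])

print(by_product.tolist())   # [-1.7, 0.0, 3.2]
print(by_copysign.tolist())  # [-1.7, 1.5, 3.2]
```

Whether a val1 of 0 should leave val2 untouched, make it positive, or zero it is not specified in the question, so pick whichever behaviour fits your data.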
Try this:
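import numpy as np
# Assumes val1 and val2 have already been converted to float
# Check 1: give val2 the sign of val1
df['val2'] = df.apply(lambda x: np.sign(x['val1']) * np.sign(x['val2']) * x['val2'], axis=1)
# Check 2: class is 1 when val1 and val2 differ by less than 2
df['class'] = df.apply(lambda x: int(abs(x['val1'] - x['val2']) < 2), axis=1)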
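Note that this answer tests the difference with a strict `< 2`, while Series.between is inclusive on both ends by default, so the two disagree when the values differ by exactly 2. The sketch below, on made-up data, compares the two tests in vectorized form (without a row-wise apply); which boundary behaviour is wanted is an assumption you need to settle for your own data.

```python
import pandas as pd

# Hypothetical data: the second row differs by exactly 2.
df = pd.DataFrame({"val1": [2.4, 1.0, -4.9], "val2": [2.7, 3.0, -7.1]})

# Absolute difference, computed for the whole column at once.
diff = (df["val1"] - df["val2"]).abs()

# Strict test, equivalent to the lambda with `< 2`.
strict = (diff < 2).astype(int)

# Inclusive test: Series.between includes both endpoints by default.
inclusive = df["val2"].between(df["val1"] - 2, df["val1"] + 2).astype(int)

print(strict.tolist())     # [1, 0, 0]
print(inclusive.tolist())  # [1, 1, 0]
```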
I think this will solve your problem without using any other library:
def signfunc(x, y):
    if x * y >= 0:
        return y
    else:
        return -1 * y

df['val1'] = df['val1'].astype(float)
df['val2'] = df['val2'].astype(float)
df['val2'] = df.apply(lambda x: signfunc(x.val1, x.val2), axis=1)
print(df)
df.loc[abs(df["val1"] - df["val2"]) <= 2, 'class'] = 1
print(df)
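The df.loc assignment above only ever sets class to 1, whereas recomputing the column from the condition (as with .astype(int) on a boolean mask) also resets rows that were already 1 but fall outside the range. On the question's data both give the same result. The sketch below uses made-up data to show where they differ; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical data: row 0 is already class 1 although val2 is far from val1.
df = pd.DataFrame({"val1": [5.0, 2.4, -4.9],
                   "class": ["1", "0", "0"],
                   "val2": [9.0, 2.7, -7.1]})

# Work with integer labels instead of strings.
df["class"] = df["class"].astype(int)

# True where val2 lies within 2 of val1 (inclusive).
in_range = (df["val1"] - df["val2"]).abs() <= 2

# Recompute every label from the condition: row 0 drops to 0.
overwritten = in_range.astype(int)

# Keep existing labels and set 1 only where the condition holds.
upgraded = df["class"].where(~in_range, 1)

print(overwritten.tolist())  # [0, 1, 0]
print(upgraded.tolist())     # [1, 1, 0]
```

The question asks to "change class from 0 to 1", which reads closer to the second variant, but that interpretation is an assumption.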