I have a dataset with three columns A, B and C. I want to create a new column where, for each row, I pick the two values that are closest to each other and take their average. Take the table below as an example:
A  B  C  Best of Three
3  2  5  2.5
4  3  1  3.5
1  5  2  1.5
For the first row, A and B are the closest pair, so the Best of Three column is (3+2)/2 = 2.5; for the third row, A and C are the closest pair, so the Best of Three column is (1+2)/2 = 1.5. Below is my code. It is quite unwieldy and quickly becomes too long if there are more columns. I look forward to suggestions!
import numpy as np
import pandas as pd

data = {'A': [3, 4, 1],
        'B': [2, 3, 5],
        'C': [5, 1, 2]}
df = pd.DataFrame(data)
df['D'] = abs(df['A'] - df['B'])
df['E'] = abs(df['A'] - df['C'])
df['F'] = abs(df['C'] - df['B'])
df['G'] = df[['D', 'E', 'F']].min(axis=1)  # row-wise smallest difference
# average the pair whose difference equals the row minimum
df['Best of Three'] = np.select(
    [df['G'] == df['D'], df['G'] == df['E']],
    [(df['A'] + df['B']) / 2, (df['A'] + df['C']) / 2],
    default=(df['B'] + df['C']) / 2)
First you need a function that finds the minimum difference between any two elements of a list. It also computes the average of those two values (for two values this is the same as the median) and returns both as a tuple (diff, average):
def min_list(values):
    # smallest (difference, average) tuple over all pairs of values
    return min((abs(x - y), (x + y) / 2)
               for i, x in enumerate(values)
               for y in values[i + 1:])
Then apply it to each row:
import pandas as pd

df = pd.DataFrame([[3, 2, 5, 6], [4, 3, 1, 10], [1, 5, 10, 20]],
                  columns=['A', 'B', 'C', 'D'])
# keep only the average (second tuple element) for each row
df['best'] = df.apply(lambda x: min_list(x)[1], axis=1)
print(df)
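To see what the helper returns on plain lists, here is a small self-contained check (the function is repeated so the snippet runs on its own). One thing worth noticing, as far as I can tell: tuples compare element by element, so when two pairs have the same difference, the pair with the smaller average wins. That may or may not be the tie-breaking you want.

```python
def min_list(values):
    # Same helper as above, repeated so this snippet runs on its own.
    return min((abs(x - y), (x + y) / 2)
               for i, x in enumerate(values)
               for y in values[i + 1:])


# (3, 2) and (5, 6) both differ by 1; the tuple comparison
# then prefers the pair with the smaller average.
print(min_list([3, 2, 5, 6]))    # (1, 2.5)
print(min_list([1, 5, 10, 20]))  # (4, 3.0)
```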
Functions are your friends. You want to write a function that finds the two closest values in a list, then pass it the list of values from the row. Store that result and pass it to a second function that returns the average of the two values.
(Also, your code would be much more readable if you replaced D, E, F, and G with descriptively named variables.)
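A minimal sketch of that two-function idea might look like the following. The names closest_pair and average are made up for illustration, and the data is the example table from the question; ties are resolved in favour of whichever pair itertools.combinations yields first.

```python
import itertools

import pandas as pd


def closest_pair(values):
    # Return the two values with the smallest absolute difference.
    # (Hypothetical helper name, not from the question.)
    return min(itertools.combinations(values, 2),
               key=lambda pair: abs(pair[0] - pair[1]))


def average(pair):
    # Mean of the two values in the pair.
    return sum(pair) / 2


df = pd.DataFrame({'A': [3, 4, 1], 'B': [2, 3, 5], 'C': [5, 1, 2]})
# Row-wise: find the closest pair, then average it.
df['Best of Three'] = df[['A', 'B', 'C']].apply(
    lambda row: average(closest_pair(row)), axis=1)
print(df)
```

Because closest_pair works on any iterable of numbers, adding more columns only means widening the column list passed to apply.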
You can solve this with itertools.combinations, which generates every pair of values in a row:
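import itertools
import pandas as pd

def get_closest_avg(s):
    c = list(itertools.combinations(s, 2))
    # pick the pair with the smallest absolute difference and average it
    return sum(c[pd.Series(c).apply(lambda x: abs(x[0]-x[1])).idxmin()])/2
df['B3'] = df.apply(get_closest_avg, axis=1)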
df:
   A  B  C   B3
0  3  2  5  2.5
1  4  3  1  3.5
2  1  5  2  1.5
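If the table gets wide, checking every pair in every row can become slow. A possible alternative, which is a different technique from the combinations approach above and only a sketch: after sorting a row, the closest pair must sit next to each other, so it is enough to look at adjacent gaps. The version below assumes all columns are numeric and contain no NaN; on ties it picks the lowest-valued pair.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [3, 4, 1], 'B': [2, 3, 5], 'C': [5, 1, 2]})

# Sort each row; the closest pair must be adjacent in sorted order.
values = np.sort(df[['A', 'B', 'C']].to_numpy(), axis=1)
gaps = np.diff(values, axis=1)   # gaps between sorted neighbours
pos = gaps.argmin(axis=1)        # position of the smallest gap in each row
rows = np.arange(len(df))
# Average the two neighbours on either side of the smallest gap.
df['B3'] = (values[rows, pos] + values[rows, pos + 1]) / 2
print(df)
```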