I have a DataFrame df1 like the one below, and I want to check whether each value of a certain column in df2 lies between the MAX and MIN values of some row of df1. If it does, I want to assign the value from the Name column of that row. If the df2 value does not fall within any of those ranges, I want to know whether it is larger than the largest MAX or smaller than the smallest MIN in df1.
import pandas as pd
data = {'Name': ['MN1', 'MN2', 'MN3', 'MN4', 'MN5', 'MN6', 'MN7-8', 'MN9', 'MN10', 'MN11', 'MN12', 'MN13', 'MN14', 'MN15', 'MN16', 'MN17', 'MQ18', 'MQ19'],
        'MAX': [23, 21.7, 19.5, 17.2, 16.4, 14.2, 12.85, 11.2, 9.9, 8.9, 7.6, 7.1, 5.3, 5, 3.55, 2.5, 1.9, 0.85],
        'MIN': [21.7, 19.5, 17.2, 16.4, 14.2, 12.85, 11.2, 9.9, 8.9, 7.6, 7.1, 5.3, 5, 3.55, 2.5, 1.9, 0.85, 0.01]
        }
df1 = pd.DataFrame(data, columns=['Name', 'MAX', 'MIN'])
I tried this:
list = []
for i in df2['AVERAGE_AGE']:
    for index, row in df1.iterrows():
        if row['MAX'] >= i and row['MIN'] < i:
            list.append(row['Name'])
    if i > df1['MAX'].max():
        list.append("Postmn")
    elif i < df1['MIN'].min():
        list.append("Premn")
df2['MNname'] = list
This takes a long time, and the length of the list doesn't match the length of df2.
You can try this:
(df2['AVERAGE_AGE'] < df1['MIN'].min()).value_counts()
(df2['AVERAGE_AGE'] > df1['MAX'].max()).value_counts()
This will tell you the number of rows that satisfy the conditions by giving the counts of True and False.
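If only the number of out-of-range rows is needed, summing the boolean mask gives the count of True values directly. The following is a minimal sketch; the sample ages are made up for illustration, and lower and upper stand in for df1['MIN'].min() and df1['MAX'].max() from the question's data.

```python
import pandas as pd

# Stand-ins for df1['MIN'].min() and df1['MAX'].max() from the question
lower, upper = 0.01, 23.0

# Made-up sample values, used here in place of df2['AVERAGE_AGE']
ages = pd.Series([2.3, 25.0, 0.005, 10.2, 30.1], name='AVERAGE_AGE')

# True counts as 1 and False as 0, so sum() counts the matching rows
n_below = int((ages < lower).sum())
n_above = int((ages > upper).sum())

print(n_below, n_above)
```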
You can loop over the first DataFrame and set the Name values in the second using pandas.DataFrame.loc:
>>> df2 = pd.DataFrame([
... 2.299367, 20.688943, 10.245027, 1.412258, 22.541987,
... 2.588420, 5.578598, 11.703629, 12.529066, 17.769196,
... ], columns=['AVERAGE_AGE'])
>>> for index, row in df1.iterrows():
...     df2.loc[(df2.AVERAGE_AGE >= row.MIN) & (df2.AVERAGE_AGE < row.MAX), 'Name'] = row.Name
...
>>> df2
   AVERAGE_AGE   Name
0     2.299367   MN17
1    20.688943    MN2
2    10.245027    MN9
3     1.412258   MQ18
4    22.541987    MN1
5     2.588420   MN16
6     5.578598   MN13
7    11.703629  MN7-8
8    12.529066  MN7-8
9    17.769196    MN3
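As a possible alternative to looping, the same labelling can be sketched without iterrows by using pandas.cut. This is only a sketch under one assumption: the ranges in df1 are contiguous (each MIN equals the next row's MAX), which holds for the data in the question. It uses the question's bounds (MIN < value <= MAX) and adds the "Postmn" and "Premn" labels for values outside every range. The last two sample ages (25.0 and 0.005) are made up to exercise those cases.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['MN1', 'MN2', 'MN3', 'MN4', 'MN5', 'MN6', 'MN7-8', 'MN9', 'MN10',
             'MN11', 'MN12', 'MN13', 'MN14', 'MN15', 'MN16', 'MN17', 'MQ18', 'MQ19'],
    'MAX': [23, 21.7, 19.5, 17.2, 16.4, 14.2, 12.85, 11.2, 9.9,
            8.9, 7.6, 7.1, 5.3, 5, 3.55, 2.5, 1.9, 0.85],
    'MIN': [21.7, 19.5, 17.2, 16.4, 14.2, 12.85, 11.2, 9.9, 8.9,
            7.6, 7.1, 5.3, 5, 3.55, 2.5, 1.9, 0.85, 0.01],
})

# Two values from the answer above plus two made-up out-of-range values
df2 = pd.DataFrame({'AVERAGE_AGE': [2.299367, 20.688943, 25.0, 0.005]})

# pd.cut needs increasing bin edges, so sort the ranges from low to high
ranges = df1.sort_values('MIN')
edges = [ranges['MIN'].iloc[0]] + ranges['MAX'].tolist()

# right=True makes each bin (MIN, MAX], matching the condition in the question
labels = pd.cut(df2['AVERAGE_AGE'], bins=edges,
                labels=ranges['Name'].tolist(), right=True)

# Convert from categorical so labels outside the original names can be added
labels = labels.astype(object)
labels = labels.mask(df2['AVERAGE_AGE'] > df1['MAX'].max(), 'Postmn')
labels = labels.mask(df2['AVERAGE_AGE'] < df1['MIN'].min(), 'Premn')

df2['MNname'] = labels
print(df2)
```

Because the result is built from df2's own column, it always has the same length as df2. A value exactly equal to the smallest MIN stays NaN here, since the question's conditions are strict at that boundary.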
Try this:
import numpy as np
arr = []
for i in range(df2.shape[0]):
    # Check if the value in COLUMN_1 is between the MIN and MAX values
    if (df2['COLUMN_1'][i] > df1['MIN'][i]) and (df2['COLUMN_1'][i] < df1['MAX'][i]):
        arr.append(df1['Name'][i])
    # Check if the value in COLUMN_1 is less than the MIN value
    elif df2['COLUMN_1'][i] < df1['MIN'][i]:
        arr.append(np.round(df2['COLUMN_1'][i] - df1['MIN'][i], 2))
    # Check if the value in COLUMN_1 is greater than the MAX value
    elif df2['COLUMN_1'][i] > df1['MAX'][i]:
        arr.append(np.round(df2['COLUMN_1'][i] - df1['MAX'][i], 2))
df2['Name'] = pd.Series(arr)
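Since you did not give the exact name of the column to check in df2, I have called it COLUMN_1. The code compares row i of df2 with row i of df1: if the value lies strictly between MIN and MAX, it stores the Name; if it is below MIN, it stores the difference from MIN; if it is above MAX, it stores the difference from MAX. The differences are rounded to two decimals.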
Hope this works!