I have a dataframe which looks like this:
X Y Corr_Value
0 51182 51389 1.00
1 51182 50014 NaN
2 51182 50001 0.85
3 51182 50014 NaN
I want to create a new column built from the values of the X and Y columns. The idea is to loop through the rows and, if Corr_Value is not null, have the new column show:
Solving (X column value) will solve (Y column value) at (Corr_Value column)% probability.
For example, for the first row the result should be:
Solving 51182 will solve 51389 with 100% probability.
This is the code I wrote:
dfs = []
for i in df1.iterrows():
    if ([df1['Corr_Value']] != np.nan):
        a = df1['X']
        b = df1['Y']
        c = df1['Corr_Value']*100
        df1['Remarks'] = (f'Solving {a} will solve {b} at {c}% probability')
        dfs.append(df1)
df1 is the dataframe which stores the X, Y and Corr_Value data.
But there seems to be a problem, because the result I get does not match the expected output described above.
If you could help me get the desired result, that would be great.
Use DataFrame.dropna to remove the rows with missing values, then build the custom output string with an f-string via DataFrame.apply:
f = lambda x: f'Solving {int(x["X"])} will solve {int(x["Y"])} at {int(x["Corr_Value"] * 100)}% probability.'
df['Remarks'] = df.dropna(subset=['Corr_Value']).apply(f,axis=1)
print (df)
X Y Corr_Value Remarks
0 51182 51389 1.00 Solving 51182 will solve 51389 at 100% probabi...
1 51182 50014 NaN NaN
2 51182 50001 0.85 Solving 51182 will solve 50001 at 85% probabil...
3 51182 50014 NaN NaN
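One thing to watch with the int(...) cast above: it truncates rather than rounds, and floating-point multiplication can land just below the intended integer. The sketch below is only an illustration; the helper name make_remark and the 0.29 value are made up for the example and are not from the question's data.

```python
import numpy as np
import pandas as pd

def make_remark(row):
    # round() before int(): 0.29 * 100 is 28.999999999999996, and int() alone gives 28
    pct = int(round(row["Corr_Value"] * 100))
    return f'Solving {int(row["X"])} will solve {int(row["Y"])} at {pct}% probability.'

# Hypothetical two-row frame: one awkward float, one missing value
df = pd.DataFrame({
    "X": [51182, 51182],
    "Y": [51389, 50014],
    "Corr_Value": [0.29, np.nan],
})

# Rows dropped by dropna get NaN in Remarks when the result is aligned back on the index
df["Remarks"] = df.dropna(subset=["Corr_Value"]).apply(make_remark, axis=1)
print(df)
```

For the values in the question (1.00 and 0.85) both versions should give the same text, so this only matters for less friendly fractions.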
You can also use numpy.where:
import numpy as np
df['Remarks'] = np.where(df.Corr_Value.notnull(), 'Solving ' + df['X'].astype(str) + ' will solve ' + df['Y'].astype(str) + ' with ' + (df['Corr_Value'] * 100).astype(str) + '% probability', df['Corr_Value'])
Output:
X Y Corr_Value Remarks
0 51182 51389 1.00 Solving 51182 will solve 51389 with 100.0% pro...
1 51182 50014 NaN NaN
2 51182 50001 0.85 Solving 51182 will solve 50001 with 85.0% prob...
3 51182 50014 NaN NaN
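If the percentage should read 100% rather than 100.0%, one possible variant is to convert it to a whole number before building the string. This is a sketch under the assumption that whole-number percentages are wanted; it uses the nullable Int64 dtype and Series.where instead of numpy.where, and rebuilds the question's data so it runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "X": [51182, 51182, 51182, 51182],
    "Y": [51389, 50014, 50001, 50014],
    "Corr_Value": [1.00, np.nan, 0.85, np.nan],
})

# Nullable Int64 lets the NaN rows survive the cast to integer
pct = (df["Corr_Value"] * 100).round().astype("Int64").astype(str)

message = (
    "Solving " + df["X"].astype(str)
    + " will solve " + df["Y"].astype(str)
    + " with " + pct + "% probability."
)

# Keep the message only where Corr_Value is present; other rows become missing
df["Remarks"] = message.where(df["Corr_Value"].notna())
print(df)
```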
Just try:
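import pandas as pd

for i, r in df1.iterrows():
    # pd.notna is needed here: "!= np.nan" is always True
    if pd.notna(r['Corr_Value']):
        # iterrows upcasts the row to float, so cast X and Y back to int
        a = int(r['X'])
        b = int(r['Y'])
        c = r['Corr_Value'] * 100
        df1.at[i, 'Remarks'] = "Solving " + str(a) + " will solve " + str(b) + " at " + str(c) + "% probability"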
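I think the problem is related to using df1 (the whole dataframe) instead of the current row inside the loop. Note also that a != np.nan comparison is always True, because NaN never compares equal to anything, so the null check needs pd.notna instead.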
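A quick way to see why the NaN check matters is to compare the two tests directly. This is a minimal sketch; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is never equal to anything, including itself,
# so "!=" is True even though the value is missing
always_true = value != np.nan

# pd.notna is the reliable scalar check for a present value
is_present = pd.notna(value)

# The same check works element-wise on a whole column
corr = pd.Series([1.00, np.nan, 0.85, np.nan])
mask = corr.notna()

print(always_true, is_present, mask.tolist())
```

The boolean mask can then be used to select or fill only the rows that actually have a Corr_Value.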