I primarily used MATLAB all through college as a math major, and my programming was mostly building equations and models. Now I am learning Python and, in particular, pandas. I am trying to look up the values from a column of one dataframe in a column of a different dataframe. Where they match, I want to write a label into the original dataframe.
For example, my first dataframe has a column of employees, and I want to figure out whether aliceB is Busy or Non-Busy and record that label in col3.
import numpy as np
import pandas as pd
df1 = {"col1":["aliceA", "aliceB", "aliceC"], "col2":["CO", "WA", "PA"]}
df1 = pd.DataFrame(df1)
df1['col3'] = np.nan
In[]df1
Out[]:
col1 col2 col3
0 aliceA CO NaN
1 aliceB WA NaN
2 aliceC PA NaN
df2 = {'col1': ["aliceB", "aliceA", "aliceC", "bobC", "bobB", "bobA",], 'col2': ['Busy','Non-Busy','Busy','Non-Busy','Non-Busy','Busy']}
df2 = pd.DataFrame(df2)
In[]df2
Out[]:
col1 col2
0 aliceB Busy
1 aliceA Non-Busy
2 aliceC Busy
3 bobC Non-Busy
4 bobB Non-Busy
5 bobA Busy
***Preferred Output***
Out[]:
col1 col2 col3
0 aliceA CO Non-Busy
1 aliceB WA Busy
2 aliceC PA Busy
For this kind of problem in MATLAB, I would take my two matrices and iterate through them with nested for loops to find the value. In Python I wrote:
for i in range(0, df2.shape[0]):
    for j in range(0, df1.shape[0]):
        if(df2.col1[i] == df1.col1[j]):
            df1.col3[j] = df2.col2[i]
But I get this warning and I have to Control + C to get out of it to continue:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
df1
Out[]:
col1 col2 col3
0 aliceA CO Non-Busy
1 aliceB WA Busy
2 aliceC PA Busy
Technically this code works and my data is filled in, but I know this is probably a poor way to solve my problem. For this small example it doesn't force me to Control+C, but it does when my df1 is thousands of rows long.
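On the warning itself: it is typically triggered by chained indexing such as df1.col3[j] = ..., where pandas cannot tell whether the assignment reaches the original frame. Below is a minimal sketch of the same nested loop written with a single .loc assignment instead. It assumes both frames keep their default 0..n-1 index, and it initialises col3 with None (an object column) rather than np.nan, which is a choice made for this sketch. It is still a slow row-by-row loop and is shown only to illustrate the indexing.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": ["aliceA", "aliceB", "aliceC"],
                    "col2": ["CO", "WA", "PA"]})
df1["col3"] = None  # object-dtype placeholder (assumption for this sketch)

df2 = pd.DataFrame({"col1": ["aliceB", "aliceA", "aliceC", "bobC", "bobB", "bobA"],
                    "col2": ["Busy", "Non-Busy", "Busy", "Non-Busy", "Non-Busy", "Busy"]})

# Same nested comparison as in the question, but each write is one
# .loc[row, column] call on df1 itself, not a chained df1.col3[j] access.
for i in range(len(df2)):
    for j in range(len(df1)):
        if df2.loc[i, "col1"] == df1.loc[j, "col1"]:
            df1.loc[j, "col3"] = df2.loc[i, "col2"]

print(df1)
```

The loop still performs len(df1) * len(df2) comparisons, so its run time grows quickly as the frames get longer.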
Simple map: index df2 by col1 and use it as a lookup table for df1.col1.
df1['col3'] = df1['col1'].map(df2.set_index('col1')['col2'])
df1
Out[31]:
col1 col2 col3
0 aliceA CO Non-Busy
1 aliceB WA Busy
2 aliceC PA Busy
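A slightly more defensive sketch of the map approach, in case df2 contains repeated names or df1 contains names that df2 lacks. The extra row "carolZ", the shortened df2, and the "Unknown" label are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# "carolZ" is a hypothetical employee with no entry in df2.
df1 = pd.DataFrame({"col1": ["aliceA", "aliceB", "aliceC", "carolZ"],
                    "col2": ["CO", "WA", "PA", "NY"]})
df2 = pd.DataFrame({"col1": ["aliceB", "aliceA", "aliceC", "bobC"],
                    "col2": ["Busy", "Non-Busy", "Busy", "Non-Busy"]})

# Lookup Series: index = employee name, values = status.
# Keeping the first row per name gives a unique index, which map relies on.
lookup = df2.drop_duplicates(subset="col1").set_index("col1")["col2"]

# Each name in df1.col1 is replaced by its status; unmatched names become NaN.
df1["col3"] = df1["col1"].map(lookup)

# Optional: replace NaN with an explicit (hypothetical) label.
df1["col3"] = df1["col3"].fillna("Unknown")

print(df1)
```

Whether to keep the first or last duplicate, and what to do with unmatched names, depends on the data. The choices here are only placeholders.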
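Using merge (drop the placeholder col3 from df1 first, otherwise both frames carry a col3 and the result has col3_x and col3_y instead):
df1.drop(columns='col3').merge(df2.rename(columns={'col2': 'col3'}), on='col1')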
col1 col2 col3
0 aliceA CO Non-Busy
1 aliceB WA Busy
2 aliceC PA Busy
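A variant of the merge approach as a self-contained sketch. Here df1 is built without a col3 column. how="left" keeps every row of df1 even when a name is missing from df2, and validate="many_to_one" makes pandas raise if df2 has duplicate names. The variable names status and merged are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": ["aliceA", "aliceB", "aliceC"],
                    "col2": ["CO", "WA", "PA"]})
df2 = pd.DataFrame({"col1": ["aliceB", "aliceA", "aliceC", "bobC", "bobB", "bobA"],
                    "col2": ["Busy", "Non-Busy", "Busy", "Non-Busy", "Non-Busy", "Busy"]})

# Rename df2's status column so it arrives in the result as col3.
status = df2.rename(columns={"col2": "col3"})

# Left join on the employee name: all df1 rows are kept, in df1's order.
# validate="many_to_one" checks that each name appears at most once in df2.
merged = df1.merge(status, on="col1", how="left", validate="many_to_one")

print(merged)
```

With the default how="inner", rows of df1 with no match in df2 would be dropped instead of receiving NaN in col3.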