Merging rows when some columns are the same using Pandas Python

Question

I have a dataframe and want to merge rows that share the same value in column A. For each merged group, the value kept in column B should be the one that appears earliest in the list L = ['xx','yy','zz'].

    A   B
0   a   xx
1   a   yy
2   b   zz
3   b   yy

For rows 0 and 1, the result will be 'a' for column A and 'xx' for column B ('xx' comes before 'yy' in L).
For rows 2 and 3, the result will be 'b' for column A and 'yy' for column B ('yy' comes before 'zz' in L).

Desired outcome:

    A   B
0   a   xx
1   b   yy

Answer 1

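# Helper column C: the position of each B value in L (0 for 'xx', 1 for 'yy', 2 for 'zz')
df['C'] = df['B'].map(dict(zip(L,range(len(L)))))
# Within each group of A, take B from the row whose position C is smallest
df.groupby('A')[['B','C']].apply(lambda x: x.iloc[x["C"].argmin()]['B'])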
#A
#a    xx
#b    yy
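
If you would rather not add a helper column, here is a minimal sketch of the same idea using idxmin. It assumes the frame from the question and that every value in B appears in L; the names rank and result are only illustrative.

```python
import pandas as pd

L = ['xx', 'yy', 'zz']
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'],
                   'B': ['xx', 'yy', 'zz', 'yy']})

# Position of each B value in L (lower position = higher priority).
# Kept as a separate Series so df itself is not modified.
rank = df['B'].map({value: i for i, value in enumerate(L)})

# For each group of A, find the index label of the row with the lowest rank,
# then pull those rows out of the original frame.
result = df.loc[rank.groupby(df['A']).idxmin()].reset_index(drop=True)
print(result)
```

Because whole rows are selected, this should also carry along any other columns of the winning row, which the groupby/apply version does not.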

You can get the same result using pandas.Categorical, which makes min() follow the order of L instead of alphabetical order:

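# Make B an ordered categorical so that comparisons follow the order of L
df['B'] = pd.Categorical(df['B'], categories = L, ordered = True)
# Select only B so that other columns (such as a helper column C) are left out
df.groupby('A')[['B']].min()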
#      B
#A
#a    xx
#b    yy
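
To get exactly the layout from the question (A as a regular column with a fresh 0..n index), something along these lines might work. It is a sketch that assumes the example frame from the question; ordered_b and result are illustrative names. Note that pd.Categorical turns any value not listed in L into NaN.

```python
import pandas as pd

L = ['xx', 'yy', 'zz']
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'],
                   'B': ['xx', 'yy', 'zz', 'yy']})

# Ordered categorical: 'xx' < 'yy' < 'zz', following the order of L.
ordered_b = pd.Categorical(df['B'], categories=L, ordered=True)

# assign() returns a copy, so the original df keeps its plain string column.
# as_index=False keeps A as a regular column instead of the index.
result = df.assign(B=ordered_b).groupby('A', as_index=False)['B'].min()
print(result)
```

The B column of result is still categorical; result['B'].astype(str) converts it back to plain strings if needed.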