Consider the following list of tuples:
some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
                ('Name2', 'JKLL', 'RMPQ', 'Score=11')]
And the following pandas dataframe:
Sequence ID  Left Sequence  Right Sequence
Name1        ABCD           RQLM
Name1        ABCR           PLMT
Name2        JKLL           ZFGQ
Name2        RPLP           FTRD
I am trying to compare the second element of each tuple against the column df['Left Sequence'] to check for an exact match (I am not concerned with partial matches). If a match occurs, I want to write 'Dimer' in a new column at the end of the df; if not, I want to write 'NA'. Here is the code I have tried:
for x in some_strings:
    for y in x:
        df['Dimers'] = df['Left Sequence'].apply(lambda s: 'Dimer' if s == y[1] else 'NA')
My expected output:
Sequence ID  Left Sequence  Right Sequence  Dimers
Name1        ABCD           RQLM            Dimer
Name1        ABCR           PLMT            NA
Name2        JKLL           ZFGQ            Dimer
Name2        RPLP           FTRD            NA
My actual output (you can probably guess this):
Sequence ID  Left Sequence  Right Sequence  Dimers
Name1        ABCD           RQLM            NA
Name1        ABCR           PLMT            NA
Name2        JKLL           ZFGQ            NA
Name2        RPLP           FTRD            NA
Any suggestions would be great.
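Create a Boolean mask with isin, then use it to fill the new column. The mask compares each row's (Sequence ID, Left Sequence) pair against the first two elements of each tuple: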
mask=df.iloc[:,:-1].apply(tuple,1).isin([x[:-2] for x in some_strings])
df['Dimer']='NA'
df.loc[mask,'Dimer']='Dimer'
df
Out[1120]:
  SequenceID LeftSequence RightSequence  Dimer
0      Name1         ABCD          RQLM  Dimer
1      Name1         ABCR          PLMT     NA
2      Name2         JKLL          ZFGQ  Dimer
3      Name2         RPLP          FTRD     NA
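For reference, the loop in the question fails for two reasons. The inner `for y in x` iterates over the strings inside each tuple, so `y[1]` is a single character rather than the left sequence. Also, every iteration overwrites the whole 'Dimers' column, so only the last comparison survives. Below is a minimal self-contained sketch of the mask idea, assuming the DataFrame has exactly the three columns shown in the question. The names `wanted` and `row_keys` are made up for illustration, and the membership test uses a plain Python set rather than `isin`.

```python
import pandas as pd

some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
                ('Name2', 'JKLL', 'RMPQ', 'Score=11')]

# Rebuild the DataFrame from the question (column names assumed from the post).
df = pd.DataFrame({
    'Sequence ID': ['Name1', 'Name1', 'Name2', 'Name2'],
    'Left Sequence': ['ABCD', 'ABCR', 'JKLL', 'RPLP'],
    'Right Sequence': ['RQLM', 'PLMT', 'ZFGQ', 'FTRD'],
})

# Keys to look for: the (name, left sequence) pair from each tuple.
wanted = {t[:2] for t in some_strings}

# Build the same kind of key for every row and test membership.
row_keys = zip(df['Sequence ID'], df['Left Sequence'])
mask = pd.Series([key in wanted for key in row_keys], index=df.index)

# Default every row to 'NA', then overwrite the matching rows.
df['Dimers'] = 'NA'
df.loc[mask, 'Dimers'] = 'Dimer'

print(df)
```

If only the left sequence should decide the match, regardless of the name, something like `df['Left Sequence'].isin([t[1] for t in some_strings])` would likely serve as the mask instead.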