Match object in list of tuples to object in dataframe, create new column if match exists
Consider the following list of tuples:
some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
('Name2', 'JKLL', 'RMPQ', 'Score=11')]
And the following pandas dataframe:
Sequence ID  Left Sequence  Right Sequence
Name1        ABCD           RQLM
Name1        ABCR           PLMT
Name2        JKLL           ZFGQ
Name2        RPLP           FTRD
I am trying to compare the second element of each tuple against the column df['Left Sequence'] to check for an exact match (I am not concerned with partial matches). If a match occurs, I want to write Dimer in a new column at the end of the df. If no match occurs, I want to write NA. Here is the code I have tried:
for x in some_strings:
    for y in x:
        df['Dimers'] = df['Left Sequence'].apply(lambda s: 'Dimer' if s == y[1] else 'NA')
My expected output:
Sequence ID  Left Sequence  Right Sequence  Dimers
Name1        ABCD           RQLM            Dimer
Name1        ABCR           PLMT            NA
Name2        JKLL           ZFGQ            Dimer
Name2        RPLP           FTRD            NA
My actual output (you can probably guess this):
Sequence ID  Left Sequence  Right Sequence  Dimers
Name1        ABCD           RQLM            NA
Name1        ABCR           PLMT            NA
Name2        JKLL           ZFGQ            NA
Name2        RPLP           FTRD            NA
Any suggestions would be great.
Create a Boolean mask with isin, then use it to fill the new column:
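# Turn every column except the last (Sequence ID and Left Sequence here) into row tuples and test them against the first two items of each tuple
mask = df.iloc[:, :-1].apply(tuple, axis=1).isin([x[:-2] for x in some_strings])
# Default to 'NA', then mark the matching rows
df['Dimer'] = 'NA'
df.loc[mask, 'Dimer'] = 'Dimer'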
df
Out[1120]:
SequenceID LeftSequence RightSequence Dimer
0 Name1 ABCD RQLM Dimer
1 Name1 ABCR PLMT NA
2 Name2 JKLL ZFGQ Dimer
3 Name2 RPLP FTRD NA
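As for why the original loop printed NA everywhere: the inner loop makes y each string inside a tuple, so y[1] is a single character (for example 'a' from 'Name1'), and every pass overwrites the whole column. If only Left Sequence needs to match, as the question describes, a simpler variant might look like the sketch below. The name left_targets is my own, and the DataFrame is rebuilt from the question's sample data so the snippet runs on its own.

```python
import pandas as pd

some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
                ('Name2', 'JKLL', 'RMPQ', 'Score=11')]

# Sample data copied from the question
df = pd.DataFrame({
    'Sequence ID': ['Name1', 'Name1', 'Name2', 'Name2'],
    'Left Sequence': ['ABCD', 'ABCR', 'JKLL', 'RPLP'],
    'Right Sequence': ['RQLM', 'PLMT', 'ZFGQ', 'FTRD'],
})

# Collect the second element of every tuple once (hypothetical helper name)
left_targets = {t[1] for t in some_strings}

# isin tests exact membership, so partial matches are ignored
mask = df['Left Sequence'].isin(left_targets)

# Start with 'NA' everywhere, then overwrite only the matching rows
df['Dimers'] = 'NA'
df.loc[mask, 'Dimers'] = 'Dimer'

print(df)
```

Note that this version ignores Sequence ID, so 'ABCD' under a different name would also be marked Dimer. The tuple-based mask above is stricter because it requires the name and the left sequence to match together.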