Match object in list of tuples to object in dataframe, create new column if match exists

Consider the following list of tuples:

some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
                ('Name2', 'JKLL', 'RMPQ', 'Score=11')]

And the following pandas dataframe:

Sequence ID    Left Sequence    Right Sequence
Name1              ABCD             RQLM
Name1              ABCR             PLMT
Name2              JKLL             ZFGQ
Name2              RPLP             FTRD

I am trying to compare the second element of each tuple with the column df['Left Sequence'] to check for an exact match (I am not concerned with partial matches). If a match occurs, I want to write 'Dimer' in a new column at the end of the df; if no match occurs, I want to write 'NA'. Here is the code I have tried:

for x in some_strings:
    for y in x:
        df['Dimers'] = df['Left Sequence'].apply(lambda s: 'Dimer' if s == y[1] else 'NA')

My expected output:

Sequence ID    Left Sequence    Right Sequence    Dimers
Name1              ABCD             RQLM          Dimer
Name1              ABCR             PLMT           NA
Name2              JKLL             ZFGQ          Dimer
Name2              RPLP             FTRD           NA

My actual output (you can probably guess this):

Sequence ID    Left Sequence    Right Sequence    Dimers
Name1              ABCD             RQLM           NA
Name1              ABCR             PLMT           NA
Name2              JKLL             ZFGQ           NA
Name2              RPLP             FTRD           NA

Any suggestions would be great.

Create a Boolean mask with isin, then use it to fill the new column:

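# Turn the first two columns (ID, left sequence) of each row into a tuple and
# test it against the first two items of each tuple in some_strings.
# Note: iloc[:, :-1] assumes df still has only its three original columns.
mask = df.iloc[:, :-1].apply(tuple, axis=1).isin([x[:-2] for x in some_strings])
df['Dimer'] = 'NA'                 # default for rows without a match
df.loc[mask, 'Dimer'] = 'Dimer'    # overwrite the rows where the mask is True
df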
Out[1120]: 
  SequenceID LeftSequence RightSequence  Dimer
0      Name1         ABCD          RQLM  Dimer
1      Name1         ABCR          PLMT     NA
2      Name2         JKLL          ZFGQ  Dimer
3      Name2         RPLP          FTRD     NA
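
Note that the mask above matches on both the sequence ID and the left sequence. If you only want to compare the second element of each tuple against df['Left Sequence'], as the question literally describes, something like the sketch below may be enough. It rebuilds the sample data from the question so that it runs on its own; the name left_targets and the use of numpy.where are my own choices here, not part of the original answer. For context, the loop in the question returns all 'NA' because y is each string inside the tuple, so y[1] is a single character, and the column is overwritten on every iteration.

```python
import numpy as np
import pandas as pd

some_strings = [('Name1', 'ABCD', 'DEFG', 'Score=12'),
                ('Name2', 'JKLL', 'RMPQ', 'Score=11')]

# Sample dataframe rebuilt from the question
df = pd.DataFrame({
    'Sequence ID': ['Name1', 'Name1', 'Name2', 'Name2'],
    'Left Sequence': ['ABCD', 'ABCR', 'JKLL', 'RPLP'],
    'Right Sequence': ['RQLM', 'PLMT', 'ZFGQ', 'FTRD'],
})

# Second element of every tuple (hypothetical helper name)
left_targets = [t[1] for t in some_strings]

# isin compares whole values, so partial matches are not counted
df['Dimers'] = np.where(df['Left Sequence'].isin(left_targets), 'Dimer', 'NA')
print(df)
```

Because this version ignores the sequence ID, a left sequence listed for Name1 would also be flagged if it appeared in a Name2 row; keep the tuple-based mask if that distinction matters.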
