Find index of substring within string from a dataframe

I have a dataframe with two columns (and a lot of rows): one column contains the full sequence and the other contains a sub-sequence.

I want to find the index at which the sub-sequence starts within the full sequence and add it as another column.

I have tried this:

df["start"] = df.sequence.index(df.sub_sequence)

But this raises: TypeError: 'RangeIndex' object is not callable

What am I doing wrong?

Here is the df I have and the df I wish to end up with.

Sample dataframe:

import pandas as pd

data = {"sequence": ["abcde", "fghij", "klmno"], "sub_sequence": ["cde", "gh", "no"]}
df = pd.DataFrame(data, columns=["sequence", "sub_sequence"])

  sequence sub_sequence
0    abcde          cde
1    fghij           gh
2    klmno           no

Expected result:

data2 = {"sequence": ["abcde", "fghij", "klmno"], "sub_sequence": ["cde", "gh", "no"], "start": [2, 1, 3]}
df2 = pd.DataFrame(data2, columns=["sequence", "sub_sequence", "start"])

  sequence sub_sequence  start
0    abcde          cde      2
1    fghij           gh      1
2    klmno           no      3

Use zip and str.index in a list comprehension:

df['start'] = [seq.index(sub) for seq, sub in zip(df['sequence'], df['sub_sequence'])]
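Note that str.index raises a ValueError if a sub-sequence does not occur in its sequence. If that can happen in your data, one option is str.find, which returns -1 instead. A minimal sketch, assuming -1 is an acceptable marker for "not found"; the fourth row ("pqrst", "xyz") is a hypothetical addition to the sample data, included only to show the no-match case:

```python
import pandas as pd

# Sample data from the question plus one hypothetical row with no match.
df = pd.DataFrame({
    "sequence": ["abcde", "fghij", "klmno", "pqrst"],
    "sub_sequence": ["cde", "gh", "no", "xyz"],
})

# str.find returns -1 when the substring is absent,
# whereas str.index would raise ValueError for the last row.
df["start"] = [seq.find(sub) for seq, sub in zip(df["sequence"], df["sub_sequence"])]
print(df)
```

Whether to use -1 or to let the error surface depends on whether a missing sub-sequence is expected or indicates bad data.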

Or use DataFrame.apply along axis=1 with str.index:

df['start'] = df[['sequence', 'sub_sequence']].apply(lambda s: str.index(*s), axis=1)

Result:

  sequence sub_sequence  start
0    abcde          cde      2
1    fghij           gh      1
2    klmno           no      3
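
As for the error in the question: df.sequence.index does not refer to str.index. It resolves to the Series's index attribute (its row labels, here a RangeIndex), and calling that object is what fails. A small sketch reproducing this on the sample data; the names row_labels and error_message are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "sequence": ["abcde", "fghij", "klmno"],
    "sub_sequence": ["cde", "gh", "no"],
})

# Series.index is an attribute holding the row labels, not a search method.
row_labels = df["sequence"].index
print(type(row_labels).__name__)

# Calling the attribute as if it were a function reproduces the TypeError.
try:
    row_labels(df["sub_sequence"])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

This is why both approaches above go through Python's own str.index on each pair of values rather than through the Series attribute.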
