
Detecting substring between strings with special characters

I have 6 types of strings (after performing strip() on them):

string_1= 'This is a \'working draft\' sequence. It currently consists of 10 contigs. Gaps between the contigsare represented as runs of N. The order of the piecesis believed to be correct as given, however the sizesof the gaps between them are based on estimates that haveprovided by the submittor.This sequence will be replacedby the finished sequence as soon as it is available andthe accession number will be preserved.\n???UPDATE FROM "This record contains 83 individual sequencing reads that have not been assembled intocontigs. Runs of N are used to separate the readsand the order in which they appear is completelyarbitrary. Low-pass sequence sampling is useful foridentifying clones that may be gene-rich and allowsoverlap relationships among clones to be deduced.However, it should not be assumed that this clonewill be sequenced to completion. In the event thatthe record is updated, the accession number willbe preserved."???'

string_2= '???INSERT information???\n\nPlasmid; n/a; 100% of reads'

string_3= '???INSERT information???\n\ngap of      100 bp'

string_x= "This is a 'working draft' sequence. It currently consists of 10 contigs. Gaps between the contigsare represented as runs of N. The order of the piecesis believed to be correct as given, however the sizesof the gaps between them are based on estimates that haveprovided by the submittor.This sequence will be replacedby the finished sequence as soon as it is available andthe accession number will be preserved."

string_y= 'Plasmid; n/a; 100% of reads'

string_z= 'gap of      100 bp'

I am trying to make the following checks succeed:

if string_x in string_1:
    print("true")

if string_y in string_2:
    print("true")

if string_z in string_3:
    print("true")

However, this does not work, even if I strip all the strings.

How should I go about doing this?

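First you need to understand what the str.strip() method does. It returns a copy of the string with leading and trailing characters removed. It never touches characters in the middle of the string, so the \n inside string_2 and string_3 is still there after stripping.

Syntax: str.strip([chars])

With no argument, whitespace is removed. Otherwise chars is treated as a set of characters to remove from both ends, not as a prefix or suffix.

Example 1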

>>> '   spacious   '.strip()
'spacious'
>>> "AABAA".strip("A")
'B'
>>> "ABBA".strip("AB")
''
>>> "ABCABBA".strip("AB")
'C'

Example 2

>>> 'www.example.com'.strip('cmowz.')  # strips any of c, m, o, w, z and '.' from both ends
'example'

For more, you can read the official documentation.
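To tie this back to the question, here is a minimal sketch using string_2 and string_y exactly as posted. It only illustrates that strip() leaves the interior of the string alone and that the in operator already succeeds when the substring is present character for character; the variable name stripped is introduced here for illustration.

```python
string_2 = '???INSERT information???\n\nPlasmid; n/a; 100% of reads'
string_y = 'Plasmid; n/a; 100% of reads'

# strip() with no argument only trims whitespace at the two ends,
# so the interior "\n\n" is still present afterwards.
stripped = string_2.strip()

print(stripped == string_2)   # nothing at the ends to trim
print('\n\n' in stripped)     # interior newlines survive
print(string_y in stripped)   # exact substring match
```

If the check fails on the real data, the two strings most likely differ somewhere in the middle (for example in whitespace), which strip() cannot fix.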

You can try to bring your strings into one common format before comparing them:

# Keep only letters and spaces so both strings are compared in the same format.
string_1 = ''.join([x for x in string_1 if x.isalpha() or x == ' '])
string_x = ''.join([x for x in string_x if x.isalpha() or x == ' '])
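A gentler variant of the same idea, sketched under the assumption that the real strings differ only in whitespace (runs of spaces, tabs or newlines), which is not visible in the strings as posted. The helper names normalize and contains_normalized, and the single-space version of string_z, are hypothetical and used only for illustration.

```python
def normalize(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) to one space."""
    return ' '.join(text.split())


def contains_normalized(needle, haystack):
    """Substring test that ignores differences in whitespace only."""
    return normalize(needle) in normalize(haystack)


string_3 = '???INSERT information???\n\ngap of      100 bp'
string_z = 'gap of 100 bp'  # hypothetical variant with single spaces

print(string_z in string_3)                     # False: the spacing differs
print(contains_normalized(string_z, string_3))  # True: whitespace is ignored
```

Unlike the letters-and-spaces filter, this keeps digits and punctuation, so values such as "100 bp" and "10 bp" remain distinguishable.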
