
Extracting a section of a string in python with limitations

I have a string output that looks like this:

Distance AAAB: ,0.13634,0.13700,0.00080,0.00080,-0.00066,.00001,
Distance AAAC: ,0.12617,0.12680,0.00080,0.00080,-0.00063,,
Distance AAAD: ,0.17045,0.16990,0.00080,0.00080,0.00055,,
Distance AAAE: ,0.09330,0.09320,0.00080,0.00080,0.00010,,
Distance AAAF: ,0.21048,0.21100,0.00080,0.00080,-0.00052,,
Distance AAAG: ,0.02518,0.02540,0.00040,0.00040,-0.00022,,
Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,
Distance AAAI: ,0.10811,0.10860,0.00080,0.00070,-0.00049,,
Distance AAAJ: ,0.02430,0.02400,0.00200,0.00200,0.00030,,
Distance AAAK: ,0.09449,0.09400,0.00200,0.00100,0.00049,,
Distance AAAL: ,0.07689,0.07660,0.00050,0.00050,0.00029,

What I want to do is extract a specific set of data out of this block, for example only Distance AAAH like so:

Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,

The measurements will always begin with Distance AAA*: with the star being the only character that will change.

Complications: This needs to be generic, because I have a lot of different data sets, so Distance AAAH might not always be followed by Distance AAAI or preceded by Distance AAAG, since the measurements for different items vary. I also can't rely on len(), because the last field can sometimes be blank (as it is with Distance AAAH) or filled (as with Distance AAAB). And I don't think I can use .find(), because I need all of the numbers following Distance AAAH.

I am still very new and I tried my best to find a solution similar to this problem, but have not had much luck.

You could use the re module, and wrapping the search in a function is convenient.

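import re

def SearchDistance(pattern, text):
    # Let each space in the label match any whitespace character.
    pattern = pattern.replace(' ', r'\s')
    # '.' does not match a newline, so '.+' captures the rest of that line.
    print(re.findall(r'{0}.+'.format(pattern), text))

# a holds the full output string from the question.
SearchDistance('Distance AAAH', a)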

Output:

['Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,']
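If you would rather get the line back as a plain string (or None when the label is missing) instead of printing a list, something along these lines might work. This is only a sketch: the name find_distance and the three-line sample are made up for illustration, and re.escape is used so the label is matched literally.

```python
import re

def find_distance(label, text):
    """Return the whole line that starts with `label`, or None if absent."""
    # re.escape makes the label literal; with re.MULTILINE, ^ and $ anchor
    # at the start and end of each line rather than of the whole string.
    match = re.search(r'^{0}:.*$'.format(re.escape(label)), text, re.MULTILINE)
    return match.group(0) if match else None

# Hypothetical sample: a few lines copied from the question's output.
sample = (
    "Distance AAAG: ,0.02518,0.02540,0.00040,0.00040,-0.00022,,\n"
    "Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,\n"
    "Distance AAAI: ,0.10811,0.10860,0.00080,0.00070,-0.00049,,\n"
)

print(find_distance('Distance AAAH', sample))
```

Because the match is anchored to a line start and stops at the line end, it does not matter which measurements come before or after, or whether the last field is blank.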

You can also search the text line by line with this script:

#fullText = YOUR STRING
text = fullText.splitlines()
for line in text:
    if line.startswith('Distance AAAH:'):
        print(line)

Output: Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,
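Since the question mentions needing all of the numbers after the label, here is a possible follow-up step once the line has been found. It is a sketch under the assumption that the fields are always comma-separated and that blank fields should simply be dropped; parse_distance_line is a hypothetical helper name.

```python
def parse_distance_line(line):
    """Split 'Distance AAAH: ,0.1,...,,' into (label, [floats])."""
    # Everything before the first ':' is the label, the rest holds the fields.
    label, _, rest = line.partition(':')
    # Empty fields (the leading one and any blank trailing ones) are skipped,
    # so it does not matter whether the last measurement is filled in.
    values = [float(field) for field in rest.split(',') if field.strip()]
    return label.strip(), values

label, values = parse_distance_line(
    'Distance AAAH: ,0.11404,0.11450,0.00120,0.00110,-0.00046,,')
print(label, values)
```

The same function handles a line like Distance AAAB, where the last field is filled in: it just returns one more value in the list.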


 