
How do I extract only the full form of an acronym given in brackets by matching each capital letter?

input = "The process maps are similar to Manual Excellence Process Framework (MEPF)"

output = Manual Excellence Process Framework (MEPF)

I want to write a Python script that takes a piece of text and extracts the full form of an acronym given inside brackets. Here the acronym is (MEPF) and its full form is Manual Excellence Process Framework. I want to keep only the full form, found by matching each uppercase letter inside the brackets.

My idea: whenever an acronym appears inside brackets, map each capital letter to a word, starting from the last letter. For (MEPF), F matches the last word before the bracket (Framework), then P (Process), then E (Excellence), and finally M (Manual), so the final output is the full form (Manual Excellence Process Framework). Could you try it this way? That would be really helpful.

I took your question as a challenge. I am a beginner, so I hope this answer is useful to you, and thank you for the question:

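a = "process maps are similar to Manual Excellence Process Framework (MEPF)"

# The acronym is the text between the first pair of brackets.
ind = a.index('(')
ind2 = a.index(')')
acr = a[ind+1:ind2]

# Keep every word before the bracket whose first letter appears in the acronym.
# Note: letter order is not checked, so an unrelated capitalised word
# such as "Please" would also be picked up.
full = ''
for i in a[:ind].split():
    if len(i) > 1 and i[0] in acr:
        full = full + i + ' '
print(full.strip())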
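
The right-to-left matching described in the question can be sketched as follows. This is only a minimal illustration, not part of the answer above: the helper name `full_form` is made up here, and it assumes the text contains exactly one bracketed acronym with the full form directly before it (a missing bracket raises `ValueError` from `str.index`).

```python
def full_form(text):
    """Return the words before '(ACR)' whose initials spell ACR, else None."""
    start = text.index('(')
    end = text.index(')', start)
    acronym = text[start + 1:end]
    words = text[:start].split()

    matched = []
    # Walk the acronym from its last letter and the words from the last word.
    for letter in reversed(acronym):
        if not words or words[-1][0] != letter:
            return None  # a letter has no matching word in the expected position
        matched.append(words.pop())

    # Words were collected back to front, so restore the reading order.
    return ' '.join(reversed(matched))


print(full_form("The process maps are similar to Manual Excellence Process Framework (MEPF)"))
# Manual Excellence Process Framework
```

Unlike the loop above, this version checks the letters in order, so a stray capitalised word earlier in the sentence is not included.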

Using a simple regex and a bit of post-processing:

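import re

a = "I like International Business Machines (IBM). The Manual Excellence Process Framework (MEPF)"

# Each match is a pair: (text before the bracket, acronym inside the bracket).
m = re.findall(r'([^)]+) \(([A-Z]+)\)', a)
# Keep the last len(acronym) words of the text preceding each bracket.
out = {acr: ' '.join(text.split()[-len(acr):]) for text, acr in m}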

out

output:

{'IBM': 'International Business Machines',
 'MEPF': 'Manual Excellence Process Framework'}
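
To see what the regex and the post-processing each contribute, it may help to print the intermediate values. The snippet below is just an illustrative sketch that reuses the same pattern; the names `pairs`, `before` and `acronym` are chosen here for readability.

```python
import re

a = "I like International Business Machines (IBM). The Manual Excellence Process Framework (MEPF)"

# Group 1: the text before the bracket (any run without ')');
# group 2: the capital letters inside the bracket.
pairs = re.findall(r'([^)]+) \(([A-Z]+)\)', a)
print(pairs)
# [('I like International Business Machines', 'IBM'),
#  ('. The Manual Excellence Process Framework', 'MEPF')]

# Post-processing: keep only the last len(acronym) words of the preceding text.
for before, acronym in pairs:
    print(acronym, '->', before.split()[-len(acronym):])
```

The first group still carries leftover text such as "I like" or ". The", which is why the slice `[-len(acronym):]` is needed.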

If you want to check that the acronym actually matches the words:

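# Keep an entry only if each word's first letter equals the matching acronym letter.
out = {acr: ' '.join(text.split()[-len(acr):]) for text, acr in m
       if all(w[0] == c for w, c in zip(text.split()[-len(acr):], acr))
       }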

Example:

a = "No match (ABC). I like International Business Machines (IBM). The Manual Excellence Process Framework (MEPF)."

m = re.findall(r'([^)]+) \(([A-Z]+)\)', a)
{acr: ' '.join(text.split()[-len(acr):]) for text, acr in m
 if all(w[0] == c for w, c in zip(text.split()[-len(acr):], acr))
}

# {'IBM': 'International Business Machines',
#  'MEPF': 'Manual Excellence Process Framework'}
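
One edge case worth noting: `zip` stops at the shorter input, so when fewer words precede the bracket than the acronym has letters, the check above can still pass (for instance "Apple (AB)" would yield `{'AB': 'Apple'}`). A possible way to guard against this, sketched here with a hypothetical helper name `expand_acronyms`, is to also require the word count to equal the acronym length:

```python
import re


def expand_acronyms(text):
    """Map each bracketed acronym to the words before it whose initials spell it."""
    result = {}
    for before, acronym in re.findall(r'([^)]+) \(([A-Z]+)\)', text):
        words = before.split()[-len(acronym):]
        # Require one word per letter, then compare initials in order.
        if len(words) == len(acronym) and all(w[0] == c for w, c in zip(words, acronym)):
            result[acronym] = ' '.join(words)
    return result


print(expand_acronyms("Apple (AB). I like International Business Machines (IBM)."))
# {'IBM': 'International Business Machines'}
```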
