简体   繁体   中英

extract content between nested parenthesis

I want to extract between ObjectUnionOf( and the first closed parenthese coming after it:

<http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion>

From

EquivalentClasses(<http://www.ifomis.org/bfo/1.1/spanTemporalRegion> ObjectUnionOf(<http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion>))

I tried:

content=content[content.find("ObjectUnionOf(")+1:content.find(")")]

but it doesn't work

Using Regex:

import re
s = "EquivalentClasses(<http://www.ifomis.org/bfo/1.1/spanTemporalRegion> ObjectUnionOf(<http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion>))"
m = re.search("ObjectUnionOf\((?P<links>.*?)\)", s)
if m:
    print( m.group('links') )

Output:

<http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion>

Use re.findall :

import re

txt = '''EquivalentClasses(<http://www.ifomis.org/bfo/1.1/spanTemporalRegion> ObjectUnionOf(<http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion>))'''

print('\n'.join(re.findall(r'ObjectUnionOf\((.*)\)\)', txt)))

# <http://www.ifomis.org/bfo/1.1/spanScatteredTemporalRegion> 
# <http://www.ifomis.org/bfo/1.1/spanConnectedTemporalRegion> 

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM