Extracting a paragraph from aeronautical text using Python

Question

I am new to Python and I am trying to extract a paragraph from a text file. The text is:

<stx>(FPL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)
<etx>
<stx>GG
(APL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/110928)
<etx>
<stx>
(CNL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)<etx>

I want to extract the whole paragraph, from (FPL through -DOF/120928):

(FPL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)

I use the following code, but it extracts only the first line, FPL-ACF66-IN:

import re

with open('FPL.txt', 'r', encoding = 'utf-8') as f:
        works = f.read()

        pattern = 'FPL'+'.*'
        w =re.findall(pattern, works, re.I)
        for work in w:
            print(work)

What am I doing wrong?

Answer 1

While you can certainly use a regular expression such as the following (mind the modifiers: the dot has to match newlines)

\(FPL.+?-DOF/120928\)

this looks to me like some sort of XML file, so why not use a parser instead?

Snippet in Python:

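import re

# re.DOTALL lets '.' match newlines, so the lazy '.+?' can span several lines
rx = re.compile(r'\(FPL.+?-DOF/120928\)', re.DOTALL)

with open("test.txt") as fp:
    data = fp.read()

try:
    paragraph = rx.search(data).group(0)
except AttributeError:
    # search() returned None, i.e. the pattern was not found
    paragraph = None

print(paragraph)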

This yields

(FPL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)
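As for why the code in the question returns only the first line: by default `.` does not match a newline, so `FPL.*` stops at the end of the line. The sketch below contrasts the two behaviours. It uses an inline `sample` string in place of the file, and the shortened pattern `\(FPL.+?\)` (ending at the first closing parenthesis rather than at a specific date) is an assumption made for illustration, not part of the answer above.

```python
import re

# Inline sample standing in for the file contents from the question.
sample = """<stx>(FPL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)
<etx>"""

# Without re.DOTALL, '.' stops at the first newline.
first_line_only = re.findall(r'FPL.*', sample)

# With re.DOTALL, '.' also matches newlines; the lazy '.+?' ends at the first ')'.
whole_paragraph = re.findall(r'\(FPL.+?\)', sample, re.DOTALL)

print(first_line_only)
print(whole_paragraph[0])
```

The first call reproduces the behaviour described in the question; the second returns the complete six-line message.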

If you want to have all paragraphs here, you could use

<stx>(.+?)<etx>

Or even, as a complete script:

import re

rx = re.compile(r'<stx>(.+?)<etx>', re.DOTALL)

with open("test.txt") as fp:
    data = fp.read()
    paragraphs = (m.group(1) for m in rx.finditer(data))

    for p in paragraphs:
        print(p)

And loop over them afterwards, as the script above does for <stx> and <etx>.
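If the date in the DOF field is not known in advance, one possible variation is to match on the message type instead. This is only a sketch under some assumptions: each message is wrapped in a single pair of parentheses with no nested parentheses, and the type is three uppercase letters (FPL, APL, CNL as in the sample). The names `MESSAGE_RX` and `extract_messages` are hypothetical helpers, not part of any library.

```python
import re

# Sample mirroring the data in the question (normally read from a file).
sample = """<stx>(FPL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)
<etx>
<stx>GG
(APL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/110928)
<etx>
<stx>
(CNL-ACF66-IN
-EH30/H-S/C
-LGKR0900
-N0100VFR KRK ARA
-LGTG0300
-DOF/120928)<etx>"""

# '(' + three-letter type + '-' + anything that is not a parenthesis + ')'.
# The class [^()] matches newlines too, so re.DOTALL is not needed here.
MESSAGE_RX = re.compile(r'\(([A-Z]{3})-[^()]+\)')


def extract_messages(text, wanted_type=None):
    """Return parenthesised messages, optionally filtered by type (hypothetical helper)."""
    return [
        m.group(0)
        for m in MESSAGE_RX.finditer(text)
        if wanted_type is None or m.group(1) == wanted_type
    ]


fpl_messages = extract_messages(sample, "FPL")
print(fpl_messages[0])
print(len(extract_messages(sample)))
```

Calling `extract_messages(sample, "FPL")` returns only the FPL message, while omitting the type returns all three messages in the sample.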

Answer 1 | 0 | 2017-10-24 09:34:39
