How to extract a limited number of lines after a specific keyword using Python

Question

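I have a text file, and I need to extract the first five lines of the paragraph in which a specified keyword appears.

I am able to find the keyword, but I cannot write out the next five lines starting from that keyword.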

mylines = []
with open('D:\\Tasks\\Task_20\\txt\\CV (4).txt', 'rt') as myfile:
    for line in myfile:
        mylines.append(line)
    for element in mylines:
        print(element, end='')
print(mylines[0].find("P"))

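If anyone has any idea how to do this, please help.

Sample input text file:

Philippine Partner Agency: ALL POWER STAFFING SOLUTIONS, INC.

Training Objectives:: To have international cultural exposure and hands-on experience in the field of hospitality management as a gateway to a meaningful hospitality career. To develop my hospitality management skills and become globally competitive.

Education Institution Name: SOUTHVILLE FOREIGN UNIVERSITY - PHILIPPINES Location Hom as Pinas City, Philippine Institution start date: (June 2007

Required output:

Training Objectives:: To have international cultural exposure and hands-on experience in the field of hospitality management as a gateway to a meaningful hospitality career. To develop my hospitality management skills and become globally competitive.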

#

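I have to search the text file for the Training Objectives keyword, and once it is found, only the 5 lines from that point should be written out.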

Answer 1

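If you just want to extract the entire "Training Objectives" block, look for the keyword and keep appending lines until you reach a blank line (or some other suitable marker, such as the next header).

(Edited to handle multiple files and keywords)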

def extract_block(filename, keywords):
    """Collect lines from a keyword line up to the next blank line."""
    mylines = []
    with open(filename) as myfile:
        save_flag = False
        for line in myfile:
            if any(line.startswith(kw) for kw in keywords):
                # A line starting with a keyword opens the block.
                save_flag = True
            elif line.strip() == '':
                # A blank line closes the block.
                save_flag = False
            if save_flag:
                mylines.append(line)
    return mylines

filenames = ['file1.txt', 'file2.txt', 'file3.txt']
keywords = ['keyword1', 'keyword2', 'keyword3']
for filename in filenames:
    block = extract_block(filename, keywords)

This assumes there is only one block in each file. If you are extracting multiple blocks from each file, it gets more complicated.
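For the multi-block case, one possible approach is to return a list of blocks instead of one flat list of lines. This is only a sketch: `extract_blocks` is a hypothetical helper, not part of the answer's code, and it assumes (as above) that a block starts at a line beginning with a keyword and ends at the next blank line. It takes any iterable of lines, so an open file object works too.

```python
def extract_blocks(lines, keywords):
    """Return a list of blocks; each block is a list of lines.

    Assumption: a block starts at a line that begins with one of the
    keywords and ends at the next blank line or at the end of the input.
    """
    blocks = []
    current = None  # the block being collected, or None when outside a block
    for line in lines:
        if any(line.startswith(kw) for kw in keywords):
            if current is not None:
                blocks.append(current)  # a new keyword closes the previous block
            current = [line]
        elif line.strip() == '':
            if current is not None:
                blocks.append(current)  # a blank line closes the block
                current = None
        elif current is not None:
            current.append(line)
    if current is not None:
        blocks.append(current)  # input ended while still inside a block
    return blocks
```

With a file you would call it as `with open(filename) as f: blocks = extract_blocks(f, keywords)`.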

If you really want exactly 5 lines every time, you can do something similar, but add a counter to count off your 5 lines.
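A minimal sketch of the counter idea, assuming the keyword line itself counts as the first of the 5 lines and that blank lines count too (adjust if you want something different). The function name `take_lines_from_keyword` and its parameters are made up for illustration.

```python
def take_lines_from_keyword(lines, keyword, count=5):
    """Keep `count` lines starting at each line that contains `keyword`."""
    taken = []
    remaining = 0  # how many more lines still need to be kept
    for line in lines:
        if remaining == 0 and keyword in line:
            remaining = count  # start counting at the keyword line
        if remaining > 0:
            taken.append(line)
            remaining -= 1
    return taken
```

For a file: `with open(filename) as f: result = take_lines_from_keyword(f, 'Training Objectives')`. If the file ends before 5 lines are available, you simply get fewer.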

Answer 2

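Try this:

with open('test.txt') as f:
    content = f.readlines()
# Indices of every line that contains the keyword (case-insensitive).
index = [x for x in range(len(content)) if 'training objectives' in content[x].lower()]
for num in index:
    # Print the matching line and the four lines after it.
    for lines in content[num:num+5]:
        print(lines, end='')  # each line already ends with a newline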

If you only have a few words (just to get the indices):

index = []
for i, line in enumerate(content):
    if 'hello' in line or 'there' in line:     # add your "or" + word here
        index.append(i)
print(index)

If you have many (just to get the indices):

words = ["hello", "there", "blink"]    # insert your words here
index = []
for i, line in enumerate(content):
    # any() records each line once, even if several words match it
    if any(word in line for word in words):
        index.append(i)
print(index)
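To go from the indices to the actual lines, the two steps can be combined. The following is a rough sketch rather than a drop-in solution: `lines_from_keywords` is a hypothetical name, matching is case-insensitive (as in the first snippet), and each match yields the matching line plus the lines after it, up to `count` in total.

```python
def lines_from_keywords(content, words, count=5):
    """Map each matching line index to the `count` lines starting there."""
    lowered = [w.lower() for w in words]
    result = {}
    for i, line in enumerate(content):
        if any(w in line.lower() for w in lowered):
            # Slicing past the end of the list is safe; it just returns fewer lines.
            result[i] = content[i:i + count]
    return result
```

Here `content` is the list returned by `f.readlines()`, so every line still carries its trailing newline.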

Answer 3

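It depends on where your \n characters are, but I put together a regex that might help. Here is an example of what my text looks like, in the variable st: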

In [254]: st                                                                                  

Out[254]: 'Philippine Partner Agency: ALL POWER STAFFING SOLUTIONS, INC.\n\nTraining Objectives::\nTo have international cultural exposure and hands-on experience \nin the field of hospitality management as a gateway to a meaningful hospitality career. \nTo develop my hospitality management skills and become globally competitive.\n\n\nEducation Institution Name: SOUTHVILLE FOREIGN UNIVERSITY - PHILIPPINES Location Hom as Pinas City, Philippine Institution start date: (June 2007\n'

import re

re.findall('Training Objectives:.*\n((?:.*\n){1,5})', st)   

Out[255]: ['To have international cultural exposure and hands-on experience \nin the field of hospitality management as a gateway to a meaningful hospitality career. \nTo develop my hospitality management skills and become globally competitive.\n\n\n']
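Note that the result above also picks up the trailing blank lines, because `.*\n` matches empty lines as well. If that is not wanted, one variation (a sketch only, with a made-up helper name `lines_after_heading`) is to use `.+\n` so the match stops at the first blank line:

```python
import re

def lines_after_heading(text, heading, max_lines=5):
    """Return up to `max_lines` non-blank lines that follow the heading line."""
    # `.+\n` requires at least one character per line, so an empty line ends
    # the match. A final line without a trailing newline will not be captured.
    pattern = re.escape(heading) + r'.*\n((?:.+\n){1,' + str(max_lines) + r'})'
    match = re.search(pattern, text)
    return match.group(1).splitlines() if match else []
```

Called as `lines_after_heading(st, 'Training Objectives')`, this returns the lines as a list without their newline characters.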


3 solutions

Solution 1
1 Accepted 2019-11-07 06:51:17

Solution 2
0 2019-11-07 06:31:48

Solution 3
0 2019-11-07 06:33:10
