Finding the words that follow two given words in a txt file in Python

Question

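I am building a dictionary whose keys are tuples of two consecutive words from a txt file, and whose value for each key is the list of words found directly after that key. For example: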

>>> with open('alice.txt') as f:
...     d = associated_words(f)
...
>>> d[('among', 'the')]
['people', 'party.', 'trees,', 'distant', 'leaves,', 'trees', 'branches,', 'bright']

My code so far is below, but it is not finished. Can anyone help?

def associated_words(f):
    from collections import defaultdict    
    d = defaultdict(list)
    with open('alice.txt', 'r') as f:
        lines = f.read().replace('\n', '') 

    a, b, c = [], [], []     
    lines.replace(",", "").replace(".", "")
    lines = line.split(" ")
    for (i, word) in enumerate(lines):
        d['something to replace'].append(lines[i+2])

Answer 1

Something like this? (It should be easy to adapt...)

from pathlib import Path
from collections import defaultdict

DATA_PATH = Path(__file__).parent / '../data/alice.txt'

def next_word(fh):
    '''
    a generator that returns the next word from the file; with special
    characters removed; lower case.
    '''
    transtab = str.maketrans(',.`:;()?!—', '          ') # replace unwanted chars
    for line in fh:  # iterate the file lazily instead of loading all lines with readlines()
        for word in line.translate(transtab).split():
            yield word.lower()

def handle_triplet(dct, triplet):
    '''
    add a triplet to the dictionary dct
    '''
    dct[(triplet[0], triplet[1])].append(triplet[2])

dct = defaultdict(list) # dictionary that defaults to []

with DATA_PATH.open('r') as fh:
    generator = next_word(fh)
    triplet = (next(generator), next(generator),  next(generator))
    handle_triplet(dct, triplet)
    for word in generator:
        triplet = (triplet[1], triplet[2], word)
        handle_triplet(dct, triplet)

print(dct)

Output (excerpt...; not run on the whole text)

defaultdict(<class 'list'>, {
    ('enough', 'under'): ['her'], ('rattle', 'of'): ['the'],
    ('suppose', 'they'): ['are'], ('flung', 'down'): ['his'],
    ('make', 'with'): ['the'], ('ring', 'and'): ['begged'],
    ('taken', 'his'): ['watch'], ('could', 'show'): ['you'],
    ('said', 'tossing'): ['his'], ('a', 'bottle'): ['marked', 'they'],
    ('dead', 'silence'): ['instantly', 'alice', "'it's"], ...
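
For reference, the same idea can be wrapped in the `associated_words(f)` interface the question asks for. This is only a minimal sketch, not the answerer's code: it assumes the whole file fits in memory and that words are split on whitespace only, so punctuation stays attached (as in the question's sample output, e.g. `'party.'`). The `sample` text is a hypothetical stand-in for `alice.txt`.

```python
from collections import defaultdict
import io


def associated_words(f):
    """Map each pair of consecutive words to the list of words that follow it."""
    words = f.read().split()  # split on any whitespace, including newlines
    d = defaultdict(list)
    # zip stops at the shortest input, so the last two words never become a key
    for first, second, following in zip(words, words[1:], words[2:]):
        d[(first, second)].append(following)
    return d


# Hypothetical sample text standing in for alice.txt
sample = io.StringIO("among the trees and among the leaves\namong the trees")
d = associated_words(sample)
print(d[('among', 'the')])
```

To strip punctuation or lower-case the words, the `translate` table from `next_word` above could be applied to the text before splitting.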

Answer 2

Suppose your file looks like this:

each them theirs tree life what not hope

Code:

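# Split each line into words; the context manager closes the file afterwards
with open('test.txt') as fh:
    lines = [line.split() for line in fh]

d = {}
for each in lines:
    if len(each) >= 2:  # skip lines too short to form a key
        # key: the first two words of the line; value: the remaining words
        d[(each[0], each[1])] = each[2:]
print(d)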
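
Note that the code above uses only the first two words of each line as a key. If every consecutive pair should become a key, including pairs that span a line break, a sliding window is one option. The following is a rough sketch under that assumption; the function name `associated_words_streaming` and the sample lines are hypothetical and not part of the original answer.

```python
from collections import defaultdict, deque


def associated_words_streaming(lines):
    """Build the pair -> following-words mapping one line at a time."""
    d = defaultdict(list)
    window = deque(maxlen=2)  # holds the two most recent words
    for line in lines:
        for word in line.split():
            if len(window) == 2:
                # the current word directly follows the pair in the window
                d[tuple(window)].append(word)
            window.append(word)  # the oldest word drops out automatically
    return d


# Hypothetical input; an open file object can be passed the same way
sample_lines = ["each them theirs tree", "life what not hope"]
d = associated_words_streaming(sample_lines)
print(d[('each', 'them')])
print(d[('theirs', 'tree')])
```

Because the window persists across lines, the pair `('theirs', 'tree')` at the end of the first line is followed by `'life'` from the second.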

Solution 1
0 2015-07-29 16:54:58

Solution 2
0 2015-07-29 17:13:27