
How to make a dictionary from a text file with Python

My file looks like this:

aaien 12 13 39
aan 10
aanbad 12 13 14 57 58 38
aanbaden 12 13 14 57 58 38
aanbeden 12 13 14 57 58 38
aanbid  12 13 14 57 58 39
aanbidden 12 13 14 57 58 39
aanbidt 12 13 14 57 58 39
aanblik 27 28
aanbreken 39
...

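I want to create a dictionary where the key is the word (for example 'aaien') and the value is the list of numbers next to it. So it should look like this: {'aaien':['12,13,39'],'aan':['10']}

This code doesn't seem to work.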

document = open('LIWC_words.txt', 'r')
liwcwords = document.read()
dictliwc = {}
for line in liwcwords:
    k, v = line.strip().split(' ')
    answer[k.strip()] = v.strip()

liwcwords.close()

Python gives this error:

ValueError: need more than 1 value to unpack

You are splitting each line into a list of words, but unpacking that list into just a key and a value.

This will work:
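To make the unpacking problem concrete, here is a minimal sketch that uses one line from the sample file as a hard-coded string. The variable names are illustrative only, and the exact wording of the error message differs between Python versions.

```python
# One line from the sample file, used here as an illustrative input.
line = "aaien 12 13 39\n"

parts = line.strip().split(' ')
print(parts)  # ['aaien', '12', '13', '39']

# Four items cannot be unpacked into two names, so this raises ValueError.
try:
    k, v = parts
    unpacked = True
except ValueError as exc:
    unpacked = False
    print("unpacking failed:", exc)
```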

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        line = line.split()
        if not line:  # empty line?
            continue
        answer[line[0]] = line[1:]

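Note that you don't need to give .split() an argument; without arguments, it splits on any run of whitespace and strips the result for you. That saves you from having to call .strip() explicitly.

An alternative is to split only on the first whitespace: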
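As a small illustration of that difference, the sketch below compares the two calls on the 'aanbid' line from the sample file, which has two spaces after the word. The variable names are placeholders for this example.

```python
# Line taken from the sample file; note the two spaces after the word.
line = "aanbid  12 13 14 57 58 39\n"

with_space = line.split(' ')  # splits on every single space character
no_args = line.split()        # splits on runs of whitespace, drops the newline

print(with_space)  # ['aanbid', '', '12', '13', '14', '57', '58', '39\n']
print(no_args)     # ['aanbid', '12', '13', '14', '57', '58', '39']
```

With an explicit ' ' separator, the double space produces an empty string and the trailing newline stays attached to the last item; the argument-free call avoids both.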

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        if line.strip():  # non-empty line?
            key, value = line.split(None, 1)  # None means 'all whitespace', the default
            answer[key] = value.split()

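The second argument to .split() limits the number of splits made, guaranteeing that at most 2 elements are returned, which makes it possible to unpack the result in the assignment to key and value.

Either method results in: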
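The following sketch shows what the limited split returns for a single line from the sample file. It also shows a caveat that rests on an assumption about the input: if a line held only a word and no numbers (the sample data has no such line), the split would return one element and the two-name unpacking would fail.

```python
line = "aanbad 12 13 14 57 58 38\n"

# Second argument 1: split once, on the first run of whitespace.
key, value = line.split(None, 1)
print(repr(key))      # 'aanbad'
print(repr(value))    # '12 13 14 57 58 38\n'
print(value.split())  # ['12', '13', '14', '57', '58', '38']

# Caveat: a line holding only a word yields a single element,
# so unpacking it into two names would raise ValueError.
print("aan\n".split(None, 1))  # ['aan']
```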

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}

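If you still see only one key with the rest of the file as its (split) value, your input file may be using a non-standard line separator. Open the file with universal line ending support by adding the U character to the mode (this applies to Python 2; in Python 3, text mode handles universal newlines by default, and the U flag was removed in Python 3.11):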

with open('LIWC_words.txt', 'rU') as document:

> liwcwords = document.read()
> dictliwc = {}
> for line in liwcwords:

Here you are iterating over a string, which is not what you want. Try document.readlines(). Here is another solution.

from pprint import pprint

with open('LIWC_words.txt') as fd:
    d = {}
    for i in fd:
        entry = i.split()
        if entry:  # skip blank lines
            d[entry[0]] = entry[1:]

pprint(d)

This is what the output looks like:

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}
