How do you convert this text file into a dictionary? (Python)

I have a .txt file with the following content:

Areca Palm
2018-11-03 18:21:26
Tropical/sub-Tropical plant
Leathery leaves, mid to dark green
Moist and well-draining soil
Semi-shade/full shade light requirements
Water only when top 2 inches of soil is dry
Intolerant to root rot
Propagate by cuttings in water

Canary Date Palm 
2018-11-05 10:12:15
Semi-shade, full sun
Dark green leathery leaves
Like lots of water,but soil cannot be water-logged
Like to be root bound in pot

I want to convert this .txt file into a Python dictionary, and the output should look like this:

d = {'Areca Palm': ('2018-11-03 18:21:26', 'Tropical/sub-Tropical plant', 'Leathery leaves, mid to dark green', 'Moist and well-draining soil'..etc 'Canary Date Palm': ('2018-11-05 10:12:15', 'Semi-shade, full sun'...)

How can I do this?

The following code shows one way to do it: read the file with a very simple two-state state machine:

with open("data.in") as inFile:
    # Initialise dictionary and simple state machine.

    afterBlank = True
    myDict = {}

    # Process each line in turn.

    for line in inFile:
        line = line.strip()

        if afterBlank:
            # First non-blank after blank (or at file start) is key
            # (blanks after blanks are ignored).

            if line != "":
                key = line
                myDict[key] = []
                afterBlank = False
        else:
            # Subsequent non-blanks are additional lines for key
            # (blank after non-blank switches state).

            if line != "":
                myDict[key].append(line)
            else:
                afterBlank = True

# Dictionary holds lists, make into tuples if desired.

for key in myDict.keys():
    myDict[key] = tuple(myDict[key])

import pprint
pprint.pprint(myDict)

With your input data, this gives the following output (pprint produces more readable output than the standard Python print):

{'Areca Palm': ('2018-11-03 18:21:26',
                'Tropical/sub-Tropical plant',
                'Leathery leaves, mid to dark green',
                'Moist and well-draining soil',
                'Semi-shade/full shade light requirements',
                'Water only when top 2 inches of soil is dry',
                'Intolerant to root rot',
                'Propagate by cuttings in water'),
 'Canary Date Palm': ('2018-11-05 10:12:15',
                      'Semi-shade, full sun',
                      'Dark green leathery leaves',
                      'Like lots of water,but soil cannot be water-logged',
                      'Like to be root bound in pot')}
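
To try the same state machine without a data.in file on disk, the logic can be wrapped in a function that accepts any iterable of lines. This is only a minimal sketch: `parse_blocks` and `sample` are names introduced here for illustration and do not appear in the answer above. The sample has two consecutive blank lines and a trailing space after one key, to show that both are tolerated.

```python
def parse_blocks(lines):
    """Two-state parser: the first non-blank line after a blank is a key."""
    after_blank = True   # State: are we waiting for the next key?
    result = {}
    key = None

    for line in lines:
        line = line.strip()
        if after_blank:
            # Blank lines in this state are skipped, so repeated blanks are harmless.
            if line:
                key = line
                result[key] = []
                after_blank = False
        elif line:
            # Non-blank line inside a block: belongs to the current key.
            result[key].append(line)
        else:
            # Blank line ends the current block.
            after_blank = True

    # Convert the collected lists to tuples.
    return {k: tuple(v) for k, v in result.items()}


# Hypothetical in-memory sample (a shortened version of the question's data).
sample = [
    "Areca Palm\n",
    "2018-11-03 18:21:26\n",
    "Tropical/sub-Tropical plant\n",
    "\n",
    "\n",
    "Canary Date Palm \n",
    "2018-11-05 10:12:15\n",
    "Semi-shade, full sun\n",
]

d = parse_blocks(sample)
print(d)
```

Because the function only iterates over its argument, an open file object can be passed in place of the list.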

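Many parsing problems can be greatly simplified by writing a function that processes the file and yields one meaningful section at a time. The logic needed for that part is usually very simple, and it stays simple because the function does not care about any other details of your larger problem.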

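That step, in turn, simplifies the downstream code, which can focus on deconstructing one meaningful section at a time. That part can ignore the larger file-handling issues, so it stays simple as well.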

An illustration:

import sys

def get_paragraphs(path):
    par = []
    with open(path) as fh:         # The basic pattern tends to repeat:
        for line in fh:
            line = line.rstrip()
            if line:               # Store lines you want.
                par.append(line)
            elif par:              # Yield prior batch.
                yield par
                par = []
        if par:                    # Don't forget the last one.
            yield par

path = sys.argv[1]
d = {
    p[0] : tuple(p[1:])
    for p in get_paragraphs(path)
}
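
As a sketch of what "deconstructing one section at a time" might look like downstream, the example below parses each paragraph's second line into a `datetime`. Several things here are assumptions made for illustration rather than part of the answer: the `parse_plant` helper, the choice to store a `datetime` instead of the raw string, and a variant of `get_paragraphs` that takes an iterable of lines rather than a path so that the snippet runs without a file.

```python
from datetime import datetime


def get_paragraphs(lines):
    # Variant of the generator above that takes an iterable of lines
    # instead of a path (an assumption, to keep this snippet self-contained).
    par = []
    for line in lines:
        line = line.rstrip()
        if line:
            par.append(line)
        elif par:
            yield par
            par = []
    if par:
        yield par


def parse_plant(par):
    # Hypothetical downstream step: it sees one paragraph at a time.
    # Raises ValueError if the paragraph has fewer than two lines
    # or the second line is not a timestamp in the expected format.
    name, stamp, *notes = par
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return name, (when, tuple(notes))


# Shortened in-memory sample based on the question's data.
lines = [
    "Areca Palm\n",
    "2018-11-03 18:21:26\n",
    "Tropical/sub-Tropical plant\n",
    "\n",
    "Canary Date Palm \n",
    "2018-11-05 10:12:15\n",
    "Semi-shade, full sun\n",
]

plants = dict(parse_plant(p) for p in get_paragraphs(lines))
print(plants["Areca Palm"][0].year)
```

The per-paragraph function never needs to know about blank lines or end-of-file handling; that stays inside the generator.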

