Append items to dictionary Python

I am trying to write a function in Python that opens a file and parses it into a dictionary. The first item in the list block should be the key for each entry in the dictionary data, and the value should be the rest of block without that first item. For some reason, when I run the following function, it parses the file incorrectly. I have provided the output below. How can I parse it the way I described above? Any help would be greatly appreciated.

Function:

def parseData() :
    filename="testdata.txt"
    file=open(filename,"r+")

    block=[]
    for line in file:
        block.append(line)
        if line in ('\n', '\r\n'):
            album=block.pop(1)
            data[block[1]]=album
            block=[]
    print data

Input:

Bob Dylan
1966 Blonde on Blonde
-Rainy Day Women #12 & 35
-Pledging My Time
-Visions of Johanna
-One of Us Must Know (Sooner or Later)
-I Want You
-Stuck Inside of Mobile with the Memphis Blues Again
-Leopard-Skin Pill-Box Hat
-Just Like a Woman
-Most Likely You Go Your Way (And I'll Go Mine)
-Temporary Like Achilles
-Absolutely Sweet Marie
-4th Time Around
-Obviously 5 Believers
-Sad Eyed Lady of the Lowlands

Output:

{'-Rainy Day Women #12 & 35\n': '1966 Blonde on Blonde\n',
 '-Whole Lotta Love\n': '1969 II\n', '-In the Evening\n': '1979 In Through the Outdoor\n'}
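
The unexpected keys follow from how list.pop shifts the remaining items. A minimal sketch, using a shortened in-memory block instead of testdata.txt (the three-line list is an assumption for illustration):

```python
# Shortened stand-in for one block read from the file (illustrative data).
block = ["Bob Dylan\n", "1966 Blonde on Blonde\n", "-Rainy Day Women #12 & 35\n"]

# pop(1) removes and returns the element at index 1, the album line.
album = block.pop(1)

# After the pop the list has shifted: index 0 is still the artist,
# but index 1 is now the first song rather than the album.
key_used_in_question = block[1]
artist = block[0]

print(repr(album))
print(repr(key_used_in_question))
print(repr(artist))
```

This matches the first entry of the output above: the first song becomes the key and the album becomes the value, while the artist at index 0 is never used. Note also that data is never initialised inside parseData, so it has to exist as a global dict for the function to run at all.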

You can use groupby to group the lines, using the empty lines as delimiters. Use a defaultdict(list) to handle repeated keys: for each group returned by groupby, take the first line as the key and extend that key's list with the remaining lines.

from itertools import groupby
from collections import defaultdict
from pprint import pprint as pp

d = defaultdict(list)
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        # if k is True we have a section
        if k:
            # get the key "k", which is the first line of each
            # section; v will be the remaining lines
            k, *v = val
            # create the key, or add to the existing key/value pairing
            d[k].extend(map(str.rstrip, v))

pp(d)

Output:

{'Bob Dylan\n': ['1966 Blonde on Blonde',
                 '-Rainy Day Women #12 & 35',
                 '-Pledging My Time',
                 '-Visions of Johanna',
                 '-One of Us Must Know (Sooner or Later)',
                 '-I Want You',
                 '-Stuck Inside of Mobile with the Memphis Blues Again',
                 '-Leopard-Skin Pill-Box Hat',
                 '-Just Like a Woman',
                 "-Most Likely You Go Your Way (And I'll Go Mine)",
                 '-Temporary Like Achilles',
                 '-Absolutely Sweet Marie',
                 '-4th Time Around',
                 '-Obviously 5 Believers',
                 '-Sad Eyed Lady of the Lowlands'],
 'Led Zeppelin\n': ['1979 In Through the Outdoor',
                    '-In the Evening',
                    '-South Bound Saurez',
                    '-Fool in the Rain',
                    '-Hot Dog',
                    '-Carouselambra',
                    '-All My Love',
                    "-I'm Gonna Crawl",
                    '1969 II',
                    '-Whole Lotta Love',
                    '-What Is and What Should Never Be',
                    '-The Lemon Song',
                    '-Thank You',
                    '-Heartbreaker',
                    "-Living Loving Maid (She's Just a Woman)",
                    '-Ramble On',
                    '-Moby Dick',
                    '-Bring It on Home']}
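
To see what groupby is doing with the blank lines, here is a minimal sketch that runs the same key function over a short in-memory list rather than a file (the sample lines are an assumption, trimmed from the input above):

```python
from itertools import groupby

# Illustrative in-memory lines standing in for the file (assumed sample data).
lines = [
    "Bob Dylan\n",
    "1966 Blonde on Blonde\n",
    "-I Want You\n",
    "\n",
    "Led Zeppelin\n",
    "1969 II\n",
    "-Ramble On\n",
]

# The key function is True for non-blank lines and False for blank ones,
# so groupby yields alternating runs of content lines and separator lines.
runs = [
    (is_content, [line.rstrip() for line in group])
    for is_content, group in groupby(lines, lambda x: x.strip() != "")
]

for is_content, group in runs:
    print(is_content, group)
```

Each True run is one artist/album section and each False run is the blank separator, which is why the loop only processes groups where the key is True.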

Python 2 does not support starred unpacking, so the unpacking is slightly different:

with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, v = next(val), val
            d[k].extend(map(str.rstrip, v))

If you want to keep the newlines, remove the map(str.rstrip, ...) call and extend with v directly.

If you want the albums and songs stored separately for each artist:

from itertools import groupby
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
with open("file.txt") as f:
    for k, val in groupby(f, lambda x: x.strip() != ""):
        if k:
            k, alb, songs = next(val),next(val), val
            d[k.rstrip()][alb.rstrip()] = list(map(str.rstrip, songs))

from pprint import pprint as pp

pp(d)

Output:

{'Bob Dylan': {'1966 Blonde on Blonde': ['-Rainy Day Women #12 & 35',
                                         '-Pledging My Time',
                                         '-Visions of Johanna',
                                         '-One of Us Must Know (Sooner or '
                                         'Later)',
                                         '-I Want You',
                                         '-Stuck Inside of Mobile with the '
                                         'Memphis Blues Again',
                                         '-Leopard-Skin Pill-Box Hat',
                                         '-Just Like a Woman',
                                         '-Most Likely You Go Your Way '
                                         "(And I'll Go Mine)",
                                         '-Temporary Like Achilles',
                                         '-Absolutely Sweet Marie',
                                         '-4th Time Around',
                                         '-Obviously 5 Believers',
                                         '-Sad Eyed Lady of the Lowlands']},
 'Led Zeppelin': {'1969 II': ['-Whole Lotta Love',
                              '-What Is and What Should Never Be',
                              '-The Lemon Song',
                              '-Thank You',
                              '-Heartbreaker',
                              "-Living Loving Maid (She's Just a Woman)",
                              '-Ramble On',
                              '-Moby Dick',
                              '-Bring It on Home'],
                  '1979 In Through the Outdoor': ['-In the Evening',
                                                  '-South Bound Saurez',
                                                  '-Fool in the Rain',
                                                  '-Hot Dog',
                                                  '-Carouselambra',
                                                  '-All My Love',
                                                  "-I'm Gonna Crawl"]}}

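Once the data is in this nested artist -> album -> songs shape, lookups are plain dict indexing. A small sketch of how it might be used, on a hand-written dict of the same shape (the sample contents and the names song_counts and titles are assumptions for illustration):

```python
# Small hand-written dict in the same shape as the output above (assumed sample).
d = {
    "Bob Dylan": {"1966 Blonde on Blonde": ["-I Want You", "-Just Like a Woman"]},
    "Led Zeppelin": {
        "1969 II": ["-Ramble On", "-Moby Dick"],
        "1979 In Through the Outdoor": ["-Hot Dog"],
    },
}

# Count songs per artist across all of that artist's albums.
song_counts = {
    artist: sum(len(songs) for songs in albums.values())
    for artist, albums in d.items()
}

# Strip the leading "-" marker from each song title of one album.
titles = [song.lstrip("-") for song in d["Led Zeppelin"]["1969 II"]]

print(song_counts)
print(titles)
```
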
I guess this is what you want?

Even if this is not the format you wanted, there are a few things you might learn from the answer:

And SE does not like a list being continued by code...

#!/usr/bin/env python

"""Parse text files with songs, grouped by album and artist."""


def add_to_data(data, block):
    """
    Parameters
    ----------
    data : dict
    block : list

    Returns
    -------
    dict
    """
    artist = block[0]
    album = block[1]
    songs = block[2:]
    if artist in data:
        data[artist][album] = songs
    else:
        data[artist] = {album: songs}
    return data


def parseData(filename='testdata.txt'):
    """
    Parameters
    ----------
    filename : string
        Path to a text file.

    Returns
    -------
    dict
    """
    data = {}
    with open(filename) as f:
        block = []
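        for line in f:
            line = line.strip()
            if line == '':
                # A blank line closes the current block; skip empty blocks so
                # repeated or trailing blank lines do not raise IndexError.
                if block:
                    data = add_to_data(data, block)
                block = []
            else:
                block.append(line)
        # Flush the last block if the file does not end with a blank line.
        if block:
            data = add_to_data(data, block)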
    return data

if __name__ == '__main__':
    data = parseData()
    import pprint
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(data)

which gives:

{   'Bob Dylan': {   '1966 Blonde on Blonde': [   '-Rainy Day Women #12 & 35',
                                                  '-Pledging My Time',
                                                  '-Visions of Johanna',
                                                  '-One of Us Must Know (Sooner or Later)',
                                                  '-I Want You',
                                                  '-Stuck Inside of Mobile with the Memphis Blues Again',
                                                  '-Leopard-Skin Pill-Box Hat',
                                                  '-Just Like a Woman',
                                                  "-Most Likely You Go Your Way (And I'll Go Mine)",
                                                  '-Temporary Like Achilles',
                                                  '-Absolutely Sweet Marie',
                                                  '-4th Time Around',
                                                  '-Obviously 5 Believers',
                                                  '-Sad Eyed Lady of the Lowlands']},
    'Led Zeppelin': {   '1969 II': [   '-Whole Lotta Love',
                                       '-What Is and What Should Never Be',
                                       '-The Lemon Song',
                                       '-Thank You',
                                       '-Heartbreaker',
                                       "-Living Loving Maid (She's Just a Woman)",
                                       '-Ramble On',
                                       '-Moby Dick',
                                       '-Bring It on Home'],
                        '1979 In Through the Outdoor': [   '-In the Evening',
                                                           '-South Bound Saurez',
                                                           '-Fool in the Rain',
                                                           '-Hot Dog',
                                                           '-Carouselambra',
                                                           '-All My Love',
                                                           "-I'm Gonna Crawl"]}}

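The if/else in add_to_data can also be written with dict.setdefault. This is only a sketch of an alternative, not part of the code above, and the name add_block is made up for the example:

```python
def add_block(data, block):
    """Store one block (artist, album, songs...) in the nested dict.

    Hypothetical variant of add_to_data; assumes block has at least
    an artist line and an album line.
    """
    artist, album, songs = block[0], block[1], block[2:]
    # setdefault returns the artist's existing album dict, or inserts
    # and returns a new empty dict if the artist is not present yet.
    data.setdefault(artist, {})[album] = songs
    return data


data = {}
add_block(data, ["Led Zeppelin", "1969 II", "-Ramble On"])
add_block(data, ["Led Zeppelin", "1979 In Through the Outdoor", "-Hot Dog"])
print(data)
```

Both calls end up under the same 'Led Zeppelin' key, so a second album for an artist is added next to the first instead of replacing it.
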