Create nested dictionary from text file

I have a text file that looks something like the following:

DAY:Monday
Banana,Banana,Dragon Fruit
Dragon Fruit,Ice Pops,Ice Pops
Eggs,Dragon Fruit,Hamburger Buns,Dragon Fruit,Carrot,Apple,Banana
Ice Pops,Carrot,Dragon Fruit,Banana,Eggs,Eggs,Eggs,Eggs
DAY:Tuesday
Banana,Hamburger Buns,Dragon Fruit,Ice Pops
Hamburger Buns,Dragon Fruit,Dragon Fruit,Carrot,Apple,Carrot,Carrot,Eggs,Apple,Ice Pops
Carrot,Ice Pops,Apple,Dragon Fruit,Ice Pops,Apple,Banana,Eggs
Banana,Carrot,Eggs,Carrot,Eggs,Apple,Eggs
Carrot,Eggs,Hamburger Buns,Dragon Fruit,Apple,Hamburger Buns,Carrot,Dragon Fruit
Dragon Fruit,Ice Pops,Hamburger Buns,Hamburger Buns,Banana,Hamburger Buns,Carrot
DAY:Wednesday
Banana,Banana,Ice Pops
Apple,Carrot,Hamburger Buns
Apple,Carrot,Carrot,Carrot,Dragon Fruit,Carrot,Apple,Carrot,Dragon Fruit,Hamburger Buns
Apple,Hamburger Buns,Dragon Fruit,Ice Pops

I have code that is meant to count how many times each item occurs on each line. I do this so that I can calculate the total sales of the store, ideally per day. But I'm having trouble organizing this text file into nested dictionaries.

This is the code I have thus far for creating a main dictionary with each key for each day:

weekly_sales = {}
source:str
purchases = []
for line in data_purchases:
    
    if line.split(":")[0] == "DAY": #Had to separate between the header and the rest. Once this statement is done, it would remove the first row
        source = line.strip().rpartition(":")[-1]
        if source not in weekly_sales:
            weekly_sales[source] = {}

And this is the code I have for counting the number of word occurrences.

for line in data_purchases:
    if line.split(":")[0] == "DAY":
        line = next(data_purchases) 
        
        wordsCount = {}
        for item in line.split(",")[1:]: #.split so I can get each element in the line
            item = item.strip() #strip \n from the elements. otherwise, the output would look like Dragon Fruit\n for instance
            if item not in wordsCount:
                wordsCount[item] = 1
            else:
                wordsCount[item] += 1
                    
    else: #for every other line of the data
        wordsCount = {}
        for item in line.split(','):
            item = item.strip()
            if item not in wordsCount:
                wordsCount[item] = 1
            else:
                wordsCount[item] += 1  

How can I store these wordsCount results under the corresponding days in the main weekly_sales dictionary?

You don't have to iterate over data_purchases twice; a single pass is enough. Also, use collections.Counter to count the different items for each day.

For example:

from collections import Counter


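with open("your_file.txt", "r") as f_in:
    out, current_day = {}, None
    for line in map(str.strip, f_in):
        if line == "":
            continue  # skip blank lines

        if line.startswith("DAY:"):
            # start a new Counter for this day (the := operator needs Python 3.8+)
            out[(current_day := line.split(":")[-1])] = Counter()
        else:
            # add this line's items to the current day's counts
            out[current_day].update(map(str.strip, line.split(",")))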

# (optional) convert the Counter back to normal dict:
for k in out:
    out[k] = dict(out[k])

print(out)

Prints (formatted here for readability):

{
    "Monday": {
        "Banana": 4,
        "Dragon Fruit": 5,
        "Ice Pops": 3,
        "Eggs": 5,
        "Hamburger Buns": 1,
        "Carrot": 2,
        "Apple": 1,
    },
    "Tuesday": {
        "Banana": 4,
        "Hamburger Buns": 7,
        "Dragon Fruit": 7,
        "Ice Pops": 5,
        "Carrot": 9,
        "Apple": 6,
        "Eggs": 6,
    },
    "Wednesday": {
        "Banana": 2,
        "Ice Pops": 2,
        "Apple": 4,
        "Carrot": 6,
        "Hamburger Buns": 3,
        "Dragon Fruit": 3,
    },
}
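
Since the question mentions per-day totals, here is a minimal sketch that wraps the same single-pass idea in a function and then sums each day's counts. The function name parse_sales and the short sample list are hypothetical, made up for illustration. The file contains no prices, so "total" here only means the number of items sold per day.

```python
from collections import Counter


def parse_sales(lines):
    """Group comma-separated items under the most recent 'DAY:' header.

    parse_sales is a hypothetical helper name, not part of any library.
    """
    sales = {}
    current_day = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("DAY:"):
            current_day = line.split(":", 1)[1].strip()
            # setdefault keeps earlier counts if a day appears more than once
            sales.setdefault(current_day, Counter())
        elif current_day is None:
            raise ValueError("item line found before any DAY: header")
        else:
            sales[current_day].update(item.strip() for item in line.split(","))
    return sales


# Illustrative sample only; in practice pass the open file object instead.
sample = [
    "DAY:Monday\n",
    "Banana,Banana,Dragon Fruit\n",
    "Dragon Fruit,Ice Pops,Ice Pops\n",
    "DAY:Tuesday\n",
    "Banana,Hamburger Buns\n",
]
sales = parse_sales(sample)

# Total number of items sold per day
totals = {day: sum(counts.values()) for day, counts in sales.items()}
print(totals)  # {'Monday': 6, 'Tuesday': 2}
```

Because an open file is itself an iterable of lines, the same function should also work as parse_sales(f_in) inside the with block above.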
