
How to read data from a CSV file into a dictionary in Python without importing the csv module

So I have this data in a file, which looks like this:

    Commodity, USA, Canada, Europe, China, India, Australia
    Wheat,61.7,27.2,133.9,121,94.9,22.9
    Rice Milled,6.3, -,2.1,143,105.2,0.8
    Oilseeds,93.1,19,28.1,59.8,36.8,5.7
    Cotton,17.3, -,1.5,35,28.5,4.6

The top row is the header, and the first column holds row headers as well. The dashes represent missing data.

The returned dictionary should have the following format:

  • The keys of the dictionary are the names of the countries.

  • The values are dictionaries containing the data for each country. The keys of these dictionaries are the names of commodities, and the values are the quantities produced by that country for a given commodity. If there is no data for a given commodity (that is, a dash in the CSV file), the commodity must not be included in the dictionary. For example, cotton must not be in the dictionary for Canada. Note that a '-' (dash) is different from the value 0.

For the file above, the result should look like this:

{'Canada':{'Wheat':27.2,'Oilseeds':19}, 'USA':{'Wheat':61.7, 'Cotton':17.3,...}, ...}

I'm confused about where to start or what to do. I've been stuck for days.

If you don't mind importing the pandas module, it can be done as follows:

import pandas as pd
df = pd.read_csv('test2.csv', sep=',')
df.set_index('Commodity').to_json()

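It returns the following JSON string. Note that the country names keep their leading spaces and the dashes are still present, so the result needs further cleanup to match the requested format: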

{" USA":{"Wheat":61.7,"Rice Milled":6.3,"Oilseeds":93.1,"Cotton":17.3}," Canada":{"Wheat":"27.2","Rice Milled":" -","Oilseeds":"19","Cotton":" -"}," Europe":{"Wheat":133.9,"Rice Milled":2.1,"Oilseeds":28.1,"Cotton":1.5}," China":{"Wheat":121.0,"Rice Milled":143.0,"Oilseeds":59.8,"Cotton":35.0}," India":{"Wheat":94.9,"Rice Milled":105.2,"Oilseeds":36.8,"Cotton":28.5}," Australia":{"Wheat":22.9,"Rice Milled":0.8,"Oilseeds":5.7,"Cotton":4.6}}

If you really want it without any imports (for whatever reason), the shortest thing I could come up with is the following:

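with open('data_sample.txt') as f:
    # Split every non-empty line on commas and strip surrounding whitespace
    rows = [[cell.strip() for cell in line.split(',')] for line in f if line.strip()]

d = {}
# zip(*rows) transposes the table, so each tuple is one column of the file
columns = list(zip(*rows))
commodities = columns[0][1:]  # first column without the 'Commodity' header
for column in columns[1:]:
    # column[0] is the country name; skip cells that hold only a dash
    d[column[0]] = {name: value for name, value in zip(commodities, column[1:]) if value != '-'}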

print(d)

Out:

{'USA': {'Wheat': '61.7', 'Rice Milled': '6.3', 'Oilseeds': '93.1', 'Cotton': '17.3'}, 'Canada': {'Wheat': '27.2', 'Oilseeds': '19'}, 'Europe': {'Wheat': '133.9', 'Rice Milled': '2.1', 'Oilseeds': '28.1', 'Cotton': '1.5'}, 'China': {'Wheat': '121', 'Rice Milled': '143', 'Oilseeds': '59.8', 'Cotton': '35'}, 'India': {'Wheat': '94.9', 'Rice Milled': '105.2', 'Oilseeds': '36.8', 'Cotton': '28.5'}, 'Australia': {'Wheat': '22.9', 'Rice Milled': '0.8', 'Oilseeds': '5.7', 'Cotton': '4.6'}}

There might be better uses of zip etc., but it should give a general idea. Note that the values are still strings; wrap them in float() if you need numbers.
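The printed dictionary holds the quantities as strings, while the question asks for numbers. Here is a minimal sketch of one way to convert them, assuming every remaining value is a valid number. The sample dictionary is a shortened copy of the output above, and `numeric` is a name chosen only for illustration:

```python
# Shortened copy of the string-valued output shown above
d = {
    'USA': {'Wheat': '61.7', 'Cotton': '17.3'},
    'Canada': {'Wheat': '27.2', 'Oilseeds': '19'},
}

# Rebuild the nested dictionary, converting each quantity to float
numeric = {
    country: {commodity: float(quantity) for commodity, quantity in goods.items()}
    for country, goods in d.items()
}

print(numeric['Canada'])  # {'Wheat': 27.2, 'Oilseeds': 19.0}
```

Because the dashes were already filtered out, float() never sees a '-' here; any other non-numeric cell would raise a ValueError.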

If you do not plan to import any modules, this works too:

data = {}

with open('data.txt') as f:
    countries = []
    for line in f:
        if not line.strip():
            continue  # skip blank lines

        # Strip each cell so ' -' and ' USA' compare cleanly
        vals = [v.strip() for v in line.split(',')]
        row_heading = vals[0]
        row_data = vals[1:]

        if not countries:
            # Header row: remember the column order and create one empty dict per country
            countries = row_data
            data = {country: {} for country in countries}
        else:
            for country, value in zip(countries, row_data):
                # Exclude data with dashes
                if value != "-":
                    data[country][row_heading] = value

print(data)
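For reuse, the same idea can be wrapped in a function that also converts the quantities to floats, as the question's expected output suggests. This is only a sketch: `parse_production` is a hypothetical name, and it takes any iterable of lines (an open file works too) so that it can be tried here without a file on disk.

```python
def parse_production(lines):
    """Build {country: {commodity: quantity}} from comma-separated lines."""
    rows = [[cell.strip() for cell in line.split(',')]
            for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    countries = header[1:]  # drop the 'Commodity' label
    result = {country: {} for country in countries}
    for row in body:
        commodity = row[0]
        for country, cell in zip(countries, row[1:]):
            if cell != '-':  # a dash means "no data", so leave the key out
                result[country][commodity] = float(cell)
    return result


sample = [
    "Commodity, USA, Canada, Europe, China, India, Australia",
    "Wheat,61.7,27.2,133.9,121,94.9,22.9",
    "Rice Milled,6.3, -,2.1,143,105.2,0.8",
    "Oilseeds,93.1,19,28.1,59.8,36.8,5.7",
    "Cotton,17.3, -,1.5,35,28.5,4.6",
]
production = parse_production(sample)

print(production['Canada'])              # {'Wheat': 27.2, 'Oilseeds': 19.0}
print('Cotton' in production['Canada'])  # False: missing, not zero
```

Leaving the key out keeps "no data" distinct from a quantity of 0: a membership test such as `'Cotton' in production['Canada']` tells the two cases apart. With a real file, `with open('data.txt') as f: production = parse_production(f)` should behave the same way.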


 