Creating a dictionary from a csv file?

I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair within the dictionary. I tried to use the csv.DictReader and csv.DictWriter classes, but I could only figure out how to generate a new dictionary for each row. I want one dictionary. Here is the code I am trying to use:

import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        for rows in reader:
            k = rows[0]
            v = rows[1]
            mydict = {k:v for k, v in rows}
        print(mydict)

When I run the above code I get a ValueError: too many values to unpack (expected 2). How do I create one dictionary from a csv file? Thanks.

I believe the syntax you were looking for is as follows:

import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}

Alternately, for python <= 2.7.1, you want:

mydict = dict((rows[0],rows[1]) for rows in reader)
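
For context, the ValueError in the question comes from the comprehension iterating over a single row: each item of rows is a string, and unpacking a string into k, v only works if it has exactly two characters. A minimal sketch, using an in-memory file in place of coors.csv (the sample data is made up for illustration):

```python
import csv
import io

# Hypothetical two-column data standing in for coors.csv.
sample = io.StringIO("alpha,1\nbeta,2\n")
rows = next(csv.reader(sample))  # one row: a list of two strings

error_message = ""
try:
    # Iterates over the strings in the row and tries to unpack each into k, v.
    broken = {k: v for k, v in rows}
except ValueError as exc:
    error_message = str(exc)

# Iterating over the reader instead yields one key, value pair per row.
sample.seek(0)
mydict = {row[0]: row[1] for row in csv.reader(sample)}
print(error_message)
print(mydict)
```

Iterating over the reader, as in the answer above, gives the comprehension one whole row at a time, so the two fields can be picked by index.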

Open the file by calling open and then using csv.DictReader.

input_file = csv.DictReader(open("coors.csv"))

You may iterate over the rows of the csv file by iterating over the DictReader object, input_file.

for row in input_file:
    print(row)

Or, to access the first line only:

dictobj = csv.DictReader(open('coors.csv')).next() 

UPDATE: In Python 3+ versions, this code would change a little:

reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader) 
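
DictReader yields one dictionary per row, so building a single key-to-value dictionary still needs a pass over the rows. A sketch, assuming the file has no header line and using the hypothetical field names 'key' and 'value' (neither appears in the original post):

```python
import csv
import io

# In-memory stand-in for a headerless two-column file such as coors.csv.
infile = io.StringIO("alpha,1\nbeta,2\n")

# fieldnames supplies the column names, so the first line is treated as data.
reader = csv.DictReader(infile, fieldnames=["key", "value"])
mydict = {row["key"]: row["value"] for row in reader}
print(mydict)
```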

import csv

# Build the dictionary one row at a time; each row must have exactly two fields.
d = {}
with open('filename.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        k, v = row
        d[k] = v

This isn't elegant, but it is a one-line solution using pandas.

import pandas as pd
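# The squeeze=True argument of read_csv was removed in pandas 2.0; squeeze the result instead.
pd.read_csv('coors.csv', header=None, index_col=0).squeeze("columns").to_dict()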

If you want to specify dtype for your index (it can't be specified in read_csv if you use the index_col argument because of a bug):

import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()

You just have to convert csv.reader to dict:

~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3

~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))

print(d)

~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
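
Two details in the output above are worth noting: the values keep the space that follows each comma, and the later key2 row replaces the earlier one. If the leading spaces are unwanted, csv.reader accepts skipinitialspace=True. A small sketch with the same data held in memory instead of 1.csv:

```python
import csv
import io

data = io.StringIO("key1, value1\nkey2, value2\nkey2, value22\nkey3, value3\n")

# skipinitialspace=True drops whitespace immediately after each delimiter.
d = dict(filter(None, csv.reader(data, skipinitialspace=True)))
print(d)
```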

You can also use numpy for this.

from numpy import loadtxt
# dtype=str keeps non-numeric keys and values; the default dtype is float.
key_value = loadtxt("filename.csv", delimiter=",", dtype=str)
mydict = { k:v for k,v in key_value }

Assuming you have a CSV of this structure:

"a","b"
1,2
3,4
5,6

And you want the output to be:

[{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}, {'a': '5', 'b': '6'}]

A zip function (not yet mentioned) is simple and quite helpful.

import csv

def read_csv(filename):
    with open(filename) as f:
        file_data = csv.reader(f)
        headers = next(file_data)  # first row holds the column names
        return [dict(zip(headers, i)) for i in file_data]
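
The standard library's csv.DictReader performs the same header-to-value pairing. A sketch for comparison, using the sample data above held in memory rather than read from a file:

```python
import csv
import io

text = '"a","b"\n1,2\n3,4\n5,6\n'

# Manual pairing of the header row with each data row.
rows = csv.reader(io.StringIO(text))
headers = next(rows)
zipped = [dict(zip(headers, row)) for row in rows]

# DictReader reads the header line itself and yields one mapping per row.
from_dictreader = [dict(row) for row in csv.DictReader(io.StringIO(text))]
print(zipped == from_dictreader)
```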

If you prefer pandas, it can also do this quite nicely:

import pandas as pd
def read_csv(filename):
    return pd.read_csv(filename).to_dict('records')

One-liner solution

import pandas as pd

# read_csv treats the first line as a header; pass header=None if the file has none.
mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}

For simple csv files, such as the following

id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3

You can convert it to a Python dictionary using only built-ins

with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row
    csv_dict[key] = dict(zip(header, values))

This should yield the following dictionary

{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
 'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
 'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
 'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}

Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.

for row in data:
    key, *values = row

    if key not in csv_dict:
        csv_dict[key] = []

    csv_dict[key].append({key: value for key, value in zip(header, values)})

I'd suggest adding "if row" to the comprehension in case there is an empty line at the end of the file

import csv
with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)
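
The reason the check helps: csv.reader returns an empty list for a blank line, so an expression such as row[0] raises IndexError on that row. A quick sketch with made-up data held in memory:

```python
import csv
import io

# Made-up data with a blank line at the end.
text = "alpha,1\nbeta,2\n\n"

# The blank line shows up as an empty list in the parsed rows.
rows = list(csv.reader(io.StringIO(text)))
print(rows)

# Skipping falsy (empty) rows keeps them out of the dictionary.
mydict = dict(row[:2] for row in csv.reader(io.StringIO(text)) if row)
print(mydict)
```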

If you are OK with using the numpy package, then you can do something like the following:

import numpy as np

# encoding="utf-8" makes text fields come back as str rather than bytes.
lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None, encoding="utf-8")
my_dict = dict()
for line in lines:
    my_dict[line[0]] = line[1]

With pandas it is much easier. Assuming you have the following data as CSV, and let's call it test.txt / test.csv (a CSV is just a kind of text file):

a,b,c,d
1,2,3,4
5,6,7,8

now using pandas

import pandas as pd
df = pd.read_csv("./test.txt")
df_to_dict = df.to_dict()

To get one dictionary per row, it would be

df.to_dict(orient='records')

and that's it.

You can use this, it is pretty cool:

import dataconverters.commas as commas
filename = 'test.csv'
with open(filename) as f:
    records, metadata = commas.parse(f)
    for row in records:
        print('this is row in dictionary:', row)

Try to use a defaultdict and DictReader.

import csv
from collections import defaultdict
my_dict = defaultdict(list)

with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)

It returns:

{'key1':[value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'Key3':[value_x, Value_y, Value_z]}

Many solutions have been posted and I'd like to contribute mine, which works for any number of columns in the CSV file. It creates a dictionary with one key per column, and the value for each key is a list with the elements in that column.

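import csv

# path_to_csv_file is the path to your CSV file (the first line must be a header).
with open(path_to_csv_file) as f:
    input_file = csv.DictReader(f)
    csv_dict = {elem: [] for elem in input_file.fieldnames}
    for row in input_file:
        for key in csv_dict.keys():
            csv_dict[key].append(row[key])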

If you have:

  1. Only 1 key and 1 value as key,value in your csv
  2. Do not want to import other packages
  3. Want to create a dict in one shot

Do this:

mydict = {y[0]: y[1] for y in [x.split(",") for x in open('file.csv').read().split('\n') if x]}

What does it do?

It uses a list comprehension to split the lines, and the last "if x" is used to ignore blank lines (usually at the end). The result is then unpacked into a dict using a dictionary comprehension.
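
One caveat, offered as a sketch rather than a rule: splitting on commas does not understand CSV quoting, so a quoted field that contains a comma is cut in the wrong place. The sample line below is made up; csv.reader is shown for comparison:

```python
import csv
import io

# Made-up line whose second field is quoted and contains a comma.
text = 'name,"Doe, Jane"\n'

# Plain split: breaks inside the quoted field and keeps the quote characters.
naive = text.strip().split(",")

# csv.reader honours the quotes and returns two fields.
parsed = next(csv.reader(io.StringIO(text)))
print(naive)
print(parsed)
```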

Here is an approach for CSV to dict:

import pandas

# Assumes the CSV has a header row with columns named k and v.
data = pandas.read_csv('coors.csv')

the_dictionary_name = {row.k: row.v for (index, row) in data.iterrows()}

The question derailed us from the correct solution, which requires taking a step back and asking whether we chose the correct format to store dictionary data. For a dictionary, a CSV file is a lossy format that silently casts all numeric values to string values, so the correct answer, IMO, would be to save it to JSON in the first place.

And then simply:

import json
my_dict = json.load(open('my_file.json', 'r'))
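
To illustrate the point about types, here is a small round trip through both formats. It uses in-memory buffers and a made-up dictionary, so treat it as a sketch rather than a benchmark of either format:

```python
import csv
import io
import json

# Made-up dictionary with numeric values.
original = {"width": 3, "ratio": 0.5}

# CSV round trip: every value comes back as a string.
buf = io.StringIO()
csv.writer(buf).writerows(original.items())
buf.seek(0)
from_csv = dict(csv.reader(buf))

# JSON round trip: numeric types are preserved.
from_json = json.loads(json.dumps(original))
print(from_csv)
print(from_json)
```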
