How to create a dictionary based on a CSV file
I have a CSV file with the following format:
#ID #Number #Date #Name #Email
1978 26 24/4/10 Jim Jim@randomemail.com
1328 31 22/7/10 Jim Kim@randomemail.com
1908 26 21/4/10 Jim Dim@randomemail.com
1918 26 29/4/10 Jim Rim@randomemail.com
1938 46 24/4/10 Jim Lim@randomemail.com
I have already opened the CSV file and printed it out.
I now want to turn each row into a dictionary, such as: [ID: 1978, Number: 26, Date: 24/4/10, Name: Jim, Email: Jim@randomemail.com], [etc], [etc]
I know this is probably very easy, but I'm new and have been stuck for a few hours.
Following up on my comment, consider something like:
import csv

with open('file.txt', 'r') as f:
    reader = csv.DictReader(f, delimiter=' ', skipinitialspace=True)
    for row in reader:
        print(row)
Output (as printed on Python 3.6 and 3.7; from Python 3.8 onward each row is a plain dict):
OrderedDict([('#ID', '1978'), ('#Number', '26'), ('#Date', '24/4/10'), ('#Name', 'Jim'), ('#Email', 'Jim@randomemail.com')])
OrderedDict([('#ID', '1328'), ('#Number', '31'), ('#Date', '22/7/10'), ('#Name', 'Jim'), ('#Email', 'Kim@randomemail.com')])
OrderedDict([('#ID', '1908'), ('#Number', '26'), ('#Date', '21/4/10'), ('#Name', 'Jim'), ('#Email', 'Dim@randomemail.com')])
OrderedDict([('#ID', '1918'), ('#Number', '26'), ('#Date', '29/4/10'), ('#Name', 'Jim'), ('#Email', 'Rim@randomemail.com')])
OrderedDict([('#ID', '1938'), ('#Number', '46'), ('#Date', '24/4/10'), ('#Name', 'Jim'), ('#Email', 'Lim@randomemail.com')])
The two extra arguments to DictReader (delimiter=' ' and skipinitialspace=True) are necessary to get your variable-space-delimited file to parse correctly.
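To see why both arguments matter, here is a minimal sketch that parses the same text with and without skipinitialspace. The sample string and its column padding are assumptions made for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# Hypothetical sample: columns padded with a varying number of spaces.
sample = (
    "#ID  #Number #Date    #Name #Email\n"
    "1978 26      24/4/10  Jim   Jim@randomemail.com\n"
)

# With only delimiter=' ', every extra space produces an empty field.
naive = list(csv.reader(io.StringIO(sample), delimiter=' '))

# With skipinitialspace=True, spaces that follow a delimiter are skipped,
# so runs of spaces collapse into a single separator.
rows = list(
    csv.DictReader(io.StringIO(sample), delimiter=' ', skipinitialspace=True)
)

print(naive[0])  # header fields interleaved with empty strings
print(rows[0])   # one clean mapping per data line
```

Note that this approach only works if no field contains a space itself and lines have no trailing spaces.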
Or, if you want all the rows at once, something like:
import csv

with open('file.txt', 'r') as f:
    reader = csv.DictReader(f, delimiter=' ', skipinitialspace=True)
    rows = list(reader)
print(rows)
produces
[
OrderedDict([('#ID', '1978'), ('#Number', '26'), ('#Date', '24/4/10'), ('#Name', 'Jim'), ('#Email', 'Jim@randomemail.com')]),
OrderedDict([('#ID', '1328'), ('#Number', '31'), ('#Date', '22/7/10'), ('#Name', 'Jim'), ('#Email', 'Kim@randomemail.com')]),
OrderedDict([('#ID', '1908'), ('#Number', '26'), ('#Date', '21/4/10'), ('#Name', 'Jim'), ('#Email', 'Dim@randomemail.com')]),
OrderedDict([('#ID', '1918'), ('#Number', '26'), ('#Date', '29/4/10'), ('#Name', 'Jim'), ('#Email', 'Rim@randomemail.com')]),
OrderedDict([('#ID', '1938'), ('#Number', '46'), ('#Date', '24/4/10'), ('#Name', 'Jim'), ('#Email', 'Lim@randomemail.com')])
]
and,
print(rows[0]["#Email"])
produces
Jim@randomemail.com
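The keys above keep the leading '#' from the header, and every value is a string. If you would rather have keys like ID and integer values, one possible cleanup step is sketched below. The helper name clean_row and the choice of which columns to convert are assumptions, not part of the csv module.

```python
import csv
import io

# Hypothetical sample standing in for the real file.
sample = (
    "#ID #Number #Date #Name #Email\n"
    "1978 26 24/4/10 Jim Jim@randomemail.com\n"
    "1328 31 22/7/10 Jim Kim@randomemail.com\n"
)

def clean_row(row):
    """Strip the leading '#' from each key and convert numeric columns."""
    cleaned = {key.lstrip('#'): value for key, value in row.items()}
    # Assumption: ID and Number always hold integers.
    cleaned['ID'] = int(cleaned['ID'])
    cleaned['Number'] = int(cleaned['Number'])
    return cleaned

reader = csv.DictReader(io.StringIO(sample), delimiter=' ', skipinitialspace=True)
records = [clean_row(row) for row in reader]

print(records[0]['Email'])
```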
Update
If your file is actually tab delimited, you could use:
reader = csv.DictReader(f, delimiter='\t')
You should be able to tell what the delimiter is by printing the line (as you already have), but wrapped in a repr call, something like print(repr(line)). If you see a \t in the output, it's tab delimited.
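As a rough illustration of the repr check, the sketch below prints two made-up lines, one tab delimited and one space delimited. The helper guess_delimiter is hypothetical and only distinguishes those two cases.

```python
# Hypothetical lines, as they might come back from reading the file.
tab_line = "1978\t26\t24/4/10\tJim\tJim@randomemail.com\n"
space_line = "1978 26 24/4/10 Jim Jim@randomemail.com\n"

def guess_delimiter(line):
    """Return a tab if the line contains one, otherwise a single space."""
    return '\t' if '\t' in line else ' '

# repr makes invisible characters visible: tabs show up as \t.
print(repr(tab_line))
print(repr(space_line))

print(repr(guess_delimiter(tab_line)))
print(repr(guess_delimiter(space_line)))
```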
Here's some code written in pure Python that'll do the trick:
dictionary_list = []  # collects one dictionary per line
reference = ["ORIN", "DEST", "HORIZ", "BEAR"]  # keys, in column order
# file_contents_2 is assumed to be an open file or a list of lines
for line in file_contents_2:
    # Remove the trailing \n, then split the line into a list of the
    # values separated by commas
    line_contents = line.strip().split(",")
    the_dictionary = {}
    for i in range(4):  # iterates i=0 to i=3
        # Lists are indexed from 0, so a=[1,2,3]; a[1] would return 2
        the_dictionary[reference[i]] = line_contents[i]
    dictionary_list.append(the_dictionary)
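The same pure-Python idea can be adapted to the whitespace-separated layout from the question. The following is only a sketch: the lines list is a hypothetical stand-in for the file's contents, and it assumes no field contains whitespace.

```python
# Hypothetical stand-in for the lines read from the file.
lines = [
    "#ID #Number #Date #Name #Email",
    "1978 26 24/4/10 Jim Jim@randomemail.com",
    "1328 31 22/7/10 Jim Kim@randomemail.com",
]

# Take the keys from the header line, dropping the leading '#'.
header = [name.lstrip('#') for name in lines[0].split()]

# str.split() with no argument splits on any run of whitespace,
# so spaces and tabs are both handled. Blank lines are skipped.
dictionary_list = [
    dict(zip(header, line.split()))
    for line in lines[1:]
    if line.strip()
]

print(dictionary_list[0])
```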
Using pandas will make your life much easier:
import pandas as pd
df = pd.read_csv('path_to_your_csv')
your_dict = df.to_dict()
That's it. There are some optional arguments in to_dict to help you format it the way you want.
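For the layout in the question, one plausible combination is sep=r"\s+" in read_csv (to split on runs of whitespace) and orient="records" in to_dict (to get one dictionary per row). This is a sketch under those assumptions, with io.StringIO standing in for the real file path.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real file.
sample = (
    "#ID #Number #Date #Name #Email\n"
    "1978 26 24/4/10 Jim Jim@randomemail.com\n"
    "1328 31 22/7/10 Jim Kim@randomemail.com\n"
)

# sep=r"\s+" treats any run of whitespace as a single separator.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Drop the leading '#' from the column names.
df.columns = df.columns.str.lstrip('#')

# orient="records" yields a list with one dictionary per row.
records = df.to_dict(orient="records")

print(records[0])
```

Without an orient argument, to_dict returns a dictionary keyed by column name instead, which is usually not what you want for row-wise access.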