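Reading csv file and returning as dictionary
I wrote a function that currently reads the file correctly, but there are a couple of problems. The result needs to be returned as a dictionary where the keys are artist names and the values are lists of tuples (I'm not certain about this, but that appears to be what is being asked).
The main problem I'm having is that I need to skip the first line of the file somehow, and I'm not sure whether I'm returning the result as a dictionary. Here is an example of one of the files: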
"Artist","Title","Year","Total Height","Total Width","Media","Country"
"Pablo Picasso","Guernica","1937","349.0","776.0","oil paint","Spain"
"Vincent van Gogh","Cafe Terrace at Night","1888","81.0","65.5","oil paint","Netherlands"
"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"
"Vincent van Gogh","Self-Portrait with Bandaged Ear","1889","51.0","45.0","oil paint","USA"
"Leonardo da Vinci","Portrait of Isabella d'Este","1499","63.0","46.0","chalk","France"
"Leonardo da Vinci","The Last Supper","1495","460.0","880.0","tempera","Italy"
So I need to read the input file and convert it into a dictionary that looks like this:
sample_dict = {
    "Pablo Picasso": [("Guernica", 1937, 349.0, 776.0, "oil paint", "Spain")],
    "Leonardo da Vinci": [("Mona Lisa", 1503, 76.8, 53.0, "oil paint", "France"),
                          ("Portrait of Isabella d'Este", 1499, 63.0, 46.0, "chalk", "France"),
                          ("The Last Supper", 1495, 460.0, 880.0, "tempera", "Italy")],
    "Vincent van Gogh": [("Cafe Terrace at Night", 1888, 81.0, 65.5, "oil paint", "Netherlands"),
                         ("Self-Portrait with Bandaged Ear", 1889, 51.0, 45.0, "oil paint", "USA")]
}
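The main problem I'm having is skipping the first line that says "Artist", "Title", and so on, and returning only the lines after it. I'm also not sure whether my current code returns the result as a dictionary. This is what I have so far: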
def convertLines(lines):
    head = lines[0]
    del lines[0]
    infoDict = {}
    for line in lines:  # Going through everything but the first line
        infoDict[line.split(",")[0]] = [tuple(line.split(",")[1:])]
    return infoDict

def read_file(filename):
    thefile = open(filename, "r")
    lines = []
    for i in thefile:
        lines.append(i)
    thefile.close()
    mydict = convertLines(read_file(filename))
    return lines
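Would a few small changes to my code be enough to return the correct result, or do I need to approach this a different way? It appears my current code reads the full file, but if it doesn't already do so, how would I skip the first line and return the result as a dict? Thanks for your help.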
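The first thing we do is delete the first line from the list.
Then we run a function that does exactly what you described: it creates a dictionary whose values are lists of tuples.
You can keep the function you already have and run this operation on the lines variable.
Run the following code and you should be good to go: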
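def convertLines(lines):
    head = lines[0]  # header row: "Artist","Title",...
    del lines[0]     # drop the header so only data rows remain
    infoDict = {}
    for line in lines:  # Going through everything but the first line
        # Strip the newline, split on commas, and remove the surrounding quotes.
        # Note: this simple split assumes that no field contains a comma.
        fields = [field.strip('"') for field in line.strip().split(",")]
        # Append instead of assigning, so an artist with several works keeps them all.
        infoDict.setdefault(fields[0], []).append(tuple(fields[1:]))
    return infoDict

def read_file(filename):
    thefile = open(filename, "r")
    lines = []
    for i in thefile:
        lines.append(i)
    thefile.close()
    return lines

filename = "file.txt"  # replace with the path to your CSV file
mydict = convertLines(read_file(filename))
print(mydict)
# Do what you want with mydict below this line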
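One caveat with splitting each line on commas by hand: the fields in this file are quoted, so the quote characters end up inside the values, and any field that itself contains a comma would be cut in two. The sketch below compares `str.split` with `csv.reader` on a single made-up row (the artist and title are hypothetical, not taken from the sample file).

```python
import csv
import io

# Hypothetical row whose title contains a comma (not from the sample file).
line = '"Some Artist","Title, with comma","1900"\n'

# Manual splitting keeps the quote characters and breaks the title in two.
naive_fields = line.strip().split(",")

# csv.reader understands the quoting rules and returns clean values.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)   # four pieces, quotes still attached
print(parsed_fields)  # three clean fields
```

For the sample file shown in the question no field contains a comma, so manual splitting plus quote stripping would work there, but the csv module is likely the safer default.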
You should give this a try. I find it quite simple:
import csv
from collections import defaultdict

d_dict = defaultdict(list)
with open('file.txt', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for i in reader:
        # group the remaining fields under the artist name
        d_dict[i[0]].append(tuple(i[1:]))
print(dict(d_dict))
Output (formatted for readability):
{
'Vincent van Gogh': [
('Cafe Terrace at Night', '1888', '81.0', '65.5', 'oil paint', 'Netherlands'),
('Self-Portrait with Bandaged Ear', '1889', '51.0', '45.0', 'oil paint', 'USA')
],
'Pablo Picasso': [
('Guernica', '1937', '349.0', '776.0', 'oil paint', 'Spain')
],
'Leonardo da Vinci': [
('Mona Lisa', '1503', '76.8', '53.0', 'oil paint', 'France'),
("Portrait of Isabella d'Este", '1499', '63.0', '46.0', 'chalk', 'France'),
('The Last Supper', '1495', '460.0', '880.0', 'tempera', 'Italy')
]
}
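Note that the output above keeps every value as a string, whereas the `sample_dict` in the question has the year as an int and the height and width as floats. A minimal sketch of that conversion follows, assuming every row has exactly the seven columns shown in the sample file; the helper names `convert_row` and `build_dict` are made up for illustration, and an in-memory string stands in for the real file.

```python
import csv
import io
from collections import defaultdict

def convert_row(row):
    """Split one parsed row into (artist, tuple), converting numeric fields."""
    artist, title, year, height, width, media, country = row
    return artist, (title, int(year), float(height), float(width), media, country)

def build_dict(rows):
    """Group converted rows by artist name."""
    result = defaultdict(list)
    for row in rows:
        artist, work = convert_row(row)
        result[artist].append(work)
    return dict(result)

# Header plus two rows from the sample file, used here in place of a real file.
sample = io.StringIO(
    '"Artist","Title","Year","Total Height","Total Width","Media","Country"\n'
    '"Pablo Picasso","Guernica","1937","349.0","776.0","oil paint","Spain"\n'
    '"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"\n'
)
reader = csv.reader(sample)
next(reader)  # skip the header row
converted = build_dict(reader)
print(converted)
```

With a real file, you would pass `csv.reader(f)` from an open file object instead of the `io.StringIO` stand-in.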
A better way is:
with open('filename', 'r') as file:  # Make a file object
    items = []
    _ = file.readline()  # Read the first line and store it in _,
                         # a variable of no further use.
    for line in file:    # Then loop over all the remaining lines
        items.append(line)
Once this code has run, the with statement closes the file object, so there is no need to call file.close().
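To see both points in action (the header being skipped and the file being closed automatically), here is a small self-contained sketch. It writes a throwaway temporary file with made-up two-column content, so none of the file names or rows are from the question, and it uses `next(f)` as an alternative to `readline()` for consuming the header.

```python
import os
import tempfile

# Write a tiny throwaway file so the sketch is self-contained (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write('"Artist","Title"\n"Pablo Picasso","Guernica"\n')
    path = tmp.name

with open(path, "r") as f:
    header = next(f)    # consume the header line
    items = list(f)     # every remaining line
    inside = f.closed   # False: the file is still open inside the block

after = f.closed        # True: the with statement closed the file on exit
os.remove(path)         # clean up the temporary file

print(inside, after, items)
```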
The csv module provides useful tools for working with CSV files. Something like the following should work:
import csv
from collections import defaultdict

def read_file(filename):
    with open(filename, 'r') as f:
        reader = csv.DictReader(f, delimiter=',')
        result_dict = defaultdict(list)
        fields = ("Title", "Year", "Total Height", "Total Width", "Media", "Country")
        for row in reader:
            result_dict[row['Artist']].append(
                tuple(row[field] for field in fields)
            )
    return dict(result_dict)
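DictReader uses the fields in the first line of the file as field names. It then returns an iterable over the remaining lines of the file, each of which is produced as a dict keyed by field name.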
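As a quick illustration of that behaviour, the sketch below feeds `csv.DictReader` a shortened, three-column version of the sample data from an in-memory string (the shortening is only to keep the example small).

```python
import csv
import io

# Shortened stand-in for the sample file: header line plus one data row.
sample = io.StringIO(
    '"Artist","Title","Year"\n'
    '"Pablo Picasso","Guernica","1937"\n'
)

reader = csv.DictReader(sample)
field_names = reader.fieldnames  # read from the first line of the input
rows = list(reader)              # one dict per remaining line

print(field_names)
print(rows[0]["Artist"], rows[0]["Title"], rows[0]["Year"])
```

Because the header line is consumed as the field names, there is no need to skip it by hand.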