How to make a dictionary from a text file using the first row as the key
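I am trying to build a COVID monitor that reads a text file into a dictionary so I can check the cases for each country. I want the dictionary in this format:
covid = {country: {confirmed: value, active: value, recovered: value, suspect: value, probable: value, deceased: value}}
It should be based on this text file: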
COUNTRY, CONFIRMED, ACTIVE, RECOVERED, SUSPECT, PROBABLE, DECEASED
COUNTRY-A,3,4,2,1,0,0
COUNTRY-B,1,2,0,2,0,0
COUNTRY-C,4,2,0,0,3,0
COUNTRY-D,1,1,1,3,0,0
COUNTRY-E,3,2,0,2,0,0
COUNTRY-F,2,0,1,2,0,0
COUNTRY-G,0,0,1,4,0,0
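I tried the code below, but it prints COUNTRY, CONFIRMED, ACTIVE, RECOVERED, SUSPECT, PROBABLE, DECEASED, and it raises an error every time I try to total the cases for each country.
This is the code I tried: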
def covid_monitoring():
    country = []
    cov_dict = {}
    no_cases = []
    with open("covidmonitor.txt", 'r') as f:
        for cov in f:
            cov = cov.strip()
            next(f) # skip header
            if len(cov) >= 1:
                cov_line = cov.split(",")
                country.append(cov_line[0].strip())
                confirmed_file = cov_line[0].strip()
                active_file = cov_line[1].strip()
                recovered_file = cov_line[2].rstrip()
                suspected_file = cov_line[3].strip()
                probable_file = cov_line[4].strip()
                probable_file = cov_line[5].strip()
                deceased_file = cov_line[6].strip(';')
                if confirmed_file not in cov_dict:
                    cov_dict[confirmed_file] = [(active_file, recovered_file, suspected_file, probable_file, probable_file, deceased_file)]
                else:
                    cov_dict[confirmed_file].append((active_file, recovered_file, suspected_file, probable_file, probable_file, deceased_file))
    # print(cov_dict)
    for cntry in country:
        if cntry in cov_dict:
            for confirm, active, recovered, suspect, probable, deceased in cov_dict[cntry]:
                print("\tCOUNTRY:{cntry}")
                print("\tCONFIRMED:{confirm} ")
                print("\tACTIVE:{active} ")
                print("\tRECOVERED:{recovered} ")
                print("\tSUSPECTED:{suspect} ")
                print("\tPROBABLE:{probable} ")
                print("\tDECEASED:{deceased} ")
                total_count = int(confirm) + int(active) + int(recovered) + int(suspect) + int(probable) + int(deceased)
                no_cases.append(total_count)
    print(sum(no_cases))
This is the error I get:
total_count = int(confirm) + int(active) + int(recovered) + int(suspect) + int(probable) + int(deceased)
ValueError: invalid literal for int() with base 10: 'CONFIRMED'
If you want to skip the header, don't call next on every loop iteration.
with open("covidmonitor.txt", 'r') as f:
    # f.readlines()[1:] reads every line except the first one (the header)
    for cov in f.readlines()[1:]:
        cov = cov.strip()
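As an alternative to slicing `readlines()`, the header can be read once with `next()` before the loop and reused as the inner keys. The following is only a minimal sketch of building the nested dictionary the question asks for; the function name `parse_covid` and the in-memory `SAMPLE` (standing in for covidmonitor.txt) are assumptions made for illustration.

```python
import io

# Assumption: a two-row excerpt of the question's file, held in memory.
SAMPLE = """COUNTRY, CONFIRMED, ACTIVE, RECOVERED, SUSPECT, PROBABLE, DECEASED
COUNTRY-A,3,4,2,1,0,0
COUNTRY-B,1,2,0,2,0,0
"""


def parse_covid(f):
    """Build {country: {field: int}} from an open file-like object."""
    # Consume the header exactly once, before the loop starts.
    header = [name.strip().lower() for name in next(f).split(",")]
    covid = {}
    for line in f:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        country, *values = [part.strip() for part in line.split(",")]
        # Pair each remaining header name with its integer value.
        covid[country] = dict(zip(header[1:], map(int, values)))
    return covid


covid = parse_covid(io.StringIO(SAMPLE))
print(covid["COUNTRY-A"])
```

With a real file, the same function could be called as `parse_covid(f)` inside `with open("covidmonitor.txt") as f:`.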
It looks like a CSV file, so you can also use Python's csv package, like this:
import csv

no_cases = []
country = []
cov_dict = {}
with open("covidmonitor.txt", 'r') as f:
    # skipinitialspace=True drops the blank after each comma in the header
    cov = csv.DictReader(f, delimiter=",", skipinitialspace=True)
    for country_data in cov:
        # every column except COUNTRY holds a case count
        total_count = [float(data) for key, data in country_data.items() if key != 'COUNTRY']
        no_cases.append(sum(total_count))
        country.append(country_data['COUNTRY'])
        cov_dict[country_data['COUNTRY']] = total_count
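The snippet above stores a list of numbers per country. If the nested format from the question is wanted instead, `csv.DictReader` can produce it fairly directly. This is a sketch under assumptions: `load_covid` is a hypothetical helper name, and `SAMPLE` is an in-memory excerpt standing in for the real file.

```python
import csv
import io

# Assumption: a two-row excerpt of the question's file, held in memory.
SAMPLE = """COUNTRY, CONFIRMED, ACTIVE, RECOVERED, SUSPECT, PROBABLE, DECEASED
COUNTRY-A,3,4,2,1,0,0
COUNTRY-B,1,2,0,2,0,0
"""


def load_covid(f):
    """Return {country: {field: int}} using the header row as the keys."""
    reader = csv.DictReader(f, skipinitialspace=True)
    covid = {}
    for row in reader:
        # Remove the country name so only the numeric columns remain.
        country = row.pop("COUNTRY")
        covid[country] = {key.lower(): int(value) for key, value in row.items()}
    return covid


covid = load_covid(io.StringIO(SAMPLE))
# Per-country totals are then a sum over the inner dictionary's values.
totals = {country: sum(cases.values()) for country, cases in covid.items()}
print(totals)
```

Keeping the per-field values in the inner dictionary means totals can be derived later rather than stored in place of the fields.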
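Judging from the code, no_cases is a list of strings, because confirm, active, recovered, suspect, probable, and deceased are all strings, so
total_count = (confirm + active + recovered + suspect + probable + deceased)
is also a string: it is a concatenation, not a sum.
Calling sum() on a list of strings in the last line of the code should produce an error like this:
TypeError: unsupported operand type(s) for +: 'int' and 'str'
If you want to treat them as numbers, you should convert all of them to integers.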
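The difference between concatenation and addition can be seen in a few lines. The values below are made up for illustration and are not read from the question's file.

```python
# Values read from a text file arrive as strings.
confirm, active, recovered = "3", "4", "2"

# "+" on strings joins them end to end.
joined = confirm + active + recovered
# Converting each value first gives arithmetic addition.
total = int(confirm) + int(active) + int(recovered)

print(repr(joined))
print(total)

# sum() starts from the integer 0, so a list of strings cannot be summed.
try:
    sum([joined])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```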
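Your code has other problems as well. For example, you call next(f) to skip the header, but you do it inside the for loop over the file, so it will likely skip every other line.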
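A small sketch of why calling `next(f)` inside the loop drops alternate lines, using a made-up six-line input in memory rather than the question's file. Here `next(f, None)` is used so the demonstration does not raise `StopIteration` on the final pass.

```python
import io

# Assumption: a tiny stand-in file with one header and five data lines.
TEXT = "HEADER\nA\nB\nC\nD\nE\n"

# Pattern from the question: next() runs on every pass of the loop.
f = io.StringIO(TEXT)
seen_inside = []
for line in f:
    next(f, None)  # discards the line that follows the current one
    seen_inside.append(line.strip())

# Calling next() once before the loop skips only the header.
f = io.StringIO(TEXT)
next(f)
seen_before = [line.strip() for line in f]

print(seen_inside)
print(seen_before)
```

In the first loop the header itself is processed as data, which is consistent with the `'CONFIRMED'` value appearing in the question's `ValueError`.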
You can use the Python pandas package for this:
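import pandas as pd

# skipinitialspace=True keeps the header names free of leading blanks
data = pd.read_csv("COVID19.TXT", sep=",", skipinitialspace=True)
# index by country, then transpose so each country maps to its own dict
covid = data.set_index("COUNTRY").T.to_dict()
print(covid)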
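As a further suggestion for analysis, you can easily compute all kinds of functions on a pandas DataFrame and convert it to a dict only when you need to. See this link for more details: Pandas tutorial
If you need more help, let me know in the comments.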
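One way this could look in practice: compute a per-country total on the DataFrame first, then convert to a dictionary. This is a sketch only; the `TOTAL` column name and the in-memory `SAMPLE` are assumptions for illustration, and it requires pandas to be installed.

```python
import io

import pandas as pd

# Assumption: a two-row excerpt of the question's file, held in memory.
SAMPLE = """COUNTRY, CONFIRMED, ACTIVE, RECOVERED, SUSPECT, PROBABLE, DECEASED
COUNTRY-A,3,4,2,1,0,0
COUNTRY-B,1,2,0,2,0,0
"""

data = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True)
data = data.set_index("COUNTRY")

# Row-wise sum across all case columns, stored as a new column.
data["TOTAL"] = data.sum(axis=1)

# orient="index" maps each country to a dict of its column values.
covid = data.to_dict(orient="index")
print(covid["COUNTRY-A"])
```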