How to assign multiple values to a key using dictionaries?
I have a CSV file that contains the name (of a video game), platform, genre, publisher, etc. I am trying to create 3 separate dictionaries. Dictionary one was easy, since its key is the title of a video game, which is unique.
For the 2nd and 3rd dictionaries I am having issues, since the keys "Genre" and "Publisher" are not unique. I am trying to have D2 look like:
D2 = { 'Puzzle' : [(tup2), (tup2)], 'Another genre': [(tup2)], ... }
since there are multiple games that have the same genre.
import csv
fp = open("video_game_sales_tiny.csv", 'r')
fp.readline()
reader = csv.reader(fp)
D1 = {}
D2 = {}
D3 = {}
for line in reader:
    name = line[0].lower().strip()
    platform = line[1].lower().strip()
    if line[2] in (None, 'N/A'):
        pass
    else:
        year = int(line[2])
    genre = line[3].lower().strip()
    publisher = line[4]
    na_sales = float(line[5])
    europe_sales = float(line[6])*1000000
    japan_sales = float(line[7])*1000000
    other_sales = float(line[8])*1000000
    global_sales = (europe_sales + japan_sales + other_sales)
    tup = (name, platform, year, genre, publisher, global_sales)
    tup2 = (genre, year, na_sales, europe_sales, japan_sales, other_sales, global_sales)
    tup3 = (publisher, name, year, na_sales, europe_sales, japan_sales, other_sales, global_sales)
    D1[name] = tup
    D2[genre] = tup2
    D3[publisher] = tup3
print(D1)
print(D2)
print(D3)
You should create the entry for genre (for instance) as a list, and then append to the list. Assigning D2[genre] = tup2 directly replaces whatever was stored under that key, so only the last game of each genre survives.
if genre not in D2:
    D2[genre] = []
D2[genre].append(tup2)
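The same grouping can also be sketched with collections.defaultdict, which creates the empty list the first time a key is seen. The rows below are made-up sample data standing in for tup2, not values from the CSV in the question.

```python
from collections import defaultdict

# Hypothetical (genre, year, global_sales) rows standing in for tup2.
rows = [
    ("puzzle", 1989, 30.26),
    ("puzzle", 2006, 4.0),
    ("racing", 2008, 35.82),
]

# A missing key is initialised to an empty list on first access.
D2 = defaultdict(list)
for tup2 in rows:
    genre = tup2[0]
    D2[genre].append(tup2)

print(dict(D2))
```

The third dictionary can be built the same way, keyed on publisher instead of genre.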
You have a problem with non-unique keys.
Once that problem is corrected (you need unique keys), the merge() method can be used with any of the how options (left, right, inner, ...).
The pandas merge() method is very powerful and will solve your problem.
But you need to do something about the non-unique keys first.
I suggest using the unique() method to build your own list of index values for each DataFrame. This will be only one more layer in your ETL process.
Suppose you have two DataFrames, df_a and df_b. These DataFrames share a unique key called u_key.
The merge process with these DataFrames will be something like:
import pandas as pd
...
left_merge = pd.merge(df_a, df_b, on=["u_key"], how="left")
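For a runnable version of that merge, here is a minimal sketch. The contents of df_a and df_b are invented for illustration, and the validate argument is an optional extra that is not part of the snippet above: it makes pandas raise an error if u_key turns out to be duplicated on either side.

```python
import pandas as pd

# Hypothetical frames; u_key is unique within each one.
df_a = pd.DataFrame({"u_key": [1, 2, 3], "name": ["tetris", "pong", "doom"]})
df_b = pd.DataFrame({"u_key": [1, 3], "publisher": ["nintendo", "id software"]})

# how="left" keeps every row of df_a; rows with no match in df_b get NaN.
# validate="one_to_one" raises MergeError if u_key is not unique in both frames.
left_merge = pd.merge(df_a, df_b, on=["u_key"], how="left", validate="one_to_one")
print(left_merge)
```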