Python: How to set a class's variables based on a lookup of a CSV file's column headers

Question

I have a class, ETF, that has many variables. For simplicity I have included only three below, but there are actually close to 40:

class ETF:
    def __init__(self, symbol, name, asset_class):
        self.symbol = symbol
        self.name = name
        self.asset_class = asset_class

There is another file in my project with the following code. The two #CODE NEEDED HERE comments mark the places my question is about.

import csv

# Open the file
data = open('db.csv')
csv_data = csv.reader(data) # csv.reader

# reformat it into a python object list of lists
data_lines = list(csv_data)

headers = data_lines[1] # Retrieving the column headers

# Find the Index positions in headers for each ETF class attribute
#CODE NEEDED HERE

# create ETF objects for each line in the file
for line in data_lines[2:]:
    # CODE NEEDED HERE
    # Lookup the column header based on the

I also have two spreadsheets. One is called db.csv and contains the information we will use to create ETF objects. Each row in this CSV will become its own ETF object. The column headers in the CSV file do not exactly match the variable names in the ETF class, and not every column is used. For that reason, I have a second spreadsheet called column_reference.csv, which I will use to map the column names in db.csv to the ETF variable names.

See the table below for an example of the column_reference.csv file:

Please see the image below for an example of the db.csv file:

What code would you use to most efficiently map the column headers and create the ETF objects?

Answer 1

Use pandas to create a DataFrame from the CSV, then use df.iterrows() to iterate over the rows and initialize an object from each one. You can set your custom column names by manipulating the df.columns attribute.
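As a rough illustration of this suggestion, here is a minimal sketch. It assumes pandas is installed, and the sample CSV contents, the header names ("Ticker", "Fund Name", and so on), and the two-column layout of column_reference.csv are made up for the example, since the real files are not shown. It uses DataFrame.rename rather than assigning to df.columns directly, so that unused columns can be dropped first.

```python
import io

import pandas as pd


class ETF:
    def __init__(self, symbol, name, asset_class):
        self.symbol = symbol
        self.name = name
        self.asset_class = asset_class


# Hypothetical stand-ins for db.csv and column_reference.csv.
# As in the question, the headers of db.csv sit on the second line.
DB_CSV = """Example export,,,
Ticker,Fund Name,Asset Class,Expense Ratio
AAA,Alpha Example Fund,Equity,0.10
BBB,Beta Example Fund,Bond,0.20
"""
REFERENCE_CSV = """Ticker,symbol
Fund Name,name
Asset Class,asset_class
"""

# Map each CSV header to the ETF attribute name it should become.
ref = pd.read_csv(io.StringIO(REFERENCE_CSV), header=None,
                  names=["csv_header", "attribute"])
rename_map = dict(zip(ref["csv_header"], ref["attribute"]))

# Skip the first line so the second line is read as the header row,
# keep only the mapped columns, and rename them to the attribute names.
df = pd.read_csv(io.StringIO(DB_CSV), skiprows=1, dtype=str)
df = df[list(rename_map)].rename(columns=rename_map)

# Each row becomes a dict keyed by attribute name, unpacked into ETF.
etfs = [ETF(**row.to_dict()) for _, row in df.iterrows()]
```

With real files, the io.StringIO(...) arguments would be replaced by the file paths.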

Answer 2

This is the "Pythonic way":

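import csv

# Map each db.csv column header to its ETF attribute name
with open('column_reference.csv', newline='') as columns:
    columns_dict = {row[0]: row[1] for row in csv.reader(columns)}

# Find each mapped header's position once, instead of once per row
# (headers and data_lines come from the code in the question)
col_index = {attr: headers.index(col) for col, attr in columns_dict.items()}

etfs = []
for line in data_lines[2:]:
    # Key the values by attribute name so they match ETF.__init__'s parameters
    values = {attr: line[i] for attr, i in col_index.items()}
    etfs.append(ETF(**values))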
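For comparison, the same header-to-attribute mapping can be sketched as a self-contained script with csv.DictReader, which looks values up by header name so no index bookkeeping is needed. The sample data, the header names, and the load_etfs helper are assumptions made for this example and do not come from the question's files.

```python
import csv
import io


class ETF:
    def __init__(self, symbol, name, asset_class):
        self.symbol = symbol
        self.name = name
        self.asset_class = asset_class


# Hypothetical stand-ins for db.csv and column_reference.csv.
DB_CSV = """Example export,,,
Ticker,Fund Name,Asset Class,Expense Ratio
AAA,Alpha Example Fund,Equity,0.10
BBB,Beta Example Fund,Bond,0.20
"""
REFERENCE_CSV = """Ticker,symbol
Fund Name,name
Asset Class,asset_class
"""


def load_etfs(db_file, reference_file):
    """Build ETF objects from open file objects (hypothetical helper)."""
    # CSV header -> ETF attribute name; blank lines are ignored.
    mapping = {row[0]: row[1] for row in csv.reader(reference_file) if row}
    # Discard the first line so DictReader treats the second line as headers.
    next(db_file)
    reader = csv.DictReader(db_file)
    return [
        ETF(**{attr: row[col] for col, attr in mapping.items()})
        for row in reader
    ]


etfs = load_etfs(io.StringIO(DB_CSV), io.StringIO(REFERENCE_CSV))
print(etfs[0].symbol, etfs[0].asset_class)
```

Because the objects are built with keyword arguments, the order of rows in column_reference.csv does not need to match the parameter order of ETF.__init__.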

Answer 3

I ended up using a series of nested for loops to build a list for each CSV row, to accomplish this in the shortest amount of time possible. The pandas solution was too time-consuming.

import csv
from ETF import ETF

# Read db.csv into a list of rows (each row is a list of strings)
with open('db.csv', newline='') as data:
    data_lines = list(csv.reader(data))

# Build a list of [csv_header, attribute_name] pairs from column_reference.csv
name_map = []
with open('column_reference.csv', newline='') as f:
    for row in csv.reader(f):
        name_map.append([row[0], row[1]])

# Append each mapped header's column index in db.csv to its pair
# (the headers are on the second row of the file)
for counter, header in enumerate(data_lines[1]):
    for entry in name_map:
        if entry[0] == header:
            entry.append(counter)

# Create one ETF object per data row. Values are passed positionally, so the
# row order in column_reference.csv must match ETF.__init__'s parameter order.
etfs = []
for line in data_lines[2:]:
    etf_characteristics = [line[entry[2]] for entry in name_map]
    etfs.append(ETF(*etf_characteristics))
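The loops above assume that every header listed in column_reference.csv also appears in db.csv; if one is missing, that entry never receives an index and reading it later raises an IndexError. A small check along these lines could run before the objects are created. The find_missing_headers helper and the sample values are hypothetical and only illustrate the idea.

```python
def find_missing_headers(headers, name_map):
    """Return the mapped CSV headers that are absent from the header row."""
    present = set(headers)
    return [entry[0] for entry in name_map if entry[0] not in present]


# Made-up sample values: "Category" is mapped but not present in the headers.
headers = ["Ticker", "Fund Name", "Asset Class"]
name_map = [
    ["Ticker", "symbol"],
    ["Fund Name", "name"],
    ["Category", "asset_class"],
]

missing = find_missing_headers(headers, name_map)
if missing:
    print(f"Headers not found in db.csv: {missing}")
```

Failing early with the list of missing headers tends to be easier to debug than an IndexError raised partway through the file.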


Answer 1
0 2020-09-21 19:53:55

Answer 2
0 2020-09-21 19:59:12

Answer 3
0 accepted 2020-09-21 23:11:51
