How to merge different CSV files into a new CSV file using one primary key

Question

I have two huge CSV files and want to join them into one new CSV file using Python pandas. The primary key is id_student. I can join the columns successfully, but when I write the result to a new CSV file, all of the data ends up in the first row, spread across columns. For example, row 1, column 1 holds the whole id_student column, like this:

0  12345
1  12344

Then row 1, column 2 holds the whole final_result column, in this format:

0  Pass
1  Pass

But my expected output is:

0  12345 Pass
1  12344 Pass

Is there any way I can fix the output format?

def plotlyGraph(self):

    df = pandas.read_csv('studentAssessment.csv')
    dc = pandas.read_csv('studentInfo.csv')
    res = pandas.merge(df,dc, on=['id_student'], how='outer')
    a=res['id_student']
    b=res['final_result']
    c=res['score']
    d=res['id_assessment']
    e=res['region']

    with open("new.csv", "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([a,b,c,d,e])

Answer 1

I am assuming your df has 2 columns, id_student and id_assessment, while dc has 2 columns, id_student and final_result. Try this:

import pandas

df = pandas.read_csv('studentAssessment.csv')
dc = pandas.read_csv('studentInfo.csv')

res = df.merge(dc, on=['id_student'], how='outer')
print(res)

Output

   id_student id_assessment final_result
0           0       12345          pass
1           1       12344          pass

To store the result in a CSV file:

res.to_csv("new.csv", index=False)
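For anyone wondering why the original code put everything in one row, here is a minimal sketch under stated assumptions. The two small DataFrames and their values are hypothetical stand-ins for studentAssessment.csv and studentInfo.csv, and the names `assessments`, `info`, and `wanted` are made up for illustration. A single `writerow` call always writes exactly one CSV row, so passing it whole Series objects stores each Series' printed form in one cell. `to_csv` writes one line per DataFrame row instead, and its `columns` argument can pick the five columns from the question.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data; the real files are assumed to have similar columns.
assessments = pd.DataFrame({
    "id_student": [12345, 12344],
    "id_assessment": [1752, 1752],
    "score": [78, 70],
})
info = pd.DataFrame({
    "id_student": [12345, 12344],
    "final_result": ["Pass", "Pass"],
    "region": ["Scotland", "Wales"],
})

res = assessments.merge(info, on="id_student", how="outer")

# The question's approach: one writerow call with Series objects.
# Each Series is converted with str(), so the file gets a single row
# whose cells contain the multi-line printed form of each column.
buggy = io.StringIO()
csv.writer(buggy).writerow([res["id_student"], res["final_result"]])
buggy_rows = list(csv.reader(io.StringIO(buggy.getvalue())))
print(len(buggy_rows))  # number of CSV rows the buggy approach produced

# Letting pandas write the frame: one CSV line per DataFrame row.
# With no path given, to_csv returns the CSV text as a string.
wanted = ["id_student", "final_result", "score", "id_assessment", "region"]
fixed = res.to_csv(index=False, columns=wanted)
print(fixed)
```

To write to disk instead of a string, pass a path as in the answer, for example `res.to_csv("new.csv", index=False, columns=wanted)`.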

Solution 1
1 Accepted 2019-04-19 02:31:55