Copying a specific column from a csv file to another csv in a specific place

I have tried several approaches in Python 2.7 that I found on this forum for copying a specific column from one csv file into a specific column of another csv file.

csv1:
Header1 Header2 Header3 Header4
1       2       3       4
1       2       3       4
1       2       3       4
1       2       3       4

csv2:
Header5 Header6 Header7
5       6       7
5       6       7
5       6       7

So I want to copy the column Header2 over the column Header6, resulting in the following:

csv2:
Header5 Header2 Header7
5       2       7
5       2       7
5       2       7
        2

Every header is in a different cell. I have tried the following (even writing to a third file) but did not succeed:

import csv

with open('book1.csv', 'r') as book1:
    with open('book2.csv', 'r') as book2:
        reader1 = csv.reader(book1, delimiter=',')
        reader2 = csv.reader(book2, delimiter=',')

        both = []
        fields = reader1.next() # read header row
        reader2.next() # read and ignore header row
        for row1, row2 in zip(reader1, reader2):
            row2.append(row1[-1])
            both.append(row2)

        with open('output.csv', 'w') as output:
            writer = csv.writer(output, delimiter=',')
            writer.writerow(fields) # write a header row
            writer.writerows(both)

Any ideas? :)

The values you append are added horizontally, to the end of a row. A row has no way of knowing whether the next item appended to it belongs to the adjacent column or to a column several positions over.

The way around this is to find the length of the column with the most values (the maximum column length among all columns).

In your desired "csv2" output, Header2 has the most values in its column (4 values) compared to the other headers (3 values).

What you want to do is make sure every other column has a length equal to that maximum (4 values).
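As a rough illustration of that idea, the sketch below measures the longest column and pads the others to match. The `columns` dictionary is a hypothetical in-memory stand-in for the desired csv2 from the question, the empty string is just one possible placeholder, and the code is written for Python 3.7+ (it relies on dictionaries keeping insertion order), not Python 2.7.

```python
# Columns of the desired csv2, keyed by header (values taken from the question).
columns = {
    "Header5": ["5", "5", "5"],
    "Header2": ["2", "2", "2", "2"],
    "Header7": ["7", "7", "7"],
}

# Length of the longest column.
max_len = max(len(values) for values in columns.values())

# Pad every shorter column with a placeholder so all columns have max_len values.
padded = {
    header: values + [""] * (max_len - len(values))
    for header, values in columns.items()
}

# Transpose the equal-length columns into rows, header row first.
rows = [list(padded)] + [list(row) for row in zip(*padded.values())]
```

Once every column has the same length, `zip` can turn the columns into rows without dropping the extra value at the bottom of Header2.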

You can pad each shorter column with a placeholder item so that the rows line up for the next column. Example placeholders to append at the bottom of the short columns are an empty string (""), a not-applicable string ("NA"), or a number like 0 that you don't expect to appear in any of your data set columns.
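The padding can also be done row by row while reading both files. The following is a minimal sketch, not a definitive implementation: `replace_column` and its `fill` parameter are hypothetical names, the two files are simulated with `io.StringIO` using the data from the question, and it targets Python 3, where `itertools.zip_longest` is available (Python 2.7 calls the same function `itertools.izip_longest`).

```python
import csv
import io
from itertools import zip_longest


def replace_column(src_rows, dst_rows, src_header, dst_header, fill=""):
    """Copy the column named src_header from src_rows over the column
    named dst_header in dst_rows, padding missing cells with `fill`."""
    src_idx = src_rows[0].index(src_header)
    dst_idx = dst_rows[0].index(dst_header)
    width = len(dst_rows[0])
    merged = []
    # zip_longest runs until the longer input is exhausted and
    # yields None for the input that ran out first.
    for src_row, dst_row in zip_longest(src_rows, dst_rows):
        # A missing destination row becomes a row of placeholders.
        new_row = list(dst_row) if dst_row is not None else [fill] * width
        # A missing source row leaves a placeholder in the copied column.
        new_row[dst_idx] = src_row[src_idx] if src_row is not None else fill
        merged.append(new_row)
    return merged


# In-memory stand-ins for book1.csv and book2.csv (data from the question).
csv1 = "Header1,Header2,Header3,Header4\n1,2,3,4\n1,2,3,4\n1,2,3,4\n1,2,3,4\n"
csv2 = "Header5,Header6,Header7\n5,6,7\n5,6,7\n5,6,7\n"

rows1 = list(csv.reader(io.StringIO(csv1)))
rows2 = list(csv.reader(io.StringIO(csv2)))
merged = replace_column(rows1, rows2, "Header2", "Header6")

# Write the merged rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(merged)
result = out.getvalue()
```

Because the header rows are processed like any other row, Header6 is overwritten by Header2, and the fourth value of Header2 ends up in a final row whose other cells hold the placeholder.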

Try something like:

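        for row1, row2 in zip(reader1, reader2):
            # keep the first and third values of row2, take the second from row1
            newRow = [row2[0], row1[1], row2[2]]
            # append the rebuilt row, not the untouched row2
            both.append(newRow)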

Also, I would suggest not copying someone else's code and pasting it as your own attempt. Try at least running part of the code before asking for help. It is fine to ask without including any code, but posting code you have not actually tried may confuse people, and then they can't help you.
