Append a particular column from a CSV file to another using Python

I'll explain my whole problem:
I have 2 CSV files:

  • project-table.csv (has about 50 columns)
  • interaction-matrix.csv (has about 45 columns)

I want to append the string in col[43] of project-table.csv to the string in col[1] of interaction-matrix.csv, with a dot (.) between the two strings.

Next:

  • interaction-matrix.csv has a header row.
  • Its 1st column will now hold the joined string from the step above.
  • All remaining columns contain only 0's and 1's.
  • I'm supposed to extract only those columns with 1's from interaction-matrix.csv and copy them to a new CSV file (with the first column intact).

This is the code I've come up with. I'm getting an error on the keepcols line:

import csv
reader=csv.reader(open("project-table.csv","r"))
writer=csv.writer(open("output.csv","w"),delimiter=" ")
for data in reader:
        name1=data[1].strip()+'.'+data[43].strip()
        writer.writerow((name1, None))


reader=csv.DictReader(open("interaction-matrix.csv","r"),[])
allrows = list(reader)
keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]

print keepcols
writer=csv.DictWriter(open("output1.csv","w"),fieldnames='keepcols',extrasaction='ignore')
writer.writerows(allrows)

This is the error I get:

Traceback (most recent call last):
  File "prg1.py", line 23, in ?
    keepcols = [c for c in allrows[0] if all([r[c] != '0' for r in allrows])]
NameError: name 'all' is not defined

project-table.csv and interaction-matrix.csv both have the same data in their respective 1st columns, so I just appended col[43] of project-table.csv to col[1] of the same table itself.

Edit your question to show what error message you are getting. Update: the NameError probably means that either you are using an older version of Python (which one?) that has no all(), or you have used all as a variable name and are not showing the exact code that you ran.
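If the interpreter really is older than Python 2.5 (the release that added the built-in all()), one way around the NameError is a small hand-written replacement. The sketch below is an assumption about what that could look like, not the asker's code: all_compat is a hypothetical helper name, the sample data is made up, and the demo part uses io.StringIO so that it runs on a current Python 3. It also lets the header row supply the field names (no [] argument to DictReader) and passes the keepcols list itself, rather than the string 'keepcols', to DictWriter.

```python
import csv
import io


def all_compat(iterable):
    """Hypothetical stand-in for the built-in all(): True if every item is truthy."""
    for item in iterable:
        if not item:
            return False
    return True


# Made-up sample standing in for interaction-matrix.csv.
sample = io.StringIO("name,a,b,c\nx,1,0,1\ny,1,1,1\n")

reader = csv.DictReader(sample)   # the header row supplies the keys
allrows = list(reader)

# Same condition as in the question: keep a column only if no row holds '0'.
keepcols = [c for c in reader.fieldnames
            if all_compat([r[c] != '0' for r in allrows])]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=keepcols, extrasaction='ignore')
writer.writeheader()
writer.writerows(allrows)
```

With this sample, column b is dropped because one of its rows holds '0'. Note that the condition keeps columns that contain no '0' at all; if the goal is columns with at least one '1', the test inside the list comprehension would need to change.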

Note: open both files in binary mode ("rb" for reading and "wb" for writing).
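The binary-mode advice applies to the Python 2 csv module that the question uses. As a hedged sketch for readers on Python 3, the equivalent is text mode with newline="", which leaves line-ending handling to the csv module. The temporary path and the row values below are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical output location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# Python 3: text mode plus newline="" so the csv module controls line endings.
# (On Python 2 the equivalent calls are open(path, "wb") and open(path, "rb").)
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id.label", "1", "0"])
    writer.writerow(["id2.label2", "0", "1"])

with open(path, "r", newline="") as f:
    rows = list(csv.reader(f))

# Raw bytes, to check that no extra carriage returns were written.
with open(path, "rb") as f:
    raw = f.read()
```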

You say "I want to append the string in col[43] from project-table.csv with string in col[1] of interaction-matrix.csv with a dot(.) in between both the strings". However, you are using col[2] (not col[1]) of project-table.csv (not interaction-matrix.csv, which you haven't opened at that stage).
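To make the stated intent concrete, here is a rough sketch of joining a column of one file onto a column of the other. It rests on assumptions the question does not confirm: that the two files list their rows in the same order, that the indices are zero-based, and that any header row is skipped by the caller. join_columns and its parameters are hypothetical names, and the demo uses tiny made-up rows with small indices instead of 43.

```python
import csv
import io


def join_columns(project_rows, matrix_rows, project_col=43, matrix_col=1):
    """Return copies of matrix_rows whose matrix_col value has the matching
    project row's project_col value appended, separated by a dot.
    Rows are paired by position (an assumption, see the text above)."""
    joined = []
    for p_row, m_row in zip(project_rows, matrix_rows):
        new_row = list(m_row)
        new_row[matrix_col] = (m_row[matrix_col].strip() + "."
                               + p_row[project_col].strip())
        joined.append(new_row)
    return joined


# Made-up stand-ins for project-table.csv and interaction-matrix.csv.
project = list(csv.reader(io.StringIO("p1, alpha\np2, beta\n")))
matrix = list(csv.reader(io.StringIO("0,p1,1\n1,p2,0\n")))

result = join_columns(project, matrix, project_col=1, matrix_col=1)
```

If the rows of the two files are not guaranteed to line up, pairing them by a shared key column would be safer than pairing them by position.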
