Extract columns from 3 CSVs into 1 CSV (Python)
I'm looking for a way to combine 3 CSVs that each hold their information in columns D and E, with the first row containing "X Days" headers, like this:
     D        E        F     G
1    3 Days   4 Days
2    $100     $200
3    $111     $222
4    ...      ...
5    ...      ...
I want to combine the 3 CSVs into one, but leave a blank column between them, like this:
     Data from 1st CSV:      Data from 2nd CSV:      Data from 3rd CSV:
     D        E        F     G        H        I     J        K
1    3 Days   4 Days         3 Days   4 Days         3 Days   4 Days
2    $100     $200           $300     $400           $500     $600
3    $111     $222           $333     $444           $555     $666
4    ...      ...            ...      ...            ...      ...
5    ...      ...            ...      ...            ...      ...
How can I combine them like this? (Without the "Data from X CSV" labels.)
So, you just open all of your CSV files in parallel. As long as they keep feeding input, you create new rows on output. This code stops as soon as the first of the CSV files runs out of rows, that is, at the end of the shortest file. If you need to run until the end of the longest file, that will take more work. You'd use this like:
python merge.py aaa.csv bbb.csv ccc.csv
You could use glob to allow wildcards.
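A minimal sketch of the glob idea, mainly useful on shells that do not expand wildcards themselves. The helper name expand_patterns is hypothetical, not part of the original answer; it turns each command-line argument into the sorted list of files it matches.

```python
import glob

def expand_patterns(patterns):
    """Expand each wildcard pattern into matching file names, keeping argument order."""
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # Keep the argument unchanged if nothing matches, so a missing file
        # still produces a clear error when it is opened later.
        files.extend(matches or [pattern])
    return files
```

In the script below, expand_patterns(sys.argv[1:]) would then take the place of sys.argv[1:] where the readers are created.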
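import sys
import csv

# One reader per input file named on the command line
fls = [csv.reader(open(f, newline='')) for f in sys.argv[1:]]
fout = csv.writer(open('out.csv', 'w', newline=''))
try:
    while True:
        newrow = []
        for i, f in enumerate(fls):
            row = next(f)  # raises StopIteration when this file runs out
            if i == 0:
                # first file: keep columns A-E, then a blank spacer column
                newrow.extend(row[0:5] + [''])
            else:
                # other files: keep only columns D-E, then a blank spacer column
                newrow.extend(row[3:5] + [''])
        fout.writerow(newrow)
        print(newrow)
except StopIteration:
    print('Done')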
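If you do need to run until the longest file ends, one possible approach is itertools.zip_longest, which keeps yielding rows and hands back None for any reader that has already finished. The sketch below is an assumption about how that could look, not part of the original answer; the function name merge_csvs and the padding of short rows are illustrative choices.

```python
import csv
from itertools import zip_longest

def merge_csvs(paths, out_path="out.csv"):
    """Merge CSVs side by side, running until the longest input is exhausted."""
    files = [open(p, newline="") for p in paths]
    try:
        readers = [csv.reader(f) for f in files]
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out)
            # zip_longest yields None for readers that have already ended
            for rows in zip_longest(*readers):
                newrow = []
                for i, row in enumerate(rows):
                    row = row or []
                    # first file: columns A-E; other files: columns D-E only
                    cells = row[0:5] if i == 0 else row[3:5]
                    width = 5 if i == 0 else 2
                    # pad short or missing rows so later files stay in their columns
                    cells = cells + [""] * (width - len(cells))
                    newrow.extend(cells + [""])  # blank spacer column
                writer.writerow(newrow)
    finally:
        for f in files:
            f.close()
```

Calling merge_csvs(sys.argv[1:]) would give the same command-line usage as the script above. Padding matters here: without it, a file that ended early would shift the remaining files' values to the left.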
Read all three CSVs into DataFrames and then use pd.concat(..., axis=1) to concatenate them as new columns.
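import pandas as pd

# file1, file2, file3 and filename are placeholders for your own paths
df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)
df3 = pd.read_csv(file3)
df1['F'] = ''  # blank spacer column after the first file's data
df2['I'] = ''  # blank spacer column after the second file's data
# note: the spacer columns are written out with the header labels 'F' and 'I'
df_final = pd.concat([df1, df2, df3], axis=1)
df_final.to_csv(filename, index=False)  # index=False keeps the row index out of the file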
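The snippet above keeps every column of each file. A variant that pulls out only columns D and E might look like the sketch below. It assumes D and E sit at zero-based positions 3 and 4 and that the "X Days" row should be carried over as ordinary data; the function name merge_columns and the spacer column label are hypothetical. Unlike the layout in the question, the output starts at the first column rather than at column D.

```python
import pandas as pd

def merge_columns(paths, out_path):
    """Write columns D and E of each CSV side by side, one blank column apart."""
    parts = []
    for i, path in enumerate(paths):
        # header=None: treat the "X Days" row as ordinary data
        # usecols=[3, 4]: zero-based positions of columns D and E
        df = pd.read_csv(path, header=None, usecols=[3, 4], dtype=str)
        if i > 0:
            # blank spacer column placed before every file except the first
            parts.append(pd.DataFrame("", index=df.index, columns=["spacer"]))
        parts.append(df)
    merged = pd.concat(parts, axis=1)
    # header=False and index=False: write only the cell values
    merged.to_csv(out_path, header=False, index=False)
```

For example, merge_columns(["aaa.csv", "bbb.csv", "ccc.csv"], "out.csv") would write six data columns separated by two blank ones.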