Transposing data from a csv file to another csv file using Python or MATLAB
I am working on data with four columns and 912500 rows in csv format. I need to transpose the data in each column into 365 columns and 2500 rows in a separate csv file, e.g.
Col1    Col2  Col3  Col4
1       33    36    38
2       25    18    56
.
365     -4    -3    10
366     -11   20    35
367     12    18    27
.
730     26    36    27
.
.
912500  20    37    42
Desired output
Col1  Col2  Col3  Col4  Col5 ..... Col365
1     33    25   ................  -4
2     -11   12   ................  26
3
4     ................
5     ................
.
2500  ................
Please advise me on how to write a script for this. Any help will be highly appreciated.
Try using NumPy as suggested in the comments, but, just in case you want to code it yourself, here's one approach you could take:

- You can read the file one line at a time
- Split each line using the comma as the separator
- Discard the "row count" (the first element of the list you get as a result of the split operation). You will have to maintain your own row count.
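The steps above can be sketched as follows. This is a minimal illustration, not the answer's actual code: the function name, file names, and the choice to handle one value column at a time are my own.

```python
def transpose_column(in_path, out_path, col=1, width=365):
    """Read in_path line by line, keep one value column, and write it
    back out as rows of `width` comma-separated values."""
    with open(in_path) as infile, open(out_path, 'w') as outfile:
        buffer = []
        for line in infile:
            fields = line.strip().split(',')  # split on the comma separator
            buffer.append(fields[col])        # fields[0] is the row count: discarded
            if len(buffer) == width:          # a full output row is ready
                outfile.write(','.join(buffer) + '\n')
                buffer = []
```

Buffering only one output row at a time keeps memory use flat even for 912500 input rows.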
csv.reader will create an iterator that reads the csv row by row. You can then feed that into itertools.chain, which iterates each row in turn, outputting individual columns. Now that you have a stream of columns, you can group them into new rows of the size you want. There are several ways to rebuild those rows; I used itertools.groupby in my example.
import itertools
import csv

def groupby_count(iterable, count):
    # Label consecutive items 0,0,...,1,1,... so groupby chunks them
    counter = itertools.count()
    for _, grp in itertools.groupby(iterable, lambda _: next(counter) // count):
        yield tuple(grp)

def reshape_csv(in_filename, out_filename, colsize):
    with open(in_filename) as infile, open(out_filename, 'w') as outfile:
        reader = csv.reader(infile, delimiter=' ')
        writer = csv.writer(outfile, delimiter=' ')
        # flatten the rows into one stream of fields, then regroup
        # that stream into new rows of colsize fields each
        col_iter = itertools.chain.from_iterable(reader)
        writer.writerows(groupby_count(col_iter, colsize))
And here's an example script to test it. I used fewer columns, though:
import os

infn = "intest.csv"
outfn = "outtest.csv"
orig_colsize = 4
new_colsize = 15

# create a test input file
with open(infn, "w") as infp:
    for i in range(32):
        infp.write(' '.join('c{0:02d}_{1:02d}'.format(i, j) for j in range(orig_colsize)) + '\n')

# remove stale output file
try:
    os.remove(outfn)
except OSError:
    pass

# run it and print the result
reshape_csv(infn, outfn, new_colsize)
print('------- test output ----------')
print(open(outfn).read())
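For completeness, the NumPy route mentioned at the top could be sketched roughly like this. The function name and file names are placeholders of mine, and unlike the streaming approach above it loads the entire file into memory (fine for ~1M rows of 4 columns):

```python
import numpy as np

def reshape_with_numpy(in_path, rows=2500, cols=365):
    """Load the whole csv, drop the row-count column, and write each
    remaining column to its own file reshaped to rows x cols."""
    data = np.loadtxt(in_path, delimiter=',')    # shape (rows*cols, 4)
    for nc in range(1, data.shape[1]):           # column 0 is the row count
        col = data[:, nc].reshape(rows, cols)    # 912500 == 2500 * 365
        np.savetxt('col%d.csv' % nc, col, fmt='%g', delimiter=',')
```

The reshape only works because 2500 * 365 exactly equals the 912500 input rows; NumPy will raise an error otherwise, which is a useful sanity check.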
What follows was tested against a fake data file; it worked OK for me, but YMMV... please see the inline comments for a description of the workings.
import csv

# we open the data file and put its content in data, that is a list of lists
with open('data.csv') as csvfile:
    data = [row for row in csv.reader(csvfile)]

# the following idiom transposes a list of lists
transpose = zip(*data)

# I use Python 3, hence zip is a generator and I have to throw away
# the first element, i.e., the column of the row numbers, using next()
next(transpose)

# I enumerate transpose, obtaining the data column by column
for nc, column in enumerate(transpose):
    # I prepare for writing to a csv file
    with open('trans%d.csv' % nc, 'w') as outfile:
        writer = csv.writer(outfile)
        # here, we have an idiom, sort of..., please see
        # http://stupidpythonideas.blogspot.it/2013/08/how-grouper-works.html
        # for the reason why what we enumerate are the rows of your output file
        for nr, row in enumerate(zip(*[iter(column)] * 365)):
            writer.writerow([nr + 1, *row])
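The `zip(*[iter(column)] * 365)` trick in that loop is the classic "grouper" idiom. In isolation (with a group size of 3 instead of 365 for readability) it behaves like this:

```python
# The grouper idiom: the list holds 3 references to the *same* iterator,
# so each zip step pulls 3 consecutive items into one output tuple.
column = ['a', 'b', 'c', 'd', 'e', 'f']
groups = list(zip(*[iter(column)] * 3))
print(groups)   # [('a', 'b', 'c'), ('d', 'e', 'f')]
```

Note that zip stops at the shortest iterator, so any trailing items that do not fill a complete group are silently dropped; with exactly 912500 values per column this never happens here.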