Python convert single column of data into multiple columns

I have a .txt file containing simple numerical data. The data are repeated measurements of the same thing, written out in one long column. I want a script that reads through the file, recognises the delimiter separating one experiment from the next, and writes each experiment to its own column in a txt or csv file.

At the moment, the data is delimited by the flag '# row = X', where X runs from 0 to ~128. So I want a script that opens the file, reads up to 'row = 0', and copies the next ~1030 lines of data to a list/array as "column 0". When it hits 'row = 1', it should copy the next ~1030 lines of numbers to "column 1", and so on. Finally, it should write everything out as multiple columns. The input data file looks like this:

# row = 0
9501.7734375
9279.390625
[..and so on for about 1030 lines...]
8836.5
8615.1640625
# row = 1
4396.1953125
4197.1796875
[..and so on for about 1030 lines...]
3994.4296875
# row = 2
9088.046875
8680.6953125
[..and so on for about 1030 lines...]
8253.0546875

The final file should look something like this:

row0          row1         row2       row3
9501.7734375  4396.1953125 etc        etc
9279.390625   4197.1796875
[..snip...]   [...snip...]
8836.5        3994.4296875
8615.1640625  3994.4347453

Preferably Python, as I have some experience from some years ago! Thanks everyone, Jon

from io import StringIO
from collections import OrderedDict
from itertools import zip_longest

# Sample data standing in for the real file; use open(path) to read from disk.
datastring = StringIO(u"""\
# row = 0
9501.7734375
9279.390625
8615.1640625
# row = 1
4396.1953125
4197.1796875
3994.4296875
# row = 2
9088.046875
8680.6953125
8253.0546875
""")

out = OrderedDict()  # header -> list of values, in file order
header = None

for line in datastring:
    line = line.strip()
    if line.startswith('# row'):
        # '# row = 0' -> 'row = 0'
        header = line.lstrip('# ')
        out[header] = []
    elif line and header is not None:
        out[header].append(line)

# Put each header on top of its values, then transpose columns into rows.
# zip_longest pads shorter columns with '' instead of truncating them.
columns = [[k] + v for k, v in out.items()]
rows = zip_longest(*columns, fillvalue='')

with open("C:/temp/output.csv", 'w') as fout:
    for row in rows:
        fout.write('\t'.join(row) + '\n')

Output:

 row = 0         row = 1        row = 2
9501.7734375    4396.1953125    9088.046875
9279.390625     4197.1796875    8680.6953125
8615.1640625    3994.4296875    8253.0546875
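
To run the same idea against a file on disk rather than an in-memory sample, the logic can be wrapped in a function. This is a minimal sketch and not part of the original answer: the function name `split_columns` and its parameters are hypothetical, and it assumes every block starts with a `# row = X` line. It writes comma-separated output with the standard `csv` module and pads shorter columns with empty cells, since the question says the blocks are only about 1030 lines each.

```python
import csv
from itertools import zip_longest


def split_columns(in_path, out_path, flag="# row"):
    """Read a single-column file and write one CSV column per flagged block.

    `split_columns`, `in_path`, `out_path` and `flag` are illustrative names,
    not part of the original answer. Returns the number of columns written.
    """
    columns = []  # each entry is [header, value, value, ...]
    with open(in_path) as fin:
        for raw in fin:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(flag):
                # '# row = 0' -> 'row = 0'; start a new column
                columns.append([line.lstrip("# ")])
            elif columns:
                # data line: belongs to the most recent block
                columns[-1].append(line)
            # any lines before the first flag are ignored

    with open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout)
        # transpose columns into rows, padding short columns with ''
        writer.writerows(zip_longest(*columns, fillvalue=""))
    return len(columns)
```

Calling `split_columns("data.txt", "output.csv")` (both paths are placeholders) would then produce one column per `# row = X` block. The values stay as strings throughout, so nothing is lost to float formatting on the way out.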
