Python: parsing a line of text

Question

I have the following input from a text file:

Title Value Position Perturbation 1.5 0.6 8.5 9.8 0 8.5 9.6 0.5 0.6 (...)

Title Value Position Perturbation 3 1.5 6 0 0.8 9.7 5.3 9.9 0.7 0.9 (...)

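I want to remove the first 4 columns (the labels). Then, for the numeric columns, I want to take the values in subsets of 4, swap the second and third values of each subset, and drop the fourth, so the output should look like this: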

1.5 8.5 0.6 0 9.6 8.5 0.6 (...)
3 6 1.5 0.8 5.3 9.7 0.7 (...)

To do this, I wrote the following Python code:

import sys

input_file= open (sys.argv[1],'r')
output_file= open (sys.argv[2], 'w')
with open(sys.argv[1]) as input_file:
    for i, line in enumerate(input_file):
        output_file.write ('\n')
        marker_info= line.split()
        #snp= marker_info[0]
        end= len(marker_info)   
        x=4
        y=8
        # while y<=len(marker_info):
        while x<=end:
            intensities= marker_info[x:y]
            AA= intensities[0]
            BB= intensities[1]
            AB= intensities[2]
            NN= intensities[3]
            output_file.write ('%s' '\t' '%s' '\t' '%s' '\t' % (AA, AB, BB))
            x= y 
            y= x + 4
input_file.close()
output_file.close()

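The code seems to work, but the problem is that the last four values are missing from every line. So I guess the problem is in the "while" statement, but I have no idea how to fix it (I know it looks like a simple problem).

Thanks in advance for any suggestions.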

Answer 1

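Try this:
1. Open the file as if it were a CSV and strip the labels
2. Generate sublists of the desired size
3. Do the swap and drop the trailing element
4. Save the output (I collected it in a list, but the same can be done with an output file)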

>>> import csv
>>> output = []
>>> with open('sample.csv') as infile:  # avoid shadowing the built-in input()
...     reader = csv.reader(infile, delimiter=' ')
...     for line in reader:
...         line = line[4:] #strip labels
...         slice_size = 4
...         for slice_idx in range(0,len(line),slice_size):
...             sublist = line[slice_idx : slice_idx+slice_size]
...             if len(sublist) == slice_size:
...                 swap = sublist[2]
...                 sublist[2] = sublist[1]
...                 sublist[1] = swap
...                 output.append(sublist[:slice_size-1])
... 
>>> 
>>> output
[['1.5', '8.5', '0.6'], ['0', '9.6', '8.5'], ['3', '6', '1.5'], ['0.8', '5.3', '9.7']]
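Note that the list above is flat, so the boundary between input lines is lost. If one output line per input line is wanted, as in the question, something along these lines might work. This is only a sketch: the names `reorder_line` and `convert` and the parameters `n_labels` and `group_size` are made up for illustration, and it uses plain `str.split()` rather than the `csv` module.

```python
def reorder_line(line, n_labels=4, group_size=4):
    """Drop the label columns, then turn each full group (a, b, c, d) into (a, c, b)."""
    values = line.split()[n_labels:]
    out = []
    # Step through the values one group at a time; a trailing partial group is skipped.
    for start in range(0, len(values) - group_size + 1, group_size):
        a, b, c, _ = values[start:start + group_size]
        out.extend([a, c, b])
    return out


def convert(in_path, out_path):
    """Write one tab-separated output line per input line (hypothetical helper)."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            outfile.write("\t".join(reorder_line(line)) + "\n")


sample = "Title Value Position Perturbation 1.5 0.6 8.5 9.8 0 8.5 9.6 0.5"
print(reorder_line(sample))  # ['1.5', '8.5', '0.6', '0', '9.6', '8.5']
```

Calling `convert("sample.csv", "out.txt")` would then process the whole file; both file names are placeholders.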

Answer 2

Try this. It is based entirely on your script, except for the while condition and the way the files are opened. Input file:

Title Value Position Perturbation 1.5 0.6 8.5 9.8 0 8.5 9.6 0.5 0.6 1.1 2.2 3.3
Title Value Position Perturbation 3 1.5 6 0 0.8 9.7 5.3 9.9 0.7 0.9 1.1 2.2
Title Value Position Perturbation 3.1 2.5 1.6 0 1.8 2.7 4.3 6.9 3.7 1.9 2.1 3.2

Script:

with open("parser.txt", "r") as input_file, open("output_parser.txt","w") as output_file:
    for i, line in enumerate(input_file):
        output_file.write ('\n')
        marker_info= line.split()
        end= len(marker_info)
        x=4
        y=8

        while y<=end: #x<=end:
            intensities= marker_info[x:y]
            AA= intensities[0]
            BB= intensities[1]
            AB= intensities[2]
            NN= intensities[3]
            output_file.write ('%s' '\t' '%s' '\t' '%s' '\t' % (AA, AB, BB))
            print(end, x, y, marker_info[x:y], AA, AB, BB)  # debug trace (Python 3 syntax)

            x= y 
            y= x + 4

Output:

1.5 8.5 0.6 0   9.6 8.5 0.6 2.2 1.1 
3   6   1.5 0.8 5.3 9.7 0.7 1.1 0.9 
3.1 1.6 2.5 1.8 4.3 2.7 3.7 2.1 1.9
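To see why the loop condition matters, it may help to list the `(x, y)` windows each condition visits. The sketch below uses a shortened sample row and a hypothetical helper `windows`, which only mirrors the index arithmetic of the loop. With `x <= end` the loop enters one extra window that starts at the end of the list, where the slice is empty, so `intensities[0]` would raise an `IndexError`. With `y <= end` it stops after the last full window.

```python
marker_info = "Title Value Position Perturbation 1.5 0.6 8.5 9.8 0 8.5 9.6 0.5".split()
end = len(marker_info)  # 4 labels + 8 values = 12


def windows(end, use_y):
    """Return the (x, y) windows the while loop visits under either condition."""
    x, y = 4, 8
    visited = []
    while (y <= end) if use_y else (x <= end):
        visited.append((x, y))
        x, y = y, y + 4
    return visited


print(windows(end, use_y=False))  # [(4, 8), (8, 12), (12, 16)]
print(windows(end, use_y=True))   # [(4, 8), (8, 12)]
print(marker_info[12:16])         # [] : indexing this slice would fail
```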

Answer 1
2 2014-10-08 08:12:26

Answer 2
0 Accepted 2014-10-08 09:34:24
