
Python print .psl format without quotes and commas

I am working on a Linux system using Python 3 with a file in .psl format, which is common in genetics. This is a tab-separated file in which some cells contain comma-separated values. A small example file with some of the features of a .psl file is below.

input.psl

1 2 3 x read1 8,9, 2001,2002,
1 2 3 mt read2 8,9,10 3001,3002,3003
1 2 3 9 read3 8,9,10,11 4001,4002,4003,4004
1 2 3 9 read4 8,9,10,11 4001,4002,4003,4004

I need to filter this file to extract only regions of interest. Here, I extract only rows with a value of 9 in the fourth column.

import csv

def read_psl_transcripts():
    psl_transcripts = []
    with open("input.psl") as input_psl:
        csv_reader = csv.reader(input_psl, delimiter='\t')
        # Iterate over the parsed rows (lists of fields), not the raw file lines
        for line in csv_reader:
            # Extract only rows matching the chromosome of interest
            if line[3] == '9':
                psl_transcripts.append(line)
    return psl_transcripts

I then need to print or write the selected lines in a tab-delimited format that matches the input file, with no additional quotes or commas. I can't seem to get this part right: extra brackets, quotes and commas are always added. Below is an attempt using print().

outF = open("output.psl", "w")
for line in read_psl_transcripts():
    print(str(line).strip('"\''), sep='\t')

Any help is much appreciated. Below is the desired output.

1 2 3 9 read3 8,9,10,11 4001,4002,4003,4004
1 2 3 9 read4 8,9,10,11 4001,4002,4003,4004

You might be able to solve your problem with a simple awk statement.

awk '$4 == 9' input.psl > output.psl

With Python, you could solve it like this:

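# Open both files in one with statement so they are closed automatically
with open("input.psl") as in_file, open("output.psl", "w") as out_file:
    for line in in_file:
        # split() breaks on any whitespace, so this assumes no field contains spaces
        fields = line.split()
        # Keep rows whose fourth column is '9'; skip lines with too few fields
        if len(fields) > 3 and fields[3] == '9':
            # Rejoin with tabs so the output matches the input format
            out_file.write('\t'.join(fields) + "\n")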
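
As for why the question's attempt adds brackets, quotes and commas: each row returned by csv.reader is a list of strings, and str() on a list returns its repr, for example ['1', '2', ...]. Below is a minimal sketch of an alternative that splits on tabs only and lets print() join the fields, assuming the file is strictly tab-delimited. The function name filter_psl and its parameters are hypothetical, chosen only for illustration.

```python
import csv


def filter_psl(in_path, out_path, column=3, value="9"):
    """Copy rows whose field at index `column` equals `value`.

    Hypothetical helper: assumes a strictly tab-delimited file.
    Returns the number of rows written.
    """
    kept = 0
    # newline="" lets the csv module handle line endings on input;
    # newline="\n" writes plain "\n" line endings on output.
    with open(in_path, newline="") as src, open(out_path, "w", newline="\n") as dst:
        # QUOTE_NONE treats quote characters as ordinary text
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank or short lines instead of raising IndexError
            if len(row) > column and row[column] == value:
                # Unpack the list so print() joins the fields with tabs,
                # rather than printing the list's repr
                print(*row, sep="\t", file=dst)
                kept += 1
    return kept

# Usage (assumes input.psl exists in the working directory):
# filter_psl("input.psl", "output.psl")
```

Because the fields are only split and rejoined on tabs, the comma-separated cells, including trailing commas such as 8,9, are written back unchanged.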
