How to change values from a file using a dictionary in Python

Question

I'm doing a biology degree and feel like I've been thrown in at the deep end with Python, as I've never coded before and the 'teaching' was pretty much non-existent. Anyway, they've given us this file of gene sequences, which looks something like this:

En123, ATGCCGAATA

En124, ATGCCAGTAT

but much longer, with many more genes. They want it converted into a protein sequence. So far, I've got:

with open('DNA_sequences.csv', 'r') as f:
    for line in f:
        columns = line.rstrip("\n").split(",")  # remove the end-of-line character and split at commas to produce a list
        ensemblID = columns[0]  # the Ensembl ID is the first element in the list
        gene_sequence = columns[1]  # the gene sequence is the second element in the list

I wasn't sure whether I needed the columns or not.

I've also made a dictionary for the protein sequence, mapping each codon to the corresponding amino acid.

protein_sequence = {'TTT': 'F', 'CTT': 'L', 'GAT': 'D'}  # etc.

So I'm wondering how to split the gene sequence in my file into codons and then pass them through the dictionary, so that I get the sequence of amino acid names.

i.e. gene_sequence = TTTCTTGAT to protein_sequence = FLD

(Sorry for being so incompetent!)

Answer 1

To load the CSV, I'd use the csv module like so:

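import csv

filepath = 'DNA_sequences.csv'  # the file name used in the question

# newline='' is the mode the csv module documentation recommends for files passed to csv.reader
with open(filepath, newline='') as csvFile:
    reader = csv.reader(csvFile)
    data = list(reader)  # one list per row, e.g. [ID, sequence]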
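One detail worth checking: the sample lines in the question have a space after the comma, so the sequence column would come back with a leading space. A minimal sketch of how `csv.reader`'s `skipinitialspace` option deals with that, using `io.StringIO` as a stand-in for the real file (the name `sample` is only for illustration):

```python
import csv
import io

# Stand-in for the real file, using the two sample lines from the question.
sample = io.StringIO("En123, ATGCCGAATA\nEn124, ATGCCAGTAT\n")

# Default behaviour: the space after the comma stays in the second column.
plain = list(csv.reader(sample))

# Rewind and read again, this time dropping whitespace that follows a delimiter.
sample.seek(0)
trimmed = list(csv.reader(sample, skipinitialspace=True))

print(plain[0])    # ['En123', ' ATGCCGAATA']
print(trimmed[0])  # ['En123', 'ATGCCGAATA']
```

A leading space matters here because the first codon would otherwise be sliced as `' AT'` and would not match any key in the dictionary.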

Then, to convert the gene sequence:

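geneSeq = "TTTCTTGAT"

# split the sequence into codons (chunks of three bases)
codons = [geneSeq[i:i+3] for i in range(0, len(geneSeq), 3)]

# look up each codon in the protein_sequence dictionary from the question
proteinSequenceString = ""
for codon in codons:
    proteinSequenceString += protein_sequence[codon]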
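The two steps above can be combined so that every row is translated in one go. This is only a sketch under some assumptions: the helper name `translate`, the `'X'` placeholder for codons missing from the dictionary, and the rows in `data` are all made up for illustration. An incomplete codon at the end of a sequence is simply dropped.

```python
# Partial codon table from the question; a real table needs all 64 codons.
protein_sequence = {'TTT': 'F', 'CTT': 'L', 'GAT': 'D'}


def translate(gene_sequence, table):
    """Translate a DNA string codon by codon, ignoring an incomplete trailing codon."""
    usable = len(gene_sequence) - len(gene_sequence) % 3
    # 'X' marks a codon that is missing from the table (an illustrative choice).
    return ''.join(table.get(gene_sequence[i:i+3], 'X')
                   for i in range(0, usable, 3))


# Rows shaped like the output of csv.reader; these values are invented.
data = [['En001', 'TTTCTTGAT'], ['En002', ' GATTTTCTTA']]

# Map each ID to its translated sequence, stripping stray whitespace first.
proteins = {row[0]: translate(row[1].strip(), protein_sequence)
            for row in data if row}

print(proteins)  # {'En001': 'FLD', 'En002': 'DFL'}
```

Using `table.get(...)` instead of `table[...]` avoids a `KeyError` while the dictionary is still incomplete; once all codons are present, a plain lookup would surface bad input sooner.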

Answer 2

You can iterate over gene_sequence in chunks of 3 and look up the codons in your dictionary:

>>> gene_sequence = 'TTTCTTGAT'
>>> protein_sequence = {'TTT': 'F', 'CTT': 'L', 'GAT': 'D'}
>>> ''.join(protein_sequence[gene_sequence[i:i+3]] for i in range(0, len(gene_sequence), 3))
'FLD'
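The three-entry dictionary above only covers the example. If typing out all 64 codons by hand is unappealing, the table can be generated instead. The sketch below assumes the standard genetic code with `*` standing for a stop codon; the names `BASES`, `AMINO_ACIDS`, and `codon_table` are illustrative, and the sequence length is assumed to be a multiple of three.

```python
from itertools import product

BASES = 'TCAG'
# Amino acids of the standard genetic code, listed in the order produced by
# product(BASES, repeat=3): TTT, TTC, TTA, TTG, TCT, ... ('*' = stop codon).
AMINO_ACIDS = ('FFLLSSSSYY**CC*W'
               'LLLLPPPPHHQQRRRR'
               'IIIMTTTTNNKKSSRR'
               'VVVVAAAADDEEGGGG')

# Pair every three-base combination with its amino acid.
codon_table = {''.join(codon): amino_acid
               for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)}

gene_sequence = 'TTTCTTGAT'
protein = ''.join(codon_table[gene_sequence[i:i+3]]
                  for i in range(0, len(gene_sequence), 3))
print(protein)  # FLD
```

Whether stop codons should end the translation or appear as `*` in the output depends on what the assignment expects, so that choice is left open here.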

Answer 1
0 2016-11-25 15:23:35

Answer 2
0 2016-11-25 15:23:39
