Accumulating values to corresponding key in a dictionary?
I am trying to locate the positions of bases (A, C, G, T) and put them into a dictionary keyed by position.
I am working from a text file that has lines of bases like the ones below:
----T
C
-C
-----G
C
-----C
---T
----A
----C
-----G
From the information above, I know that:
C is at the 1st position
C is at the 2nd position
the base at the 3rd position is unknown
T is at the 4th position
C, A, T are at the 5th position
C, G are at the 6th position
So far, I have written the code below:
def chunks(chunks_file):
    set_bases = {}
    with open(chunks_file) as file:
        for line in file:
            for character in line:
                if character.isalpha():
                    letter = character
                    position = line.find(letter) + 1
                    set_bases[position] = {letter}
    return set_bases
My current output is:
{5: {'C'}, 1: {'C'}, 2: {'C'}, 6: {'G'}, 4: {'T'}}
whereas the desired output would be:
{1: {'C'}, 2: {'C'}, 4: {'T'}, 5: {'C', 'A', 'T'}, 6: {'C', 'G'}}
It seems to me that values are not being added to the already existing keys; instead, each new value replaces the old one.
How can I solve this problem?
You can do it the following way, taking into consideration that you have a txt file:
outDict = {}
with open('data.txt', 'r') as inFile:
    lines = [line.strip() for line in inFile if not line == '\n']
outDict = dict((str(line.count('-')+1),set()) for line in lines)
for line in lines:
    outDict[str(line.count('-')+1)].update(line[-1])
print(outDict)
Result:
{'5': {'C', 'A', 'T'}, '1': {'C'}, '2': {'C'}, '6': {'C', 'G'}, '4': {'T'}}
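Note that the keys in this result are strings, while the desired output uses integer positions. Below is a minimal sketch of the same counting idea with integer keys. It runs on an in-memory list instead of a file, and the names `sample_lines` and `positions` are made up for illustration. It assumes every line is zero or more '-' characters followed by exactly one base letter.

```python
# Sample data copied from the question (hypothetical variable name).
sample_lines = ["----T", "C", "-C", "-----G", "C",
                "-----C", "---T", "----A", "----C", "-----G"]

positions = {}
for line in sample_lines:
    # Assumption: the line is dashes followed by a single base letter,
    # so the number of dashes plus one is the 1-based position.
    position = line.count('-') + 1
    # setdefault returns the existing set, or stores and returns a new one.
    positions.setdefault(position, set()).add(line[-1])

for position in sorted(positions):
    print(position, sorted(positions[position]))
```

Sorting the keys and the set contents when printing keeps the display order stable, since sets themselves have no guaranteed order.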
I can suggest the following improvements:
import collections

def chunks(filename):
    bases = collections.defaultdict(set)
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if len(line) > 0:
                for i, char in enumerate(line):
                    if char.isalpha():
                        position = i + 1
                        bases[position].add(char)
    return bases
Use collections.defaultdict so you don't have to check whether the position is already present in the dict.
Use enumerate() when iterating over the characters of each line, so you already have the position and don't need to call line.find().
This code can be used as follows:
>>> d = chunks('your-file-name.txt')
>>> d
defaultdict(<class 'set'>, {5: {'T', 'C', 'A'}, 1: {'C'}, 2: {'C'}, 6: {'G', 'C'}, 4: {'T'}})
>>> dict(d)
{5: {'C', 'A', 'T'}, 1: {'C'}, 2: {'C'}, 6: {'G', 'C'}, 4: {'T'}}
>>> for k, v in sorted(d.items()):
...     print(k, v)
1 {'C'}
2 {'C'}
4 {'T'}
5 {'C', 'A', 'T'}
6 {'G', 'C'}
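To try the defaultdict approach without creating a file, here is a small sketch that feeds the sample lines from the question through io.StringIO. The helper name `collect_bases` and the variable `sample_text` are hypothetical; the function accepts any iterable of lines, which is an assumption made here so that an open file and a StringIO object can be used the same way.

```python
import collections
import io

def collect_bases(lines):
    """Map each 1-based position to the set of letters seen there."""
    bases = collections.defaultdict(set)
    for line in lines:
        for index, char in enumerate(line.strip()):
            if char.isalpha():
                # A missing key is created as an empty set on first access.
                bases[index + 1].add(char)
    return bases

# Sample data copied from the question.
sample_text = "----T\nC\n-C\n-----G\nC\n-----C\n---T\n----A\n----C\n-----G\n"
result = collect_bases(io.StringIO(sample_text))

for position, letters in sorted(result.items()):
    print(position, sorted(letters))
```

Converting with dict(result) before returning or comparing is optional; a defaultdict compares equal to a plain dict with the same contents.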
Try something like this:
def chunks(chunks_file):
    set_bases = {}
    with open(chunks_file) as file:
        for line in file:
            for character in line:
                if character.isalpha():
                    letter = character
                    position = line.find(letter) + 1
                    if position in set_bases:
                        set_bases[position].append(letter)
                    else:
                        set_bases[position] = [letter]
    return set_bases
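Two things to keep in mind with this version: it stores lists, so a base that appears several times at one position is recorded several times (position 1 would come out as ['C', 'C'] for the sample data), and line.find(letter) returns the first occurrence of the letter, which would give the wrong position if the same letter appeared twice on one line. A possible variant that keeps the same membership check but uses sets and enumerate() is sketched below. The name `chunks_from_lines` is made up, and it takes a list of lines rather than a file name so that it can be run as is.

```python
def chunks_from_lines(lines):
    set_bases = {}
    for line in lines:
        for index, character in enumerate(line):
            if character.isalpha():
                position = index + 1
                if position in set_bases:
                    # Key already exists: grow the set instead of replacing it.
                    set_bases[position].add(character)
                else:
                    # First base seen at this position.
                    set_bases[position] = {character}
    return set_bases

# Sample data copied from the question.
sample = ["----T", "C", "-C", "-----G", "C",
          "-----C", "---T", "----A", "----C", "-----G"]
bases = chunks_from_lines(sample)
print(sorted(bases))  # positions that have at least one base
```

The original problem came from `set_bases[position] = {letter}`, which assigns a brand-new set to the key every time; calling add() on the existing set is what accumulates the values.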