
How can I add all keys' values and print the new dictionary?

I have a file, file1, that holds region information, such as chromosome 1 from position 1 to position 10. It looks like:

chromosome,start_position,end_position
1,1,10
1,11,20

A second file, file2, holds values for positions, such as position 6 on chromosome 1 with some value. It looks like:

chromosome,position,value
1,1,value1
1,2,value2
1,6,value3
1,13,value4

I want to add the values in file2 to file1, based on whether each position in file2 falls within a region in file1, something like:

chromosome,start_position,end_position,total_value
1,1,10,value1+value2+value3
1,11,20,value4

Both files can be more than 10 million lines. Should I do this by going through every line of file2 (to see whether the position falls in any region of file1), or by turning every line of file1 into a dictionary entry (then looking up the values in file2 and adding them)?

And how can I get the total value for every line in file1? Thanks, everyone!

I'm presuming that you're not necessarily looking for the most efficient code, but for code that gets the job done.

I would read the values in file 2 into a dictionary, with the key being a (chromosome, position) pair (file 2 has a single position per line, so start and end are always the same there).
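A minimal sketch of that dictionary-building step, assuming file 2 is comma-separated, starts with a header line, and holds numeric values. The helper name `load_position_values` and the conversion to `float` are assumptions for illustration, not part of the original answer.

```python
import csv
import io


def load_position_values(lines):
    """Build {(chromosome, position): value} from file 2's CSV lines.

    Hypothetical helper: assumes a header row and numeric values.
    Keys are kept as strings, exactly as read from the file.
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    values = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        chromosome, position, value = row
        key = (chromosome, position)
        # Add up duplicates in case a position appears more than once.
        values[key] = values.get(key, 0.0) + float(value)
    return values


# Small in-memory stand-in for file 2; a real run would pass open("file2.csv").
sample_file2 = io.StringIO(
    "chromosome,position,value\n"
    "1,1,0.5\n"
    "1,2,1.5\n"
    "1,6,2.0\n"
    "1,13,4.0\n"
)
file2_dict = load_position_values(sample_file2)
print(file2_dict[("1", "6")])
```

Because any iterable of lines works, the same function can take an open file handle, so the file is streamed rather than read into a list first.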

Then read file 1 line by line, find all relevant values in your "file 2" dictionary, and append the resulting sum to the end of the line (probably in a new file):

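import numpy as np

# Assumes file1 is an open file object with the header line already skipped,
# and file2_dict maps (chromosome, position) string pairs to numeric values.
for line in file1:
    chromosome, start, end = line.strip().split(',')
    # The key must be a tuple (a list is unhashable); missing positions count as 0.
    total_value = np.sum([file2_dict.get((chromosome, str(i)), 0)
                          for i in range(int(start), int(end) + 1)])
    # Do something with the total value, maybe write it to another file.
    # Could do:
    new_line = ','.join([chromosome, start, end, str(total_value)]) + '\n'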

I'm going to leave the rest of the implementation details to you (such as building your dictionary from file 2). It might be a bit heavy on memory usage, but hopefully not too bad.
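If the per-position loop gets too slow at 10 million lines, one alternative (a different technique from the one in this answer) is to keep a sorted list of positions and running totals per chromosome, then use binary search to sum each region. This is only a sketch: it assumes numeric values and that file 2's rows fit in memory, and `build_index` and `region_total` are hypothetical names.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from itertools import accumulate


def build_index(rows):
    """rows: iterable of (chromosome, position, value) tuples.

    Returns {chromosome: (sorted_positions, prefix_sums)}, where
    prefix_sums[k] is the sum of the first k values (prefix_sums[0] == 0).
    """
    by_chrom = defaultdict(list)
    for chromosome, position, value in rows:
        by_chrom[chromosome].append((int(position), float(value)))
    index = {}
    for chromosome, pairs in by_chrom.items():
        pairs.sort()
        positions = [p for p, _ in pairs]
        prefix = [0.0] + list(accumulate(v for _, v in pairs))
        index[chromosome] = (positions, prefix)
    return index


def region_total(index, chromosome, start, end):
    """Sum of the values at positions in [start, end], both ends inclusive."""
    if chromosome not in index:
        return 0.0
    positions, prefix = index[chromosome]
    lo = bisect_left(positions, start)   # first position >= start
    hi = bisect_right(positions, end)    # one past the last position <= end
    return prefix[hi] - prefix[lo]


rows = [("1", 1, 0.5), ("1", 2, 1.5), ("1", 6, 2.0), ("1", 13, 4.0)]
index = build_index(rows)
print(region_total(index, "1", 1, 10))   # values at positions 1, 2 and 6
print(region_total(index, "1", 11, 20))  # value at position 13
```

Each region then costs two binary searches regardless of how wide it is, instead of one dictionary lookup per position inside it.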

Note the use of the .get() method for the dictionary lookup: this makes sure that any key not found in the dictionary returns 0. You decide whether this works for your situation. Also note the use of str and int to convert between text and numbers. You decide whether this is appropriate for your implementation.
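A small illustration of both points, using a toy dictionary whose keys are strings, as they would be if read straight from a text file (the values here are made up):

```python
# Toy dictionary with string keys, as they would come from a text file.
values = {("1", "1"): 0.5, ("1", "6"): 2.0}

present = values.get(("1", str(6)), 0)  # key exists, so its value is returned
missing = values.get(("1", str(7)), 0)  # key absent, so the default 0 is returned

# Without str(), the integer 6 does not match the string key "6",
# so the lookup silently falls back to the default.
mismatched = values.get(("1", 6), 0)

print(present, missing, mismatched)
```

The last lookup shows why the conversions matter: a type mismatch between keys does not raise an error with .get(), it just yields 0.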

Also, if you haven't come across Python list comprehensions before, do some research on them. They are what allows us to write the one-liner that sums all relevant values.
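For comparison, here is the comprehension next to an equivalent explicit loop. The sample values are made up, and the built-in `sum` stands in for `np.sum` to keep the sketch dependency-free.

```python
values = {("1", "1"): 0.5, ("1", "2"): 1.5, ("1", "6"): 2.0}
chromosome, start, end = "1", "1", "10"

# List comprehension: build the list of looked-up values, then sum it.
total_comprehension = sum(
    [values.get((chromosome, str(i)), 0) for i in range(int(start), int(end) + 1)]
)

# The same computation written as an explicit loop.
total_loop = 0
for i in range(int(start), int(end) + 1):
    total_loop += values.get((chromosome, str(i)), 0)

print(total_comprehension, total_loop)
```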


