How to print the n-th elements of arrays with matching first items?
I am working in Python and I have the following data:
['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923']
['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189']
['DDX58_HUMAN', 'gnl|CDD|260076', '4', '93']
['DDX58_HUMAN', 'gnl|CDD|238005', '258', '410']
['DDX58_HUMAN', 'gnl|CDD|238034', '606', '741']
['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008']
['DICER_HUMAN', 'gnl|CDD|238333', '1681', '1846']
['DICER_HUMAN', 'gnl|CDD|238333', '1296', '1376']
['DICER_HUMAN', 'gnl|CDD|238333', '1547', '1583']
['DICER_HUMAN', 'gnl|CDD|251903', '630', '722']
['DICER_HUMAN', 'gnl|CDD|238005', '58', '209']
['DICER_HUMAN', 'gnl|CDD|238034', '444', '553']
For each distinct first item, I need to print the 2nd, 3rd, and 4th items of every row that shares it, like this:
DDX58_HUMAN gnl|CDD|256537 819 923 gnl|CDD|260076 111 189 gnl|CDD|260076 4
93 gnl|CDD|238005 258 410 gnl|CDD|238034 606 741
DICER_HUMAN gnl|CDD|239209 886 1008 gnl|CDD|238333 1681 1846 gnl|CDD|238333
1296 1376 gnl|CDD|238333 1547 1583 gnl|CDD|251903 630 722 gnl|CDD|238005 58
209 gnl|CDD|238034 444 553
How can I achieve this?
Here is sample code for what you want to do. I assume the data is already in Python lists: traverse each list and store its values in a dictionary keyed on the list's first element, so that each unique first item collects all of its entries.
mylist = [['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923']
,['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189']
,['DDX58_HUMAN', 'gnl|CDD|260076', '4', '93']
,['DDX58_HUMAN', 'gnl|CDD|238005', '258', '410']
,['DDX58_HUMAN', 'gnl|CDD|238034', '606', '741']
,['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008']
,['DICER_HUMAN', 'gnl|CDD|238333', '1681', '1846']
,['DICER_HUMAN', 'gnl|CDD|238333', '1296', '1376']
,['DICER_HUMAN', 'gnl|CDD|238333', '1547', '1583']
,['DICER_HUMAN', 'gnl|CDD|251903', '630', '722']
,['DICER_HUMAN', 'gnl|CDD|238005', '58', '209']
,['DICER_HUMAN', 'gnl|CDD|238034', '444', '553']]
myDict = {}
for items in mylist:
    # Group the 2nd to 4th items of each row under the row's first item
    myDict.setdefault(items[0], []).append(" ".join(items[1:]))
for k, v in myDict.items():
    print(k, ":", " ".join(v))
Output:
DDX58_HUMAN : gnl|CDD|256537 819 923 gnl|CDD|260076 111 189 gnl|CDD|260076 4 93 gnl|CDD|238005 258 410 gnl|CDD|238034 606 741
DICER_HUMAN : gnl|CDD|239209 886 1008 gnl|CDD|238333 1681 1846 gnl|CDD|238333 1296 1376 gnl|CDD|238333 1547 1583 gnl|CDD|251903 630 722 gnl|CDD|238005 58 209 gnl|CDD|238034 444 553
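A variant of the same grouping idea, as a minimal sketch: it uses collections.defaultdict instead of setdefault and prints each group without the " : " separator, which is closer to the format shown in the question. The function name group_rows and the shortened sample data are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict

def group_rows(rows):
    """Map each first item to the flattened remaining items of its rows."""
    grouped = defaultdict(list)
    for key, *rest in rows:
        # extend() keeps one flat list of items per key
        grouped[key].extend(rest)
    return dict(grouped)

# Shortened sample taken from the data in the question
rows = [
    ['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923'],
    ['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189'],
    ['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008'],
]

for key, values in group_rows(rows).items():
    # print() separates its arguments with single spaces
    print(key, *values)
```

Keeping the items in a flat list per key, rather than pre-joined strings, leaves the choice of separator to the point where the data is printed.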
If your data is in a .txt file: read the text file, remove the unwanted square brackets using the re module, and then the same logic as above will work.
import re
mainList = []
with open("data.txt") as infile:
    for line in infile:
        # Strip the square brackets, then split the line on commas
        fields = re.sub(r"[\[\]]", "", line.rstrip()).split(",")
        # Drop the whitespace around each field (the quotes are kept)
        mainList.append([field.strip() for field in fields])
myDict = {}
for items in mainList:
    # Group the 2nd to 4th items of each row under the row's first item
    myDict.setdefault(items[0], []).append(" ".join(items[1:]))
for k, v in myDict.items():
    print(k, ":", " ".join(v))
Output:
'DICER_HUMAN' : 'gnl|CDD|239209' '886' '1008' 'gnl|CDD|238333' '1681' '1846' 'gnl|CDD|238333' '1296' '1376' 'gnl|CDD|238333' '1547' '1583' 'gnl|CDD|251903' '630' '722' 'gnl|CDD|238005' '58' '209' 'gnl|CDD|238034' '444' '553'
'DDX58_HUMAN' : 'gnl|CDD|256537' '819' '923' 'gnl|CDD|260076' '111' '189' 'gnl|CDD|260076' '4' '93' 'gnl|CDD|238005' '258' '410' 'gnl|CDD|238034' '606' '741'
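The quotes remain in that output because the brackets are removed with a regular expression while each field is still raw text. If every line of the file is a valid Python list literal, exactly as shown in the question (an assumption about the file format), ast.literal_eval can parse the line into a real list instead. A minimal sketch, where the function name group_lines is hypothetical:

```python
import ast

def group_lines(lines):
    """Parse list-literal lines and group items 2..n by the first item."""
    grouped = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only evaluates Python literals, never arbitrary code
        key, *rest = ast.literal_eval(line)
        grouped.setdefault(key, []).extend(rest)
    return grouped

# Stand-in for the lines of the text file (assumed format)
sample = [
    "['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923']\n",
    "['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189']\n",
    "['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008']\n",
]

for key, values in group_lines(sample).items():
    print(key, *values)
```

An open file can be passed in the same way, for example with open("data.txt") as handle: followed by group_lines(handle). On Python 3.7 and later, dictionaries preserve insertion order, so the keys print in the order they first appear in the file.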