How to print the n-th elements of arrays with matching first items?

I am working in Python and I have the following data:

['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923']
['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189']
['DDX58_HUMAN', 'gnl|CDD|260076', '4', '93']
['DDX58_HUMAN', 'gnl|CDD|238005', '258', '410']
['DDX58_HUMAN', 'gnl|CDD|238034', '606', '741']
['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008']
['DICER_HUMAN', 'gnl|CDD|238333', '1681', '1846']
['DICER_HUMAN', 'gnl|CDD|238333', '1296', '1376']
['DICER_HUMAN', 'gnl|CDD|238333', '1547', '1583']
['DICER_HUMAN', 'gnl|CDD|251903', '630', '722']
['DICER_HUMAN', 'gnl|CDD|238005', '58', '209']
['DICER_HUMAN', 'gnl|CDD|238034', '444', '553']

For each distinct first item, I need to print the 2nd, 3rd, and 4th items of every list that shares it, all on one line, like this:

DDX58_HUMAN gnl|CDD|256537 819 923 gnl|CDD|260076 111 189 gnl|CDD|260076 4 93 gnl|CDD|238005 258 410 gnl|CDD|238034 606 741
DICER_HUMAN gnl|CDD|239209 886 1008 gnl|CDD|238333 1681 1846 gnl|CDD|238333 1296 1376 gnl|CDD|238333 1547 1583 gnl|CDD|251903 630 722 gnl|CDD|238005 58 209 gnl|CDD|238034 444 553

How can I achieve this?

Here is some sample code for what you want to do. I assume you have this data in Python lists. You can traverse each list and store its values in a dictionary keyed on the first element of the list; each unique first element then maps to all of the entries that share it.

mylist = [
    ['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923'],
    ['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189'],
    ['DDX58_HUMAN', 'gnl|CDD|260076', '4', '93'],
    ['DDX58_HUMAN', 'gnl|CDD|238005', '258', '410'],
    ['DDX58_HUMAN', 'gnl|CDD|238034', '606', '741'],
    ['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008'],
    ['DICER_HUMAN', 'gnl|CDD|238333', '1681', '1846'],
    ['DICER_HUMAN', 'gnl|CDD|238333', '1296', '1376'],
    ['DICER_HUMAN', 'gnl|CDD|238333', '1547', '1583'],
    ['DICER_HUMAN', 'gnl|CDD|251903', '630', '722'],
    ['DICER_HUMAN', 'gnl|CDD|238005', '58', '209'],
    ['DICER_HUMAN', 'gnl|CDD|238034', '444', '553'],
]

# Group items 2..4 of each list under the list's first item.
myDict = {}
for items in mylist:
    myDict.setdefault(items[0], []).append(" ".join(items[1:]))

# Print one line per key, followed by all of its grouped values.
for k, v in myDict.items():
    print(k, " : ", " ".join(v))

Output:

DDX58_HUMAN  :  gnl|CDD|256537 819 923 gnl|CDD|260076 111 189 gnl|CDD|260076 4 93 gnl|CDD|238005 258 410 gnl|CDD|238034 606 741
DICER_HUMAN  :  gnl|CDD|239209 886 1008 gnl|CDD|238333 1681 1846 gnl|CDD|238333 1296 1376 gnl|CDD|238333 1547 1583 gnl|CDD|251903 630 722 gnl|CDD|238005 58 209 gnl|CDD|238034 444 553
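The output above includes a " : " separator that the question's expected output does not have. One possible variation, shown here only as a minimal sketch, collects the individual fields with `collections.defaultdict` and joins the key and fields with single spaces. The helper name `group_rows` and the shortened `rows` sample are assumptions made for illustration, and the key order relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

def group_rows(rows):
    """Group items 2..n of each row under the row's first item."""
    grouped = defaultdict(list)
    for key, *rest in rows:
        grouped[key].extend(rest)
    # One output line per key: the key followed by every collected field.
    return [" ".join([key, *fields]) for key, fields in grouped.items()]

# Shortened sample of the data from the question.
rows = [
    ['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923'],
    ['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189'],
    ['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008'],
]

for line in group_rows(rows):
    print(line)
```

Storing the fields individually rather than as pre-joined strings keeps the grouped data usable for later processing, for example converting the positions to integers.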

If your data is in a .txt file, read the file, remove the unwanted square brackets with the re module, and then apply the same logic as above.

import re

# Read each line, strip the square brackets, and split on commas.
# Note: the quotes and the space after each comma are kept as-is.
mainList = []
with open("data.txt") as infile:
    for line in infile:
        dataString = re.sub(r"[\[\]]", "", line.rstrip()).split(",")
        mainList.append(dataString)

# Group the remaining fields under the first field of each line.
myDict = {}
for items in mainList:
    myDict.setdefault(items[0], []).append("".join(items[1:]))

for k, v in myDict.items():
    print(k, " : ", "".join(v))

Output:

'DICER_HUMAN'  :   'gnl|CDD|239209' '886' '1008' 'gnl|CDD|238333' '1681' '1846' 'gnl|CDD|238333' '1296' '1376' 'gnl|CDD|238333' '1547' '1583' 'gnl|CDD|251903' '630' '722' 'gnl|CDD|238005' '58' '209' 'gnl|CDD|238034' '444' '553'
'DDX58_HUMAN'  :   'gnl|CDD|256537' '819' '923' 'gnl|CDD|260076' '111' '189' 'gnl|CDD|260076' '4' '93' 'gnl|CDD|238005' '258' '410' 'gnl|CDD|238034' '606' '741'
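The quotes survive in the output above because the regex strips only the brackets. If every line of the file is a valid Python list literal, as in the question, `ast.literal_eval` from the standard library can parse each line into a real list of strings. The following is a rough sketch under that assumption; `parse_lines` and `format_groups` are hypothetical helper names, and an in-memory list of lines stands in for the contents of `data.txt`.

```python
import ast

def parse_lines(lines):
    """Parse text lines such as "['A', 'b', '1', '2']" into lists of strings."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            rows.append(ast.literal_eval(line))
    return rows

def format_groups(rows):
    """Return one 'key field field ...' string per distinct first item."""
    grouped = {}
    for key, *rest in rows:
        grouped.setdefault(key, []).extend(rest)
    return [" ".join([key, *fields]) for key, fields in grouped.items()]

# Stand-in for the lines of data.txt (assumed format: one list per line).
sample_lines = [
    "['DDX58_HUMAN', 'gnl|CDD|256537', '819', '923']\n",
    "['DDX58_HUMAN', 'gnl|CDD|260076', '111', '189']\n",
    "['DICER_HUMAN', 'gnl|CDD|239209', '886', '1008']\n",
]

for out in format_groups(parse_lines(sample_lines)):
    print(out)
```

With a real file, the same functions could be called as `format_groups(parse_lines(open_file))` inside a `with open("data.txt") as open_file:` block, since a file object yields its lines when iterated.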
