Sort a list of lists to get unique IDs in the last column

Question

I have this data saved in a file:

['5',60680,60854,'gene_id "ENS1"']
['5',59106,89211,'gene_id "ENS1"']
['5',58686,58765,'gene_id "ENS1"']
['5',80835,93381,'gene_id "ENS2"']
['5',55555,92223,'gene_id "ENS2"']
['5',73902,74276,'gene_id "ENS2"']

I need help with Python to produce output in which each item in the 4th column appears only once, paired with the minimum value of the second column and the maximum value of the third column among the rows that share that item. So I want my output to look like this:

['5',58686,89211,'gene_id "ENS1"']
['5',55555,93381,'gene_id "ENS2"']

Each item in the 4th column should only appear once. How can I also get rid of the [] around the data? Thank you.

Answer 1

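>>> from itertools import groupby
>>> # lst holds the six rows from the question. groupby only merges adjacent
>>> # rows, so they must already be grouped (or sorted) by the 4th column.
>>> for i, j in groupby(lst, key=lambda x: x[3]):
...     t = list(zip(*j))  # transpose the group's rows into columns
...     print(t[0][0], min(t[1]), max(t[2]), t[3][0])
...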
5 58686 89211 gene_id "ENS1"
5 55555 93381 gene_id "ENS2"

It's not clear what you mean by getting rid of []; the brackets are just Python's syntax for lists.
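
If the rows are stored in the file exactly as shown in the question, one possible reading of "getting rid of []" is to parse each line back into a list and print its fields joined by a separator. The following is only a minimal sketch under that assumption: it assumes every line is a valid Python list literal, the helper names parse_rows, merge_by_id and format_row are made up for illustration, and the tab separator is an arbitrary choice. It also sorts by the 4th column before grouping, so rows for the same ID do not have to be adjacent.

```python
import ast
from itertools import groupby
from operator import itemgetter


def parse_rows(lines):
    # Assumption: each non-empty line is a Python list literal.
    return [ast.literal_eval(line.strip()) for line in lines if line.strip()]


def merge_by_id(rows):
    # One (col1, min of col2, max of col3, id) tuple per distinct 4th-column value.
    merged = []
    by_id = itemgetter(3)
    # groupby only merges adjacent rows, so sort on the ID column first.
    for gene_id, group in groupby(sorted(rows, key=by_id), key=by_id):
        cols = list(zip(*group))  # transpose the group's rows into columns
        merged.append((cols[0][0], min(cols[1]), max(cols[2]), gene_id))
    return merged


def format_row(row):
    # Joining the fields as strings drops the list brackets and quotes.
    return "\t".join(str(field) for field in row)


sample = """\
['5',60680,60854,'gene_id "ENS1"']
['5',59106,89211,'gene_id "ENS1"']
['5',58686,58765,'gene_id "ENS1"']
['5',80835,93381,'gene_id "ENS2"']
['5',55555,92223,'gene_id "ENS2"']
['5',73902,74276,'gene_id "ENS2"']
"""

for row in merge_by_id(parse_rows(sample.splitlines())):
    print(format_row(row))
```

To read from a real file instead of the sample string, the lines of an open file object can be passed to parse_rows directly.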

Answer 2

import re
# Capture the 2nd, 3rd and 4th fields of each row (raw string avoids escape warnings)
pat = re.compile(r"\['[^']+',([^,]+),([^,]+),'([^']+)'\]")

ch = '''
['5',60680,60854,'gene_id "ENS1"']
['5',59106,89211,'gene_id "ENS1"']
['5',58686,58765,'gene_id "ENS1"']
['5',80835,93381,'gene_id "ENS2"']
['5',55555,92223,'gene_id "ENS2"']
['5',73902,74276,'gene_id "ENS2"']'''

li = pat.findall(ch)
print(li)

deekmin = {}
deekmax = {}
for a, b, c in li:  # iterate over every row, including the first
    a, b = int(a), int(b)  # compare as numbers, not as strings
    if c in deekmin:
        if a < deekmin[c]:
            deekmin[c] = a
        if b > deekmax[c]:
            deekmax[c] = b
    else:
        deekmin[c] = a
        deekmax[c] = b

res = [(deekmin[c], deekmax[c], c) for c in deekmin]
print(res)
