The lines of my text file are:
<< end of ENERGY.
iupac_m_486_> OE1/2 will be swapped: -136.1396 1 1
openf___224_> Open Dominio1.BL00100001.pdb
wrpdb___568_> Residues, atoms, selected atoms: 268 2115 2115
>> Summary of successfully produced loop models:
Filename molpdf
----------------------------------------
Dominio1.BL00010001.pdb 24.69530
Dominio1.BL00020001.pdb 14.33748
Dominio1.BL00030001.pdb 30.53454
Dominio1.BL00040001.pdb 23.82516
Dominio1.BL00050001.pdb 27.48684
Dominio1.BL00060001.pdb 18.17364
Dominio1.BL00070001.pdb 30.98407
Dominio1.BL00080001.pdb 17.19927
Dominio1.BL00090001.pdb 19.02460
Dominio1.BL00100001.pdb 22.57086
I want to write code that looks at the numbered lines (the last 10 lines), finds the one with the smallest number, and reads the name of the .pdb from it (just the first 24 characters of that line). I need to identify which .pdb has the smallest number and use its name as a string in another script, like this:
model='%s'%R
where '%s'%R is the name of the .pdb that I need.
How can I do it?
You need to use the min function with a proper key:
>>> min(s.split('\n\n'),key=lambda x:float(x.split()[-1])).split()[0]
'Dominio1.BL00020001.pdb'
Demo:
>>> s="""Dominio1.BL00010001.pdb 24.69530
...
... Dominio1.BL00020001.pdb 14.33748
...
... Dominio1.BL00030001.pdb 30.53454
...
... Dominio1.BL00040001.pdb 23.82516
...
... Dominio1.BL00050001.pdb 27.48684
...
... Dominio1.BL00060001.pdb 18.17364
...
... Dominio1.BL00070001.pdb 30.98407
...
... Dominio1.BL00080001.pdb 17.19927
...
... Dominio1.BL00090001.pdb 19.02460
...
... Dominio1.BL00100001.pdb 22.57086"""
>>> min(s.split('\n\n'),key=lambda x:float(x.split()[-1]))
'Dominio1.BL00020001.pdb 14.33748'
>>> min(s.split('\n\n'),key=lambda x:float(x.split()[-1])).split()[0]
'Dominio1.BL00020001.pdb'
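The demo splits on '\n\n' because its sample string has a blank line between entries. If the data instead has one entry per line, a variant along these lines should work; it is only a sketch, and the sample string and the names entries, best_line and best_name are made up for illustration:

```python
# Sample data with one entry per line and a stray blank line (assumed layout).
s = """Dominio1.BL00010001.pdb 24.69530
Dominio1.BL00020001.pdb 14.33748

Dominio1.BL00030001.pdb 30.53454
"""

# Keep only non-blank lines, then pick the one whose last field is smallest.
entries = [line for line in s.splitlines() if line.strip()]
best_line = min(entries, key=lambda x: float(x.split()[-1]))

# The file name is the first whitespace-separated field of that line.
best_name = best_line.split()[0]
print(best_name)
```

Filtering on line.strip() means the same code handles input with or without blank separator lines.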
A normal file read operation will do:
with open('file.txt') as f:  # 'file.txt' is a placeholder for your data file
    data = f.readlines()
pdb_files = []
float_values = []
for line in data:
    if not line.strip():
        continue  # skip blank lines
    pdb, float_value = line.split()
    pdb_files.append(pdb)
    float_values.append(float(float_value))
# position of the smallest value; the matching name sits at the same position
min_float_index = float_values.index(min(float_values))
print(pdb_files[min_float_index])
This code stores the data in two parallel lists, finds the index of the smallest float value, and then prints the pdb filename stored at that index.
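The same idea can be written with a single list of (value, name) tuples, since min compares tuples element by element and so orders them by value first. This is a sketch rather than a drop-in replacement; the function name smallest_entry and the sample lines are assumptions for illustration:

```python
def smallest_entry(lines):
    """Return (value, name) for the line with the smallest value."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields
        # Value goes first so that tuple comparison orders by it.
        pairs.append((float(value), name))
    return min(pairs)


lines = [
    "Dominio1.BL00010001.pdb 24.69530",
    "Dominio1.BL00020001.pdb 14.33748",
    "",
    "Dominio1.BL00030001.pdb 30.53454",
]
value, name = smallest_entry(lines)
print(name)
```

Keeping each name next to its value avoids the index lookup between two lists. If two entries tie on value, the tuple comparison falls back to the name.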
Try this:
def get_minimal_value_entry(file_name):
    with open(file_name, 'r') as f:
        # the value of a line is the second member of 'split' result
        key = lambda x: float(x.strip().split()[1])
        return min(f, key=key).split()[0]
# 'test' file holds the data...
print(get_minimal_value_entry('test'))
# prints Dominio1.BL00020001.pdb
If you have empty lines, use itertools.ifilter to filter them out (ifilter exists only in Python 2; in Python 3 the built-in filter is lazy and can be used the same way):
from itertools import ifilter
def get_minimal_value_entry(file_name):
    with open(file_name, 'r') as f:
        # the value of a line is the second member of 'split' result
        key = lambda x: float(x.strip().split()[1])
        # x.split() is an empty list (falsy) for blank lines, so they are dropped
        return min(ifilter(lambda x: x.split(), f), key=key).split()[0]
# 'test' file holds the data...
print(get_minimal_value_entry('test'))
# prints Dominio1.BL00020001.pdb
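The functions above assume every non-empty line is a name/value pair. The log in the question also contains status and header lines, so one possible approach is to keep only lines shaped like "name.pdb number". The following is a rough sketch under that assumption; parse_entry and best_model are hypothetical helper names, and LOG is a shortened copy of the log from the question:

```python
LOG = """<< end of ENERGY.
iupac_m_486_> OE1/2 will be swapped: -136.1396 1 1
openf___224_> Open Dominio1.BL00100001.pdb
wrpdb___568_> Residues, atoms, selected atoms: 268 2115 2115
>> Summary of successfully produced loop models:
Filename molpdf
----------------------------------------
Dominio1.BL00010001.pdb 24.69530
Dominio1.BL00020001.pdb 14.33748
Dominio1.BL00030001.pdb 30.53454
"""


def parse_entry(line):
    """Return (name, value) for a 'name.pdb number' line, else None."""
    fields = line.split()
    # Summary rows have exactly two fields and the first ends in .pdb.
    if len(fields) != 2 or not fields[0].endswith('.pdb'):
        return None
    try:
        return fields[0], float(fields[1])
    except ValueError:
        return None


def best_model(text):
    """Return the .pdb name with the smallest value found in text."""
    entries = [e for e in map(parse_entry, text.splitlines()) if e is not None]
    return min(entries, key=lambda e: e[1])[0]


R = best_model(LOG)
model = '%s' % R  # same string formatting as in the question
print(model)
```

Checking both the field count and the .pdb suffix matters here: the wrpdb line also ends in a number, so testing only whether the last field parses as a float would wrongly accept it.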
I'd use Python's re module.
file.txt
Dominio1.BL00010001.pdb 24.69530
Dominio1.BL00020001.pdb 14.33748
Dominio1.BL00030001.pdb 30.53454
Dominio1.BL00040001.pdb 23.82516
Dominio1.BL00050001.pdb 27.48684
Dominio1.BL00060001.pdb 18.17364
Dominio1.BL00070001.pdb 30.98407
Dominio1.BL00080001.pdb 17.19927
Dominio1.BL00090001.pdb 19.02460
Dominio1.BL00100001.pdb 22.57086
sorts.py
import re
lines = open('file.txt').readlines() # readlines
lines = [i.strip() for i in lines] # remove newlines
lines = [re.sub(r'\s+', ' ', i) for i in lines] # collapse runs of whitespace
lines = [i.split(' ') for i in lines] # split by space
lines = [i for i in lines if i != ['']] # remove empty lines
lines = sorted(lines, key = lambda i: float(i[1])) # sort by the numeric value
print(lines[0][0]) # print the name with the smallest value
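The strip, substitute, split and filter steps could also be folded into one regular expression with capture groups. This is only a sketch: the pattern and the sample text are assumptions based on the file.txt layout shown above, not part of the original script.

```python
import re

# Sample text with uneven spacing and a blank line (assumed layout).
text = """Dominio1.BL00010001.pdb   24.69530

Dominio1.BL00020001.pdb 14.33748
Dominio1.BL00030001.pdb    30.53454
"""

# Group 1: a name ending in .pdb; group 2: an integer or decimal number.
# [ \t] is used instead of \s so that a match never spans two lines.
pattern = re.compile(
    r'^[ \t]*(\S+\.pdb)[ \t]+(-?\d+(?:\.\d+)?)[ \t]*$',
    re.MULTILINE,
)

# findall returns one (name, value) tuple of strings per matching line.
matches = pattern.findall(text)

# min avoids sorting the whole list when only the smallest entry is needed.
best_name, best_value = min(matches, key=lambda m: float(m[1]))
print(best_name)
```

Lines that do not match the pattern, including blank ones, are ignored without a separate filtering step.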