
Read specific lines from a file using Python

I have a file with data like:

   1xxy
   (1gmh)

[white line]
ahdkfkbbmhkkkkkyllllkkjdttyshhaggdtdyrrrutituy
[white line]  
   __________________________________________________
   Intra Chain:
   A 32
   __________________________________________________
   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
   ...
   __________________________________________________

Now I want to make it look like:

   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22
   ...

That is, remove everything else. I tried using:

inp = open('c:/users/rox/desktop/1UMG.out','r')
for line in inp:
    if not line.strip():      # to remove excess white lines
       continue
    else:
       z = line.strip().replace('\t',' ')
       if z.startswith('PAIR'):
          print z
inp.close()

But this code gives me no output. I can't figure out why z.startswith('PAIR') is not working; everything up to the previous line works fine.

It looks like you only want the lines that start with PAIR, so why not something simple like this:

with open('data.txt') as infp:
   for line in infp:
      line = line.strip()
      if line.startswith('PAIR'):
         print(line)

This gives:

PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32
PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12
PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22

This output drops the 3 leading spaces; it would be trivial to add them back if needed.
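If the original indentation matters, one option is to test a stripped copy of each line while keeping the untouched line for output. The following is only a minimal sketch: the helper name pair_lines_keep_indent and the in-memory SAMPLE (standing in for the real file) are made up for illustration.

```python
import io

# Hypothetical sample, shaped like the data in the question
SAMPLE = (
    "   Intra Chain:\n"
    "   A 32\n"
    "   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n"
    "   PAIR 1MNS UR 11 NM A ARG 33 OX2 3.21 12\n"
)

def pair_lines_keep_indent(lines, prefix="PAIR"):
    """Return lines whose stripped form starts with prefix, keeping leading spaces."""
    kept = []
    for line in lines:
        # Test the stripped copy, but store the line with its indentation intact
        if line.strip().startswith(prefix):
            kept.append(line.rstrip("\r\n"))
    return kept

# io.StringIO stands in for an open file here
for line in pair_lines_keep_indent(io.StringIO(SAMPLE)):
    print(line)
```

With a real file, the same function can be called on the object returned by open(), since it only needs an iterable of lines.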

Note: using with automatically closes the file for you when the block finishes or when an exception is raised.
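The closing behaviour can be checked through the file object's closed attribute. This is a small sketch; the temporary file exists only so that the example runs on its own.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n")
    path = tmp.name

with open(path) as infp:
    first_line = infp.readline().strip()
    open_inside = not infp.closed   # still open inside the block

closed_after = infp.closed          # closed once the block has exited

# The file is also closed when the block exits through an exception
try:
    with open(path) as infp2:
        raise ValueError("simulated failure")
except ValueError:
    closed_after_error = infp2.closed

os.remove(path)
print(open_inside, closed_after, closed_after_error)
```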

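In addition to @Levon's explanation: since file objects support the iterator protocol, and depending on the size of the file, you can use a list comprehension: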

[l.strip() for l in open('test.txt') if l.strip().startswith('PAIR')]
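File size matters because a list comprehension builds the whole result in memory. For large files, a generator that yields matches one at a time may be a better fit. This is a sketch, assuming the lines can carry leading whitespace as in the question's sample; iter_pair_lines is a name made up here.

```python
import io

def iter_pair_lines(lines, prefix="PAIR"):
    """Lazily yield stripped lines that start with prefix."""
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(prefix):
            yield stripped

# io.StringIO stands in for an open file; any iterable of lines works
sample = io.StringIO(
    "   A 32\n"
    "   PAIR 1MNS HE 10 NM A ARG 33 OX1 3.22 32\n"
    "\n"
    "   PAIR IMNS UK 32 NH A ASN 43 OZ1 5.21 22\n"
)
for line in iter_pair_lines(sample):
    print(line)
```

Passing the file object from a with open(...) block keeps memory use roughly constant regardless of file size, and the file is still closed when the block ends.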


 