
How to skip over lines in a list and write to a file based on index number? (Python)

I have a list of lines, each entry ending with a newline:

lines=[
 '  22414035.537   117786547.45218     -3116.294                                  \n',
 '                  22414038.860    22414035.186    87957488.29217     -2327.383  \n',
 '  20484531.215   107646935.64119     -1170.828                                  \n',
 '                  20484533.402    20484530.680    80385700.15618      -874.345  \n',
 '  22744037.211   119520718.50318      3083.940                                  \n',
 '                  22744039.645    22744037.355    89252483.05018      2302.858  \n']

I have another list:

Indx=[1]

Note: this list can contain multiple entries.

I want to iterate through the list two lines at a time. If the index of a pair of lines (counting pairs from 0) equals a value in Indx, I want to do nothing. If the index of the pair does not equal any value in Indx, I want to write that line and the line after it to a new file.

For example, in this case the new file would contain:

   22414035.537   117786547.45218     -3116.294                                  
                   22414038.860    22414035.186    87957488.29217     -2327.383  
   22744037.211   119520718.50318      3083.940                                  
                   22744039.645    22744037.355    89252483.05018      2302.858  

The issue I am having at the moment is that I cannot skip to the next line in the list. Furthermore, my code adds lines to the file even when the count does equal a value from the Indx list.

Here's my code:

EditedRinexObs=open("H:\Uni Work\EditedRinexObs.16o", "a")
for line in lines:
    if ('g') not in line:
        count=(0)
        it=iter(lines)
        for x in indx:
            if count != x:
                EditedRinexObs.writelines(line)
                EditedRinexObs.writelines("\n")
                it.next()
                EditedRinexObs.writelines(line)
                EditedRinexObs.writelines("\n")
                it.next()
                count=count+1
             elif count == x:
                it.next()
                it.next()
                count=count+1
EditedRinexObs.close()   

I hope that makes sense. I'm not really sure what's going on, and I couldn't find the answer in other questions.

If you only want to count non-overlapping pairs, create a set of the indexes and zip the lines so that every two lines are grouped together, yielding only the pairs whose index is not in the set:

def pairs(l,inds):
    it = iter(l)
    st = set(inds)
    for ind, (a, b) in enumerate(zip(it, it)):
        if ind not in st:
            yield a,b

print(list(pairs(lines, Indx)))

Which would give you:

[('  22414035.537   117786547.45218     -3116.294                                  \n',
  '                  22414038.860    22414035.186    87957488.29217     -2327.383  \n'),
 ('  22744037.211   119520718.50318      3083.940                                  \n',
  '                  22744039.645    22744037.355    89252483.05018      2302.858  \n')]
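Since the question asks for the kept lines to end up in a new file, here is a minimal sketch of how the generator above could feed a file instead of print. The helper name write_pairs, the sample data, and the temporary output path are all assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile


def pairs(lines, skip):
    """Yield (first, second) for each non-overlapping pair whose index is not in skip."""
    it = iter(lines)
    skip = set(skip)
    # zip(it, it) pulls two consecutive items from the same iterator each time.
    for ind, pair in enumerate(zip(it, it)):
        if ind not in skip:
            yield pair


def write_pairs(path, lines, skip):
    """Write every kept pair to path; the lines already end with a newline."""
    with open(path, "w") as out:
        for first, second in pairs(lines, skip):
            out.write(first)
            out.write(second)


# Hypothetical sample data and output location, for illustration only.
sample = ["a1\n", "a2\n", "b1\n", "b2\n", "c1\n", "c2\n"]
out_path = os.path.join(tempfile.mkdtemp(), "edited.txt")
write_pairs(out_path, sample, [1])

with open(out_path) as f:
    written = f.read()
print(written)
```

The sketch opens the file in "w" mode, so each run starts from an empty file. The code in the question uses "a", which keeps appending to whatever earlier runs left behind, so pick whichever matches your workflow.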

If you want to consider overlapping pairs, you could use the pairwise recipe from the itertools documentation, but then you can get duplicates, so you need to decide what you want to do in that case. It would be something like:

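from itertools import tee, count

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy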

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)




def pairs(l, inds):
    st = set(inds)
    pr, cn = pairwise(l), count(0)
    prev = 0
    for a, b in pr:
        i = next(cn)
        if i not in st:
            if i - 1 != prev:
                yield a
            yield b
            next(pr, None)  # skip the following overlapping pair; the default avoids StopIteration at the end
            prev = i
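To make the remark about duplicates concrete, here is a small sketch of what overlapping pairs look like. It uses the built-in zip, so it assumes Python 3 for lazy evaluation, and the short labels are hypothetical stand-ins for the data lines.

```python
from itertools import tee


def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)


# Hypothetical short labels standing in for the data lines.
labels = ["L0", "L1", "L2", "L3"]
overlapping = list(pairwise(labels))
print(overlapping)

# Writing out both members of every pair would repeat each interior item.
flat = [item for pair in overlapping for item in pair]
print(flat)
```

Every interior item appears in two pairs, which is why the overlapping version has to track what was already emitted.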

Here's a fairly simple way that doesn't need iter, although I must confess I like Padraic's solution. :)

I'll write to stdout to keep things simple. I've renamed your Indx to indx to conform to the usual Python conventions: simple variable names should begin with a lower-case letter, while names that begin with an upper-case letter are used for classes. I've also turned it into a set, since testing membership of a set is generally faster than testing membership of a list, although for very small collections a list may be just as fast or faster.

import sys

lines = [
    '  22414035.537   117786547.45218     -3116.294                                  \n',
    '                  22414038.860    22414035.186    87957488.29217     -2327.383  \n',
    '  20484531.215   107646935.64119     -1170.828                                  \n',
    '                  20484533.402    20484530.680    80385700.15618      -874.345  \n',
    '  22744037.211   119520718.50318      3083.940                                  \n',
    '                  22744039.645    22744037.355    89252483.05018      2302.858  \n'
]

indx = set([1])

out = sys.stdout

for i in range(len(lines) // 2):
    if i not in indx:
        out.write(lines[2*i])
        out.write(lines[2*i + 1])            

Output:

  22414035.537   117786547.45218     -3116.294                                  
                  22414038.860    22414035.186    87957488.29217     -2327.383  
  22744037.211   119520718.50318      3083.940                                  
                  22744039.645    22744037.355    89252483.05018      2302.858  
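The same loop can be wrapped in a function that accepts any writable file-like object, which makes it easy to switch between stdout, a real file, and an in-memory buffer. This is only a sketch: the function name write_kept_pairs and the sample data are assumptions for illustration. It also shows two details of the index-based approach: several pair indexes can be skipped at once, and a trailing line without a partner is ignored because len(lines) // 2 rounds down.

```python
import io


def write_kept_pairs(lines, skip, out):
    """Write lines[2*i] and lines[2*i + 1] to out for each pair index i not in skip."""
    skip = set(skip)
    for i in range(len(lines) // 2):
        if i not in skip:
            out.write(lines[2 * i])
            out.write(lines[2 * i + 1])


# Hypothetical sample data: three full pairs plus one unpaired trailing line.
sample = ["a1\n", "a2\n", "b1\n", "b2\n", "c1\n", "c2\n", "dangling\n"]

buf = io.StringIO()
write_kept_pairs(sample, [0, 2], buf)
result = buf.getvalue()
print(result)
```

To write to disk, pass an open file object in place of the buffer, for example one obtained from a with open(...) block.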
