
Python: Locate 3 adjacent list items and determine list index of first of them

I need to process weather station data which is in a format like this (SYNOP), where each line represents one measurement and I have thousands of measurements:

line = 'AAXX 01004 60265 32970 03404 10048 20010 38997 48605 51014='

Starting with the 6th block, the blocks are numbered (1xxxx 2xxxx 3xxxx etc, sometimes only 5 blocks but sometimes also more with additional data)

The crucial point is that the number of blocks between the AAXX and the 1xxxx block is not always the same, but I know that the data I need is 2 blocks before the 1xxxx block. To reliably pinpoint that block I would need to determine the position of the 1xxxx block and count backwards from there.

My idea is to split the line along spaces into a list, and then iterate through the list items to find the position in the list of the 1xxxx block.

list = line.split(' ')

But I don't know how to do this iteration. There must be a reasonably elegant way to look for 3 blocks where the first starts with 1, the second with 2, and the third with 3, then return the list index of the first block?

This may be very simple but I'm unable to figure it out, and would be grateful for any tips!

EDIT: To clarify, it's possible that another block starting with 1 appears before the one I need, so the only reliable way to pinpoint the block I need is to ensure that it is followed by one that starts with 2 and another that starts with 3 (that should reduce the chance of a false positive to pretty much 0).

There is more than one way to do this. One way would be to find the index of the first block that starts with 1 and subtract two:

list[next(i for i, j in enumerate(list) if j.startswith("1")) - 2]
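The expression above stops at the first block that starts with 1, so it can be fooled by an earlier block such as 1xxxx appearing before the numbered groups. A minimal sketch of the stricter check described in the question's edit (1, then 2, then 3 in adjacent blocks) might look like the following. The function name find_123_index and the variable names are made up for illustration, and the list is called blocks here to avoid shadowing the built-in list.

```python
def find_123_index(blocks):
    """Return the index of the first block starting with '1' that is
    immediately followed by blocks starting with '2' and '3', or None."""
    # zip over three shifted views of the list yields adjacent triples
    triples = zip(blocks, blocks[1:], blocks[2:])
    for i, (a, b, c) in enumerate(triples):
        if a.startswith("1") and b.startswith("2") and c.startswith("3"):
            return i
    return None


line = 'AAXX 01004 60265 32970 03404 10048 20010 38997 48605 51014='
blocks = line.split()
idx = find_123_index(blocks)
# Only count backwards if the pattern was found and two blocks precede it
wanted = blocks[idx - 2] if idx is not None and idx >= 2 else None
print(wanted)  # 32970
```

Returning None instead of raising lets the caller skip lines that do not contain the 1/2/3 pattern, which may be useful when looping over thousands of measurements.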

Another way would be to match a regex onto the (unsplit) string:

import re
re.search(r"\d{5}(?= \d{5} 1\d{4} 2\d{4} 3\d{4})", line)

This matches a block of five digits, as long as it's followed by 1xxxx 2xxxx 3xxxx where x is any digit.
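To get the block itself out of the match object, something like the sketch below should work. It assumes that every block involved consists of exactly five digits and that blocks are separated by single spaces. A block containing any non-digit character would not match. The names PATTERN and block_before_123 are hypothetical.

```python
import re

# Five digits, followed (without consuming them) by one more block
# and then blocks starting with 1, 2 and 3
PATTERN = re.compile(r"\d{5}(?= \d{5} 1\d{4} 2\d{4} 3\d{4})")


def block_before_123(line):
    """Return the block two positions before the 1xxxx block, or None."""
    match = PATTERN.search(line)
    return match.group(0) if match else None


line = 'AAXX 01004 60265 32970 03404 10048 20010 38997 48605 51014='
print(block_before_123(line))  # 32970
```

re.search returns None when nothing matches, so the helper checks for that before calling group.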

Iterating over a list is very simple:

l = line.split(' ')
for element in l:
  # element is now one of the strings from your list
  if element.startswith("1"):
    print("This block begins with 1")
