How to print the last value of an iteration in a for loop

Question

I am trying to print the date intervals for which I have no corresponding data. For example, I want to be able to say that I don't have data recorded from 2008/04/28 22:00 to 2008/04/29 00:00, from 2008/10/06 09:45 to 2008/10/06 10:15, etc.

Here is a part of my file:

023004         2008/04/28 22:00                   AR

023004         2008/04/28 22:15                   AR

023004         2008/04/28 22:30                   AR

023004         2008/04/28 22:45                   AR

023004         2008/04/28 23:00                   AR

023004         2008/04/28 23:15                   AR

023004         2008/04/28 23:30                   AR

023004         2008/04/28 23:45                   AR

023004         2008/04/29 00:00    49.37

023004         2008/04/29 00:15    51.41

023004         2008/04/29 00:30    50.96

023004         2008/04/29 00:45    53.73

023004         2008/10/06 09:15    2.587 

023004         2008/10/06 09:30    2.587 

023004         2008/10/06 09:45    2.587 

023004         2008/10/06 10:00                   A

023004         2008/10/06 10:15    2.624

023004         2008/10/06 10:30    2.624

023004         2008/10/06 10:45    2.643

023004         2008/10/06 11:00    2.662

023004         2008/10/06 11:15    2.680

023004         2008/10/06 11:30                   A

023004         2008/10/06 11:45                   A

023004         2008/10/06 12:00                   A

023004         2008/10/06 12:15                   A

023004         2008/10/06 12:30                   A

I tried this code:

fich = "test1.txt"

with open(fich) as f:
    for line in f:
        fields = line.split()  # station, date, time, value or flag
        if not fields[3].isalpha():
            print("valeur")
        else:
            print("Pas de valeur de precipitation du", fields[1], "a", fields[2], "h ", "au", fields[1], fields[2], "h ")

But it does not give me the intervals I am looking for. It only tells me, line by line, whether I have data or not.

I want to be able to print the first and last value of each missing data interval.

Answer 1

This approach gives you all of the ranges for which there is no data, assuming a constant 15-minute step between data points. It selects the timestamps that have no data, then groups them into chunks: consecutive missing timestamps that are 15 minutes apart go into the same chunk, and any other gap starts a new chunk.

I copied and pasted your sample text into Excel and saved it as .csv (the code expects columns named date and val), so this should work with minimal alteration, if any:

import pandas as pd
delta = pd.Timedelta(minutes=15) #define time step
df = pd.read_csv('test.csv',header=0) #read in the data
df['date']=pd.to_datetime(df['date']) #convert the date column to datetime
df = df[pd.notnull(df['date'])] #drop all rows (spaces) with nothing in them
df = df.reset_index(drop=True) #renumber the index

missing_dates=df[df['val'].isnull()]['date'] #dates with no data associated with them
diffs = missing_dates.diff() #difference between missing dates
ranges=[] 
tmp=[]
for i in diffs.index: #loop through the differences
    if pd.isnull(diffs.loc[i]): #first difference always NaT because nothing before it
        tmp.append(missing_dates.loc[i]) #add to temp list
    elif diffs.loc[i] == delta: #if difference is delta, then it is in same chunk as previous data point
        tmp.append(missing_dates.loc[i]) #add to tmp list
    else: #once you reach a data point that is in the next chunk
        ranges.append(tmp) #append temp list to ranges of missing data
        tmp=[] #re-initialize the temp list
        tmp.append(missing_dates.loc[i]) #append value to first position of the list representing the next chunk

ranges.append(tmp)

This gives you a list of lists, where each inner list contains consecutive timestamps that have no data and are spaced one time step apart.
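As an alternative to the explicit loop, the same chunking can be sketched with a cumulative sum over the differences. This is only a sketch under the same 15-minute assumption: the sample timestamps are a small subset retyped from the question, and chunk_id and bounds are names introduced here for illustration.

```python
import pandas as pd

delta = pd.Timedelta(minutes=15)  # assumed constant time step

# A few of the missing timestamps from the question, for illustration only.
missing_dates = pd.Series(pd.to_datetime([
    "2008-04-28 23:30", "2008-04-28 23:45",
    "2008-10-06 10:00",
    "2008-10-06 11:30", "2008-10-06 11:45",
]))

# A new chunk starts wherever the gap to the previous missing timestamp
# is not exactly one step (the first row always starts a chunk).
chunk_id = (missing_dates.diff() != delta).cumsum()

# First and last timestamp of each chunk.
bounds = missing_dates.groupby(chunk_id).agg(["first", "last"])

for start, end in bounds.itertuples(index=False):
    print(f"No data from {start} to {end}")
```

This yields only the first and last timestamp of each chunk, which is all the question asks for, instead of the full list of timestamps in between.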

It will not, however, include the timestamps immediately before and after each run of missing data.
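If the neighbouring timestamps are wanted (for example 09:45 to 10:15 around the single missing row at 10:00), one possible sketch is to widen each chunk by one step on both sides. The helper name widen is hypothetical, and the sketch assumes that a row really exists one step before and after each chunk, which will not hold at the very start or end of the file. It uses the standard library's datetime and timedelta here, but the same arithmetic works with pandas Timestamp and Timedelta objects.

```python
from datetime import datetime, timedelta

step = timedelta(minutes=15)  # assumed constant sampling interval


def widen(ranges, step):
    """Return (timestamp before the gap, timestamp after the gap) per chunk."""
    return [(chunk[0] - step, chunk[-1] + step) for chunk in ranges]


# Two chunks of missing timestamps, retyped from the question for illustration.
ranges = [
    [datetime(2008, 10, 6, 10, 0)],
    [datetime(2008, 10, 6, 11, 30), datetime(2008, 10, 6, 11, 45)],
]

for before, after in widen(ranges, step):
    print("No data between", before, "and", after)
```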

To print the ranges:

for r in ranges:
    print('No data between '+str(r[0])+' to '+str(r[-1]))

Output:

No data between 2008-04-28 22:00:00 to 2008-04-28 23:45:00
No data between 2008-10-06 10:00:00 to 2008-10-06 10:00:00
No data between 2008-10-06 11:30:00 to 2008-10-06 12:30:00

This is probably not the best approach out there, but it will hopefully point you in a helpful direction.
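For comparison, here is a minimal sketch that works directly on the whitespace-separated lines from the question, without Excel or pandas. It assumes, as the question's own code does, that the fourth field is either a number or a letter flag such as AR or A, and that rows are 15 minutes apart. The function name missing_intervals and the sample list are illustrative only.

```python
from datetime import datetime, timedelta


def missing_intervals(lines, step=timedelta(minutes=15)):
    """Return (first, last) timestamps of each run of rows without a value."""
    intervals = []
    start = prev = None  # bounds of the run currently being built
    for line in lines:
        parts = line.split()
        if len(parts) < 4:  # skip blank or incomplete lines
            continue
        stamp = datetime.strptime(parts[1] + " " + parts[2], "%Y/%m/%d %H:%M")
        is_missing = parts[3].isalpha()  # a letter flag means no measurement
        if is_missing and start is not None and stamp - prev == step:
            prev = stamp  # extend the current run
        else:
            if start is not None:  # close the run that just ended
                intervals.append((start, prev))
                start = prev = None
            if is_missing:  # open a new run
                start = prev = stamp
    if start is not None:  # the file may end inside a run
        intervals.append((start, prev))
    return intervals


# A few rows retyped from the question, for illustration only.
sample = [
    "023004         2008/10/06 09:45    2.587",
    "023004         2008/10/06 10:00                   A",
    "023004         2008/10/06 10:15    2.624",
    "023004         2008/10/06 11:30                   A",
    "023004         2008/10/06 11:45                   A",
]

for first, last in missing_intervals(sample):
    print("No data from", first, "to", last)
```

Since the function only iterates over lines, an open file object such as the question's test1.txt could be passed in place of the sample list.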

Answer 1
0 · accepted · 2019-08-16 20:47:29