How to remove transients in time-series data in Python (or Pandas)?

I have a time-series data set recording the flow and temperature of a heat pump. For the first few minutes after the system kicks on, the flows and temperatures aren't fully developed, and I'd like to filter those readings out.

Time (min)  Flow      Supply T    Return T
...
45          0         0           0
46          0         0           0
47          1.338375  92.711328   78.72152
48          2.267975  82.578552   74.239624
49          0.778125  96.073136   74.288664
50          0.778125  101.3998    74.686288
51          0.7885    102.1189    74.490528
...

For instance, I'd like to exclude the first 3 minutes of operation (minutes 47-49 above) from any calculations. I can do that with a loop, but the data set is very large (a >200 MB text file) and looping through it takes a really long time. I was wondering if there's a more efficient way to pull those rows out, perhaps using Pandas?

Any help or advice is appreciated! Thanks in advance!!

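Please try the following; I think it should work. It keeps only the rows where the Flow value three rows earlier (row n-3) is neither 0 nor NaN, so the first three rows after the flow starts are dropped along with the zero-flow rows before them. This assumes that Flow is exactly 0 whenever there is no flow: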

In [12]: df[(df.Flow.shift(3) != 0) & (df.Flow.shift(3).notnull())]
Out[12]:
   Time_(min)      Flow  Supply_T   Return_T
5          50  0.778125  101.3998  74.686288
6          51  0.788500  102.1189  74.490528
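
One thing to watch with the shift-based mask: after the pump shuts off, Flow three rows earlier is still nonzero for up to three rows, so those zero-flow rows are kept. Below is a minimal sketch of a variant that numbers the rows inside each on/off run and drops the first few rows of every "on" run. It assumes one row per minute with no gaps, and the names `WARMUP_ROWS`, `running`, `run_id` and `rows_into_run` are made up for illustration. The sample rows are copied from the question.

```python
import pandas as pd

# Sample rows from the question; column names follow the answer's output.
df = pd.DataFrame({
    "Time_(min)": [45, 46, 47, 48, 49, 50, 51],
    "Flow": [0, 0, 1.338375, 2.267975, 0.778125, 0.778125, 0.7885],
    "Supply_T": [0, 0, 92.711328, 82.578552, 96.073136, 101.3998, 102.1189],
    "Return_T": [0, 0, 78.72152, 74.239624, 74.288664, 74.686288, 74.490528],
})

WARMUP_ROWS = 3  # hypothetical setting: rows to discard after each start-up

# True while the pump is running (assumes "off" is recorded as exactly 0).
running = df["Flow"] != 0

# The run id increases by one every time the on/off state changes.
run_id = (running != running.shift(fill_value=False)).cumsum()

# Position of each row within its run: 0, 1, 2, ...
rows_into_run = df.groupby(run_id).cumcount()

# Keep rows where the pump is on and the warm-up rows have passed.
steady = df[running & (rows_into_run >= WARMUP_ROWS)]
print(steady)
```

On the sample above this keeps the same two rows (minutes 50 and 51) as the shift-based mask. Both approaches are vectorized, so neither needs an explicit Python loop over the rows.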

