
MySQL: ignore results where difference is more than x between rows

I have a simple PHP/HTML page that runs MySQL queries to pull temperature data and display it on a graph. Every once in a while my sensors (DHT11 temperature/RH sensors, read by an Arduino) return bad data: a spike that is too high or too low, so I know it's not a good data point. This is easy to deal with when the reading is "way" out of range, as in not a sane temperature: I just use a BETWEEN condition to filter out any records that cannot possibly be true.

I do realize that ultimately this should be fixed at the source so these bad readings never post in the first place, however as a debugging tool, I do actually want to record those errors in my DB, so I can track down the points in time when my hardware was erroring.

However, a BETWEEN filter does not help with the occasional spikes that fall within the range of sane temperatures. For example, if it is 65 F outside and the sensor throws an odd reading of 107 F, it totally screws up my graphs, scaling, etc. I can't filter that with a BETWEEN (that I know of), because 107 F is a realistic summertime temperature in my region.

Is there a way to filter out values based on their neighboring rows? For the sake of simplicity, say I am reading five rows and the result is 77, 77, 76, 102, 77. Can I say "anything that differs by more than (x) between sequential rows is bad data, so ignore it"?

[/longWinded]

It is hard to answer without your schema, so I made a SQLFiddle to reproduce your problem.

You need to average the temperature over a time window around each row and then compare that average with the row's own value. If the difference is too big, we don't select the row. In my Fiddle this is done by:

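-- Keep a row only if it is within 8 degrees of the average of all
-- readings taken from 3 hours before to 3 hours after it.
ABS(temp - (SELECT AVG(t.temp) FROM temperature AS t
            WHERE t.timeRead BETWEEN
                    DATE_SUB(temperature.timeRead, INTERVAL 3 HOUR)
                  AND
                    DATE_ADD(temperature.timeRead, INTERVAL 3 HOUR))) < 8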

This condition computes the average temperature over the 3 hours before and the 3 hours after the current row. If the row differs from that average by 8 degrees or more, we skip it.
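To see how the threshold behaves on the five sample readings from the question, here is a minimal Python sketch of the same windowed-average logic, outside the database. The function name `filter_spikes`, the hourly spacing, and the timestamps are assumptions made up for illustration; only the values 77, 77, 76, 102, 77, the 3-hour window, and the 8-degree limit come from the posts above.

```python
from datetime import datetime, timedelta


def filter_spikes(readings, window=timedelta(hours=3), max_diff=8.0):
    """Keep readings within max_diff of the average of their time window.

    readings is a list of (time_read, temp) tuples. Like the SQL BETWEEN,
    the window is inclusive and contains the reading itself.
    """
    kept = []
    for time_read, temp in readings:
        # All temperatures recorded within +/- window of this reading.
        neighbours = [t for ts, t in readings if abs(ts - time_read) <= window]
        average = sum(neighbours) / len(neighbours)
        if abs(temp - average) < max_diff:
            kept.append((time_read, temp))
    return kept


# Hypothetical timestamps: one reading per hour, starting at noon.
start = datetime(2024, 1, 1, 12, 0)
temps = [77, 77, 76, 102, 77]
readings = [(start + timedelta(hours=i), t) for i, t in enumerate(temps)]

filtered = filter_spikes(readings)
print([t for _, t in filtered])  # [77, 77, 76, 77]
```

Because the window includes the row being tested, the spike also pulls the average up (to 81.8 or 83 here), so with only a few readings per window the limit may need tuning.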
