
SQL Work out the average time difference between total rows

I've searched around SO and can't find a question with an answer that works for me. I have a table with almost 2 million rows, and each row has a MySQL datetime field.

I'd like to work out (in seconds) how often a row was inserted, so work out the average difference between the dates of all the rows with a SQL query.

Any ideas?

-- EDIT --

Here's what my table looks like

id, name, date (datetime), age, gender

If you want to know how often (on average) a row was inserted, I don't think you need to calculate all the differences. You only need to sum up the differences between adjacent rows (adjacent based on the timestamp) and divide the result by the number of the summands.

The formula

((T_1 - T_0) + (T_2 - T_1) + … + (T_N - T_{N-1})) / N

can obviously be simplified to merely

(T_N - T_0) / N
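To see why the simplification holds, here is a minimal sketch outside the database. The timestamps are made up for illustration and assumed to be sorted ascending; the middle terms cancel, so both forms give the same number.

```python
from datetime import datetime

# Hypothetical insert times (not from the question), sorted ascending.
stamps = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 10),
    datetime(2024, 1, 1, 12, 0, 40),
    datetime(2024, 1, 1, 12, 2, 0),
]

# Long form: sum every adjacent difference, divide by the number of differences.
gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
long_form = sum(gaps) / len(gaps)

# Short form: only the first and last timestamps matter.
short_form = (stamps[-1] - stamps[0]).total_seconds() / (len(stamps) - 1)

print(long_form, short_form)  # 40.0 40.0
```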

Here N is the number of differences, which is one less than the number of rows. So, the query would be something like this:

SELECT TIMESTAMPDIFF(SECOND, MIN(date), MAX(date)) / (COUNT(*) - 1)
FROM atable

Make sure the table has more than one row, otherwise the divisor is zero. Depending on the server and SQL mode, that gives you either NULL or a Division By Zero error. Still, if you like, you can handle that case explicitly with a simple trick:

SELECT
  IFNULL(TIMESTAMPDIFF(SECOND, MIN(date), MAX(date)) / NULLIF(COUNT(*) - 1, 0), 0)
FROM atable

Now you can safely run the query against a table with a single row.
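For comparison, the same guard can be sketched in application code. The function name `average_gap_seconds` is hypothetical; it only mirrors the idea of the NULLIF/IFNULL pair by returning 0 when there are fewer than two timestamps.

```python
from datetime import datetime

def average_gap_seconds(stamps):
    """Average seconds between consecutive timestamps; 0.0 if fewer than two."""
    divisor = len(stamps) - 1
    # Mirrors NULLIF(COUNT(*) - 1, 0) wrapped in IFNULL(..., 0):
    # no usable divisor means the result falls back to 0.
    if divisor <= 0:
        return 0.0
    return (max(stamps) - min(stamps)).total_seconds() / divisor
```

Called with a single timestamp (or an empty list) it returns 0.0 instead of failing.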

Give this a shot:

select AVG(theDelay) from (

    select TIMESTAMPDIFF(SECOND,a.date, b.date) as theDelay
    from myTable a
    join myTable b on b.date = (select MIN(x.date) 
                                from myTable x 
                                where x.date > a.date)

) p

The inner query joins each row to the next row (by date) and returns the number of seconds between them. The outer query then averages those values.

EDIT: If your id column is auto-incrementing and its order matches the date order, you can speed it up a bit by joining to the row with the next id rather than the row with the MIN next date.

select AVG(theDelay) from (

    select TIMESTAMPDIFF(SECOND,a.date, b.date) as theDelay
    from myTable a
    join myTable b on b.id = (select MIN(x.id) 
                              from myTable x 
                              where x.id > a.id)

) p

EDIT2: As brilliantly commented by Mikael Eriksson, you may be able to just do:

select TIMESTAMPDIFF(SECOND, MIN(date), MAX(date)) / (COUNT(*) - 1) from myTable

There's a lot you can do with this to eliminate off-peak hours or big spans without a new record, using the join syntax in my first example.
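As a rough illustration of what keeping the per-row delays buys you, here is a sketch that drops long spans before averaging. The function name and the `max_gap_seconds` cutoff are assumptions for the example, not something from the question; in the SQL version the equivalent would presumably be a filter on theDelay in the outer query.

```python
from datetime import datetime

def average_gap_excluding_long_spans(stamps, max_gap_seconds):
    """Average the gaps between consecutive timestamps, ignoring long ones."""
    ordered = sorted(stamps)
    # One delay per pair of adjacent rows, like theDelay in the inner query.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    # Drop spans longer than the (hypothetical) cutoff, e.g. overnight pauses.
    kept = [g for g in gaps if g <= max_gap_seconds]
    return sum(kept) / len(kept) if kept else None

# Made-up data: three quick inserts, then nothing until the next morning.
sample = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 20),
    datetime(2024, 1, 1, 12, 0, 40),
    datetime(2024, 1, 2, 8, 0, 0),
]
print(average_gap_excluding_long_spans(sample, 3600))  # 20.0
```

The MIN/MAX shortcut cannot do this, because it never sees the individual gaps.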

Try this:

select avg(diff) as AverageSecondsBetweenDates
from (
    select TIMESTAMPDIFF(SECOND, t1.MyDate, min(t2.MyDate)) as diff
    from MyTable t1
    inner join MyTable t2 on t2.MyDate > t1.MyDate
    group by t1.MyDate
) a


 