
Speed up PHP with MySQL query (joins vs creating a table at an interval)

I have a website with about 50,000 people in its database. To produce my statistics I need to pull data from 3 tables and join them. More data comes in daily, so the page is getting slower and slower. I was wondering whether I could create a table that joins all the data I need and then run my PHP off that table. It would be great if the table creation could run hourly, or on some other set interval, so that new data is included. Is this possible and advisable? Can you point me to some resources?

I am using MySQL for the database.

Thanks!

I have 3 tables here - the village level, the resident level, and, for residents who are absent, an absent table with their results.

    SELECT EU,
           SUM(TF)    AS TFsum,
           SUM(TT)    AS TTsum,
           SUM(KID)   AS Nkid,
           SUM(ADULT) AS Nadult
    FROM (
        -- One row per person; MAX() collapses the duplicate rows the
        -- LEFT JOIN to absentdb can produce, so each flag counts once
        -- (and the GROUP BY is valid under ONLY_FULL_GROUP_BY).
        SELECT EU,
               b.NAME AS Person,
               MAX(CASE
                     WHEN b.RIGHT_EYE_TF = 1 THEN 1
                     WHEN b.LEFT_EYE_TF  = 1 THEN 1
                     WHEN c.RIGHT_EYE_TF = 1 THEN 1
                     WHEN c.LEFT_EYE_TF  = 1 THEN 1
                     ELSE 0
                   END) AS TF,
               MAX(CASE
                     WHEN b.RIGHT_EYE_TT = 1 THEN 1
                     WHEN b.LEFT_EYE_TT  = 1 THEN 1
                     WHEN c.RIGHT_EYE_TT = 1 THEN 1
                     WHEN c.LEFT_EYE_TT  = 1 THEN 1
                     ELSE 0
                   END) AS TT,
               MAX(CASE WHEN AGE <= 9  THEN 1 ELSE 0 END) AS KID,
               MAX(CASE WHEN AGE >= 15 THEN 1 ELSE 0 END) AS ADULT
        FROM villagedb a
        LEFT JOIN residentdb b ON a.CLUSTER = b.RES_CLUSTER
        LEFT JOIN absentdb   c ON b.RES_HOUSEHOLD_ID = c.RES_HOUSEHOLD_ID
                              AND b.NAME = c.NAME
        GROUP BY EU, b.NAME
    ) S
    GROUP BY EU
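For the hourly rebuild asked about above, MySQL's Event Scheduler can rerun the aggregation into a summary table. A minimal sketch, assuming the scheduler is enabled (SET GLOBAL event_scheduler = ON;) and using a hypothetical eu_stats table with illustrative column types:

    -- One-time setup: summary table for the per-EU totals.
    CREATE TABLE eu_stats (
        EU     VARCHAR(64) NOT NULL PRIMARY KEY,
        TFsum  INT,
        TTsum  INT,
        Nkid   INT,
        Nadult INT
    );

    -- Rebuild the totals every hour; REPLACE overwrites each EU's
    -- existing row via the primary key.
    CREATE EVENT refresh_eu_stats
    ON SCHEDULE EVERY 1 HOUR
    DO
      REPLACE INTO eu_stats (EU, TFsum, TTsum, Nkid, Nadult)
      SELECT EU, SUM(TF), SUM(TT), SUM(KID), SUM(ADULT)
      FROM (
          SELECT EU, b.NAME,
                 MAX(CASE WHEN b.RIGHT_EYE_TF = 1 OR b.LEFT_EYE_TF = 1
                            OR c.RIGHT_EYE_TF = 1 OR c.LEFT_EYE_TF = 1
                          THEN 1 ELSE 0 END) AS TF,
                 MAX(CASE WHEN b.RIGHT_EYE_TT = 1 OR b.LEFT_EYE_TT = 1
                            OR c.RIGHT_EYE_TT = 1 OR c.LEFT_EYE_TT = 1
                          THEN 1 ELSE 0 END) AS TT,
                 MAX(CASE WHEN AGE <= 9  THEN 1 ELSE 0 END) AS KID,
                 MAX(CASE WHEN AGE >= 15 THEN 1 ELSE 0 END) AS ADULT
          FROM villagedb a
          LEFT JOIN residentdb b ON a.CLUSTER = b.RES_CLUSTER
          LEFT JOIN absentdb   c ON b.RES_HOUSEHOLD_ID = c.RES_HOUSEHOLD_ID
                                AND b.NAME = c.NAME
          GROUP BY EU, b.NAME
      ) S
      GROUP BY EU;

The PHP page then reads eu_stats instead of running the join. One caveat of REPLACE: rows for EUs that vanish from the source tables stay behind, so a truncate-and-reload in a multi-statement event body is the stricter alternative.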

If you need to produce statistics on a large amount of data, and often, the best approach is to denormalize the data into dedicated tables.

In plain English: create new tables, populate them with the data you would otherwise get from joins, and whenever you insert into your old tables, also populate these new tables. This will speed up reports significantly, because joins are not fast, especially over a lot of data; duplicated data is much faster to read, but you have to work harder to keep it in sync at all times.
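A minimal sketch of the "also populate" step, assuming a hypothetical flattened table resident_flat and assuming EU lives in villagedb (as the question's join suggests). Matching AFTER UPDATE and AFTER DELETE triggers would be needed to keep the copy correct:

    -- Hypothetical denormalized copy of the villagedb/residentdb join.
    CREATE TABLE resident_flat (
        RES_HOUSEHOLD_ID INT,
        NAME             VARCHAR(100),
        EU               VARCHAR(64),
        RIGHT_EYE_TF     TINYINT,
        LEFT_EYE_TF      TINYINT,
        RIGHT_EYE_TT     TINYINT,
        LEFT_EYE_TT      TINYINT,
        AGE              INT
    );

    -- Mirror every new resident row into the flattened table.
    CREATE TRIGGER residentdb_after_insert
    AFTER INSERT ON residentdb
    FOR EACH ROW
      INSERT INTO resident_flat
        (RES_HOUSEHOLD_ID, NAME, EU,
         RIGHT_EYE_TF, LEFT_EYE_TF, RIGHT_EYE_TT, LEFT_EYE_TT, AGE)
      SELECT NEW.RES_HOUSEHOLD_ID, NEW.NAME, a.EU,
             NEW.RIGHT_EYE_TF, NEW.LEFT_EYE_TF,
             NEW.RIGHT_EYE_TT, NEW.LEFT_EYE_TT, NEW.AGE
      FROM villagedb a
      WHERE a.CLUSTER = NEW.RES_CLUSTER;

The statistics query can then scan resident_flat alone, with no join to villagedb at report time; the cost moves to write time, which is the sync work mentioned above.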

Try the following:

  1. Add indexes on the columns you join, filter, and group on (see the sketch after this list).
  2. Normalize your tables. I would suggest reading up on normalization; to improve performance your tables should be in at least the first three normal forms, and if you want to normalize further you can use BCNF.
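For point 1, a sketch of the indexes the question's query would want on its join and grouping columns; the names are illustrative, and EXPLAIN should confirm they are actually used:

    -- Join keys used by the three-way join in the question.
    CREATE INDEX idx_resident_cluster ON residentdb (RES_CLUSTER);
    CREATE INDEX idx_absent_person    ON absentdb (RES_HOUSEHOLD_ID, NAME);
    -- Grouping column, assuming EU is a column of villagedb.
    CREATE INDEX idx_village_eu       ON villagedb (EU);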

You can create a table out of two tables that have common fields. Instead of joining the tables at query time, you would have something like a cache table from which you select the necessary data.
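Concretely, if an hourly job fills a cache table such as the eu_stats sketched under the question, the per-request query shrinks to a plain read:

    -- Cheap per-request query against the precomputed cache table.
    SELECT EU, TFsum, TTsum, Nkid, Nadult
    FROM eu_stats
    ORDER BY EU;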

I see two issues here:

  • the first has already been pointed out: optimize the query and the database (indexes, denormalization, views, stored procedures, etc.)
  • the second: display a limited number of results, e.g. only the top 100 (pagination, let's say); see the sketch after this list
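A sketch of the second point, paging against the hypothetical eu_stats cache table:

    -- Page 1, 100 rows per page; page N would use OFFSET (N - 1) * 100.
    SELECT EU, TFsum, TTsum, Nkid, Nadult
    FROM eu_stats
    ORDER BY EU
    LIMIT 100 OFFSET 0;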
