
MySQL Query Optimization with subqueries and JOINs on large database

I am new here and have already searched the whole internet for a solution. If I missed something, please let me know!

I use a MariaDB (MySQL-compatible) database with several million rows. My query times out, and even increasing the timeout does not help. The goal of the query is to retrieve the relevant entries per year. The query has to join 5 tables in total. (Unfortunately the structure of the DB is predefined, so there is little I can change.)

Below is the affected query. The date is stored as a varchar, which is why I filter with LIKE and a leading % wildcard. Unfortunately I don't have the permissions to change that.

    SELECT
        spCit.spPN,
        spCit.ipc1,
        spCit.ipc2,
        tbl_patinfo.pn
    FROM
        tbl_patinfo
    INNER JOIN(
        SELECT
            spIPC.spPN,
            spIPC.ipc1,
            spIPC.ipc2,
            tbl_patcit.pc_pn AS spDocNr
        FROM
            tbl_patcit
        INNER JOIN(
            SELECT DISTINCT
                tbl_ipc.pn AS spPN,
                tbl_ipc.value AS ipc1,
                tbl_ipc.main_cl AS ipc2
            FROM
                tbl_ipc
            RIGHT JOIN(
                SELECT
                    tbl_sp.sp_CompanyAlias,
                    OrgName,
                    infoPN
                FROM
                    tbl_sp
                INNER JOIN(
                    SELECT DISTINCT
                        tbl_patinfo.pn AS infoPN,
                        tbl_adr.orgname AS OrgName
                    FROM
                        tbl_patinfo
                    LEFT JOIN tbl_adr ON tbl_patinfo.pn = tbl_adr.pn
                    WHERE
                        tbl_patinfo.pub LIKE "%2004"
                ) AS PatPerYear
            ON
                tbl_sp.sp_CompanyAlias = PatPerYear.Orgname
            ) AS spPatents
        ON
            tbl_ipc.pn = spPatents.infoPN
        ) AS spIPC
    ON
        tbl_patcit.pn = spIPC.spPN
    ) AS spCit
    ON
        tbl_patinfo.docnr = spCit.spDocNr

Note: The whole query works if I leave out the last JOIN (tbl_patinfo.docnr = spCit.spDocNr), so that step is probably the cause.

Explain Select of the query

Thanks in advance for your help. If I can provide any further information, please let me know.

---EDIT---

So I managed to change the column to a date type, and now I am able to run the query with a limit of 50 rows. Without the limit I still get a timeout (over 7000 s of processing time).

I also reduced the VARCHAR(255) length on the indexed columns wherever I could. I'm not sure whether this has an impact, but it did reduce the key_len (see the latest EXPLAIN SELECT). Here is the new code:

    SELECT
        spCit.spPN,
        spCit.ipc1,
        spCit.ipc2,
        tbl_patinfo.pn
    FROM
        tbl_patinfo
    INNER JOIN(
        SELECT
            spIPC.spPN,
            spIPC.ipc1,
            spIPC.ipc2,
            tbl_patcit.pc_pn AS spDocNr
        FROM
            tbl_patcit
        INNER JOIN(
            SELECT DISTINCT
                tbl_ipc.pn AS spPN,
                tbl_ipc.value AS ipc1,
                tbl_ipc.main_cl AS ipc2
            FROM
                tbl_ipc
            RIGHT JOIN(
                SELECT
                    tbl_sp.sp_CompanyAlias,
                    OrgName,
                    infoPN
                FROM
                    tbl_sp
                INNER JOIN(
                    SELECT DISTINCT
                        tbl_patinfo.pn AS infoPN,
                        tbl_adr.orgname AS OrgName
                    FROM
                        tbl_patinfo
                    LEFT JOIN tbl_adr ON tbl_patinfo.pn = tbl_adr.pn
                    WHERE
                        YEAR(tbl_patinfo.pub) = 2004
                ) AS PatPerYear
            ON
                tbl_sp.sp_CompanyAlias = PatPerYear.Orgname
            ) AS spPatents
        ON
            tbl_ipc.pn = spPatents.infoPN
        ) AS spIPC
    ON
        tbl_patcit.pn = spIPC.spPN
    ) AS spCit
    ON
        tbl_patinfo.docnr = spCit.spDocNr

Here's the new EXPLAIN select

I'm very thankful for any idea!
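One point about the edited query: wrapping the column in YEAR() generally keeps the optimizer from using an index range scan on tbl_patinfo.pub (some newer MariaDB releases can rewrite such a comparison on their own, so this is version dependent). A minimal sketch of the equivalent half-open range filter for the innermost subquery, assuming pub is now a DATE or DATETIME column and that creating an index is permitted; the index name idx_patinfo_pub is hypothetical:

```sql
-- Assumption: tbl_patinfo.pub is now DATE or DATETIME.
-- idx_patinfo_pub is a hypothetical index name, not taken from the question.
CREATE INDEX idx_patinfo_pub ON tbl_patinfo (pub);

-- Same rows as YEAR(tbl_patinfo.pub) = 2004, but written as a range on the
-- bare column so an index on pub can be used for the filter.
SELECT DISTINCT
    tbl_patinfo.pn AS infoPN,
    tbl_adr.orgname AS OrgName
FROM
    tbl_patinfo
LEFT JOIN tbl_adr ON tbl_patinfo.pn = tbl_adr.pn
WHERE
    tbl_patinfo.pub >= '2004-01-01'
    AND tbl_patinfo.pub < '2005-01-01';
```

Running EXPLAIN on this subquery alone should show whether the index is actually chosen; whether it helps the full query also depends on how the outer joins are executed.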

You are out of luck on that one. A LIKE '%...' filter (leading wildcard) cannot use an index. There are things you could do about it, e.g. create a generated (computed) column, or a regular column maintained by a trigger, that contains the reverse of tbl_patinfo.pub. Then instead of tbl_patinfo.pub LIKE '%2004' you could write tbl_patinfo.revpub LIKE '4002%', which is a prefix match and can use an index on revpub.
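A minimal sketch of the reversed-column idea, assuming ALTER privileges on tbl_patinfo and a server that supports stored generated columns (MariaDB 10.2.1+ or MySQL 5.7+ accept this syntax; older MariaDB versions use AS (...) PERSISTENT instead). The VARCHAR(255) length and the index name idx_patinfo_revpub are assumptions; the length should match the declared length of pub:

```sql
-- revpub holds the reversed text of pub, so a suffix match on pub
-- becomes a prefix match on revpub.
-- Assumption: pub is a VARCHAR of at most 255 characters.
ALTER TABLE tbl_patinfo
    ADD COLUMN revpub VARCHAR(255)
    GENERATED ALWAYS AS (REVERSE(pub)) STORED;

-- Hypothetical index name; a prefix LIKE can use this index.
CREATE INDEX idx_patinfo_revpub ON tbl_patinfo (revpub);

-- Equivalent to: WHERE tbl_patinfo.pub LIKE '%2004'
SELECT
    tbl_patinfo.pn
FROM
    tbl_patinfo
WHERE
    tbl_patinfo.revpub LIKE '4002%';
```

The trigger-maintained variant would fill revpub from BEFORE INSERT and BEFORE UPDATE triggers instead; the query side stays the same.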

