
Many-to-many through many-to-many: is there a need for the middle join?

Simplifying the question to its basics, we have three tables, components, programs and users, related in many-to-many relationships through the two intermediate tables program_components and user_programs.

simplified table structure

users
- id (primary key)
- (...)

user_programs
- user_id (foreign key to users id)
- program_id (foreign key to programs id)

programs
- id (primary key)
- (...)

program_components
- program_id (foreign key to programs id)
- component_id (foreign key to components id)

components
- id (primary key)
- (...)

We are integrating user rights on program components within our cloud management system. I stumbled upon a query with many joins one after the other, and was wondering whether the middle table is required or not.

SELECT users.id, components.id FROM components
JOIN program_components ON components.id = program_components.component_id
JOIN programs ON program_components.program_id = programs.id
JOIN user_programs ON programs.id = user_programs.program_id
JOIN users ON user_programs.user_id = users.id
WHERE (...)

Is the middle join necessary, or could we simplify this as

SELECT users.id, components.id FROM components
JOIN program_components ON components.id = program_components.component_id
JOIN user_programs ON program_components.program_id = user_programs.program_id
JOIN users ON user_programs.user_id = users.id
WHERE (...)

From my tests, they both result in the same dataset, which I fully expected. The question is more about what MySQL expects to get, and which query makes sense from a database perspective.

For readability, I would favor the first version with the extra JOIN, as it makes explicit the intent of joining across multiple tables through the common programs table. However, I have often been told that too many joins are usually the wrong way to go about things. [1]

Are there any recommendations in the docs for such queries?


[1] We are refactoring to include a proper user_components table, which will spare us these queries and provide us with more flexibility, but this is outside the scope of the question.

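Since you only want the ids from your users and components tables, there is no reason to join the programs table. The foreign keys already guarantee that every program_id in program_components and user_programs refers to an existing programs row, so the two intermediate tables can be joined on program_id directly. The extra join makes the database look up rows in programs that the query never uses, which can cost performance.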

When writing SQL queries, it is always useful to check how many rows are examined. By joining the programs table, you make the database examine its id column even though you don't need any information from it.
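To make this concrete, here is a minimal sketch that runs both queries against a small in-memory database and checks which tables the query plan reads. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, so the plan text will not match what MySQL prints. The sample rows and the helper names (rows, touches_programs) are made up for illustration, and the column types are assumptions since the question elides them.

```python
import re
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption: same schema shape).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (id INTEGER PRIMARY KEY);
    CREATE TABLE programs   (id INTEGER PRIMARY KEY);
    CREATE TABLE components (id INTEGER PRIMARY KEY);
    CREATE TABLE user_programs (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        program_id INTEGER NOT NULL REFERENCES programs(id),
        PRIMARY KEY (user_id, program_id)
    );
    CREATE TABLE program_components (
        program_id   INTEGER NOT NULL REFERENCES programs(id),
        component_id INTEGER NOT NULL REFERENCES components(id),
        PRIMARY KEY (program_id, component_id)
    );
    -- Made-up sample rows, for illustration only.
    INSERT INTO users VALUES (1), (2);
    INSERT INTO programs VALUES (10), (20);
    INSERT INTO components VALUES (100), (200), (300);
    INSERT INTO user_programs VALUES (1, 10), (2, 10), (2, 20);
    INSERT INTO program_components VALUES (10, 100), (10, 200), (20, 300);
""")

# First query from the question: goes through the programs table.
WITH_PROGRAMS = """
    SELECT users.id, components.id FROM components
    JOIN program_components ON components.id = program_components.component_id
    JOIN programs ON program_components.program_id = programs.id
    JOIN user_programs ON programs.id = user_programs.program_id
    JOIN users ON user_programs.user_id = users.id
"""

# Second query: joins the two intermediate tables on program_id directly.
WITHOUT_PROGRAMS = """
    SELECT users.id, components.id FROM components
    JOIN program_components ON components.id = program_components.component_id
    JOIN user_programs ON program_components.program_id = user_programs.program_id
    JOIN users ON user_programs.user_id = users.id
"""


def rows(query):
    """Return the result set sorted, so row order does not affect comparison."""
    return sorted(conn.execute(query).fetchall())


def touches_programs(query):
    """Report whether SQLite's query plan reads the programs table itself."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    details = " ".join(row[3] for row in plan)  # column 3 holds the plan text
    # \b keeps user_programs from matching: "_" counts as a word character.
    return re.search(r"\bprograms\b", details) is not None


print(rows(WITH_PROGRAMS) == rows(WITHOUT_PROGRAMS))
print(touches_programs(WITH_PROGRAMS), touches_programs(WITHOUT_PROGRAMS))
```

On MySQL itself, the equivalent check would be to prefix each query with EXPLAIN and compare the tables and row estimates listed in the output.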

For further information, you might be interested in reading this, which explains some ways to boost the performance of your queries.
