
How to design table relationship where the foreign key can mean “all rows”, “some rows” or “one row”?

I hope you can help me with this. I've used pseudocode to keep everything simple.

I have a table which describes locations.

location_table
location = charfield(200) # New York, London, Tokyo

A product manager now wants locations to be as follows:

Global = select every location
Asia = select every location in Asia
US = select every location in US
Current system = London (etc.)

This is my proposed redesign.

location_table
location = charfield(200) # New York, London, Tokyo
continent = foreign key to continent_table

continent_table
continent = charfield(50) # "None", "Global", Asia, Europe

But this seems horrible. It means in my code I'll always need to check if the customer is using "global" or "none", and then select the corresponding location records. For example, there will be code like this scattered everywhere:

get continent 
if continent is global, select everything from location_table
else if continent is none, select location from location_table
else select location from location_table where foreign key is continent

My feeling is this is a known problem, and there is a known solution for it. Any ideas?

Thank you.

Use levels:

0   -> None
00  -> Global
001 -> Europe
002 -> Asia
003 -> Africa

select location from location_table where continent like '[value]%'

Using a fixed length code, you can prefix regions, and then add one more digit for a region inside a region, and so on.

Ok, let me try to improve it.

Consider the world: it sits at the minimum level (or the maximum, depending on how you see it).

World ID = '0' (1 digit)

Now choose how you want to divide the world (continents, half-continents, ...) and assign the next level.

Europe ID  = '01' (First digit World + Second digit Europe)
Asia ID    = '02' 
America ID = '03'
...

Next Level: Countries. (At least 2 digits)

England ID     = '0101' (World + Continent + Country)
Deutschland ID = '0102'
....
Texas ID       = '0301'
....

Next Level: Regions (2 digits)

Yorkshire ID = '010101' (World + Continent + Country + Region)
....

Next Level: Cities (2 or 3 digits)

London ID = '01010101' (World + Continent + Country + Region + City)

And so on.

Now the same SELECT some_aggregate, statistics, ... FROM ... can be used for any region; simply change the WHERE clause:

WHERE Region like '0%'                        --> The whole world
WHERE Region like '02%'                       --> Asia
WHERE Region like '01010101%'                 --> London
WHERE Region like '02%' OR Region like '01%'  --> Asia & Europe
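
As a rough illustration of the prefix-code idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `location`, the column `region`, the helper `locations_under`, and the specific codes assigned to each city are assumptions made up for this example; they only follow the World + Continent + Country + Region + City pattern described above.

```python
import sqlite3

# In-memory database with a hypothetical table; codes follow the
# World + Continent + Country + Region + City pattern (2 digits per level).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (name TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO location VALUES (?, ?)",
    [
        ("London", "01010101"),  # Europe / England
        ("Berlin", "01020101"),  # Europe / Deutschland
        ("Tokyo", "02010101"),   # Asia
        ("Austin", "03010101"),  # America / Texas
    ],
)


def locations_under(*prefixes):
    """Return names of locations whose code starts with any given prefix."""
    # One "region LIKE <prefix>%" test per prefix, combined with OR.
    where = " OR ".join("region LIKE ? || '%'" for _ in prefixes)
    sql = f"SELECT name FROM location WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql, prefixes)]


print(locations_under("0"))          # the whole world
print(locations_under("02"))         # Asia only
print(locations_under("01", "02"))   # Europe or Asia
```

Note that selecting two regions at once needs OR between the prefix tests, since a single code cannot start with both prefixes.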

What you seem to have here is a set of locations, and then a set of location groups. Those groups might be all of the locations (global), or a subset of them.

You can build this with an intermediate table between the locations and a new location sets table which associates locations and location sets.

You might build the location set table and the join table so that the individual locations are also location sets, but ones which join only to one location. That way all location selections come from one table -- the location sets.

So you end up with three different types of location set:

  1. Ones which map 1:1 with a location
  2. One which maps 1:all ("global")
  3. Ones which map 1:many (continents and other areas)

It's conceivable that this could be created as a hierarchy, but those queries can be inefficient because the join cardinalities tend to be obscured from the optimiser.
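
The location-set design with an intermediate table could look roughly like the sketch below, written with Python's built-in sqlite3 module. All table and column names (`location_set`, `location_set_member`, and so on) and the helper `locations_in` are hypothetical; the sample rows reuse the locations and groups from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE location_set (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- Join table: which locations belong to which set.
CREATE TABLE location_set_member (
    set_id      INTEGER REFERENCES location_set(id),
    location_id INTEGER REFERENCES location(id),
    PRIMARY KEY (set_id, location_id)
);
INSERT INTO location VALUES (1, 'New York'), (2, 'London'), (3, 'Tokyo');
INSERT INTO location_set VALUES
    (1, 'Global'), (2, 'Asia'), (3, 'US'), (4, 'London');
INSERT INTO location_set_member VALUES
    (1, 1), (1, 2), (1, 3),  -- 1:all  (Global)
    (2, 3),                  -- 1:many (Asia; only one member in this sample)
    (3, 1),                  -- 1:many (US; only one member in this sample)
    (4, 2);                  -- 1:1    (the single location London)
""")


def locations_in(set_name):
    """Resolve a location set to its member locations with one query shape."""
    rows = conn.execute(
        """SELECT l.name
           FROM location_set AS s
           JOIN location_set_member AS m ON m.set_id = s.id
           JOIN location AS l ON l.id = m.location_id
           WHERE s.name = ?
           ORDER BY l.name""",
        (set_name,),
    )
    return [r[0] for r in rows]


print(locations_in("Global"))
print(locations_in("London"))
```

The point of the design is that the calling code never branches on "global" versus "single location": every selection is a location set, and the same join resolves it.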

You could do this using a hierarchy, and a self referencing foreign key, eg

LocationID      Name        ParentLocationID        LocationType
------------------------------------------------------------------
    1        Planet Earth      NULL                 Planet
    2           Africa          1                   Continent
    3         Antarctica        1                   Continent
    4           Asia            1                   Continent
    5        Australasia        1                   Continent
    6           Europe          1                   Continent
    7        North America      1                   Continent
    8       South America       1                   Continent
    9       United States       7                   Country
    10         Canada           7                   Country
    11         Mexico           7                   Country
    12      California          9                   State
    13      San Diego           12                  City
    14        England           6                   Country
    15      Cornwall            14                  County
    16        Truro             15                  City

Hierarchical data usually requires either recursion or multiple joins to get all levels. This answer contains links to articles comparing performance on the major DBMS.

Many DBMS now support recursive common table expressions. Since no DBMS is specified, I will use SQL Server syntax because it is what I am most comfortable with. A quick example would be:

DECLARE @LocationID INT = 7; -- NORTH AMERICA

WITH LocationCTE AS
(   SELECT  l.LocationID, l.Name, l.ParentLocationID, l.LocationType
    FROM    dbo.Location AS l   
    WHERE   LocationID = @LocationID
    UNION ALL
    SELECT  l.LocationID, l.Name, l.ParentLocationID, l.LocationType
    FROM    dbo.Location AS l
            INNER JOIN LocationCTE AS c
                ON c.LocationID = l.ParentLocationID
)
SELECT  *
FROM    LocationCTE;

Output based on above sample data

LocationID  Name            ParentLocationID    LocationType
-----------------------------------------------------------------
7           North America   1                   Continent
9           United States   7                   Country
10          Canada          7                   Country
11          Mexico          7                   Country
12          California      9                   State
13          San Diego       12                  City


Supplying a location ID of 1 (Planet Earth) returns the full table, while supplying a location ID of 11 (Mexico) returns only that one row, because nothing in the sample data sits below it.
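
For readers not on SQL Server, the same recursive query can be sketched against SQLite through Python's built-in sqlite3 module. This is an adaptation, not the original answer's code: SQLite spells the construct `WITH RECURSIVE`, and the snake_case table and column names, the helper `subtree`, and the reduced sample data are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " parent_id INTEGER REFERENCES location(id))"
)
# A subset of the sample rows above, with the same IDs and parents.
conn.executemany(
    "INSERT INTO location VALUES (?, ?, ?)",
    [
        (1, "Planet Earth", None),
        (6, "Europe", 1),
        (7, "North America", 1),
        (9, "United States", 7),
        (10, "Canada", 7),
        (11, "Mexico", 7),
        (12, "California", 9),
        (13, "San Diego", 12),
        (14, "England", 6),
    ],
)


def subtree(location_id):
    """Return the named location plus everything beneath it."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name) AS (
            -- Anchor member: the starting location.
            SELECT id, name FROM location WHERE id = ?
            UNION ALL
            -- Recursive member: children of rows found so far.
            SELECT l.id, l.name
            FROM location AS l
            JOIN subtree AS s ON l.parent_id = s.id
        )
        SELECT name FROM subtree ORDER BY id
        """,
        (location_id,),
    )
    return [r[0] for r in rows]


print(subtree(7))   # North America and everything inside it
print(subtree(11))  # Mexico only
```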

I'll side with your own proposal and say that I don't find it all that horrible to check, each time, whether the customer is searching by city or location, or by nothing at all. That is the role of the backend code, and it will always lead to different queries depending on which option they choose.

But I would remove "None" and "Global" from the continent table, and just use other queries when a continent is not chosen. You would end up with the 3 possible SQL queries you already have, and I don't find that to be bad design per se. Other solutions may perform better, but this one seems more readable and logical. It's just optional querying with join tables.

The other answers trade performance or duplication for readability (which isn't a bad thing, depending on how often you rely on this condition in your application, how many queries use it, and how many cities you have).

For readability and to avoid repetition, the best thing would be to concentrate these conditions in one SQL function which takes a string parameter and returns all locations matching the input (but at the cost of performance).
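
The "one function" idea could be sketched as follows. The answer suggests an SQL function; this version puts the same branching in a single Python helper over sqlite3 instead, which is a substitution made only to keep the example runnable. The schema, the helper name `resolve_locations`, and the convention that the string "Global" means every location are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Continent table without the 'None' / 'Global' pseudo-rows.
CREATE TABLE continent (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE location (
    name         TEXT PRIMARY KEY,
    continent_id INTEGER REFERENCES continent(id)
);
INSERT INTO continent VALUES (1, 'Asia'), (2, 'Europe'), (3, 'North America');
INSERT INTO location VALUES ('Tokyo', 1), ('London', 2), ('New York', 3);
""")


def resolve_locations(scope):
    """Map a scope string to location names.

    'Global' -> every location; a continent name -> its locations;
    anything else is treated as a single location name.
    """
    if scope == "Global":
        sql, args = "SELECT name FROM location ORDER BY name", ()
    elif conn.execute(
        "SELECT 1 FROM continent WHERE name = ?", (scope,)
    ).fetchone():
        sql = (
            "SELECT l.name FROM location AS l "
            "JOIN continent AS c ON c.id = l.continent_id "
            "WHERE c.name = ? ORDER BY l.name"
        )
        args = (scope,)
    else:
        sql, args = "SELECT name FROM location WHERE name = ?", (scope,)
    return [r[0] for r in conn.execute(sql, args)]


print(resolve_locations("Global"))
print(resolve_locations("Asia"))
print(resolve_locations("London"))
```

The three branches are still there, but they live in one place rather than being scattered across the codebase.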
