SQL table design to fetch records with multiple inclusion and exclusion conditions

We want to select customers based on the following criteria:

  1. The customer is in specific cities, i.e. cityId = 1, 2, 3, ...

  2. Specific customers are excluded, i.e. customerId = 33, 2323, 34534, ...

  3. The customer has a specific age, e.g. 5 years, 7 years, 72 years, ...

These inclusion and exclusion lists can be arbitrarily long.

How should we design the database for this?

  1. Create a separate table 'customerInclusionCities' for the included cities and query it like this:

select * from customers where cityId in (select cityId from customerInclusionCities)

We do the same for age: create a table 'customerEligibleAge' holding every eligible age:

select * from customers where age in (select age from customerEligibleAge)

And create a separate table 'customerIdToBeExcluded' for the customers to exclude:

select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
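The three subqueries above can be combined into one statement. The following is only a rough sketch: it assumes SQLite through Python's built-in sqlite3 module so that it runs standalone, and the column types and sample rows are invented for illustration. Only the table and column names come from the question.

```python
import sqlite3

# In-memory database; schema details and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);
CREATE TABLE customerInclusionCities (cityId INTEGER PRIMARY KEY);
CREATE TABLE customerEligibleAge (age INTEGER PRIMARY KEY);
CREATE TABLE customerIdToBeExcluded (customerId INTEGER PRIMARY KEY);

INSERT INTO customers VALUES (1, 1, 5), (2, 2, 40), (33, 1, 7), (4, 9, 72), (5, 3, 72);
INSERT INTO customerInclusionCities VALUES (1), (2), (3);
INSERT INTO customerEligibleAge VALUES (5), (7), (72);
INSERT INTO customerIdToBeExcluded VALUES (33), (2323), (34534);
""")

# One lookup table per list, all three conditions in a single WHERE clause.
rows = conn.execute("""
SELECT customerId
FROM customers
WHERE cityId IN (SELECT cityId FROM customerInclusionCities)
  AND age IN (SELECT age FROM customerEligibleAge)
  AND customerId NOT IN (SELECT customerId FROM customerIdToBeExcluded)
ORDER BY customerId
""").fetchall()

selected_option1 = [r[0] for r in rows]
print(selected_option1)
```

With this sample data, customer 2 fails the age check, customer 4 fails the city check, and customer 33 is on the exclusion list.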

OR

  2. Create one table with Category and Ids, i.e. Category1 for cities, Category2 for customerIds to be excluded.
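For comparison, the single-table option might look roughly like the sketch below. The table name customerFilter, its columns, and the category labels are hypothetical, since the question does not name them. SQLite via Python's sqlite3 module is assumed only so the example runs standalone, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);

-- Hypothetical combined table: one row per (category, value) pair.
CREATE TABLE customerFilter (
    category TEXT    NOT NULL,  -- assumed labels: 'city', 'age', 'excludedCustomer'
    value    INTEGER NOT NULL,
    PRIMARY KEY (category, value)
);

INSERT INTO customers VALUES (1, 1, 5), (2, 2, 40), (33, 1, 7), (4, 9, 72), (5, 3, 72);
INSERT INTO customerFilter VALUES
    ('city', 1), ('city', 2), ('city', 3),
    ('age', 5), ('age', 7), ('age', 72),
    ('excludedCustomer', 33), ('excludedCustomer', 2323), ('excludedCustomer', 34534);
""")

# Every condition reads the same table, narrowed by its category.
rows = conn.execute("""
SELECT customerId
FROM customers
WHERE cityId IN (SELECT value FROM customerFilter WHERE category = 'city')
  AND age IN (SELECT value FROM customerFilter WHERE category = 'age')
  AND customerId NOT IN
      (SELECT value FROM customerFilter WHERE category = 'excludedCustomer')
ORDER BY customerId
""").fetchall()

selected_option2 = [r[0] for r in rows]
print(selected_option2)
```

One trade-off this makes visible: the value column mixes city ids, ages, and customer ids, so it cannot carry a foreign key to any one of them.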

Which approach is better: creating one table for all these parameters, or creating a separate table for each list (age, customerId, city)?

If the database is used only for this operation, I recommend the first solution. It is also very simple to deploy.

The second solution fills the DB with junk.

IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:

SELECT ...
    FROM tbl
    WHERE cityId IN (1,2,3...)
      AND customerId NOT IN (33,2323,34534...)
      AND age IN (5, 7, 72)

Have (at least) these indexes:

INDEX(cityId),
INDEX(age)

(Negated conditions are unlikely to be able to use an index.)

The query will use one of the indexes; having both gives the Optimizer a choice of whichever it thinks is better.
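When the lists come from application code, the IN ( constants ) form is usually built with bound parameters, not by pasting values into the SQL string. Below is a minimal sketch, assuming SQLite through Python's sqlite3 module and invented sample rows. The helper placeholders() and the index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);
-- The two single-column indexes suggested above (index names are made up).
CREATE INDEX idx_customers_cityId ON customers (cityId);
CREATE INDEX idx_customers_age ON customers (age);
INSERT INTO customers VALUES (1, 1, 5), (2, 2, 40), (33, 1, 7), (4, 9, 72), (5, 3, 72);
""")

city_ids = [1, 2, 3]
excluded_ids = [33, 2323, 34534]
ages = [5, 7, 72]

def placeholders(values):
    """Return one '?' per value, e.g. '?, ?, ?' for a three-item list."""
    return ", ".join("?" for _ in values)

# Only the '?' markers are interpolated; the values themselves are bound.
sql = f"""
SELECT customerId
FROM customers
WHERE cityId IN ({placeholders(city_ids)})
  AND customerId NOT IN ({placeholders(excluded_ids)})
  AND age IN ({placeholders(ages)})
ORDER BY customerId
"""
rows = conn.execute(sql, city_ids + excluded_ids + ages).fetchall()

selected_constants = [r[0] for r in rows]
print(selected_constants)
```

Engines cap how many parameters one statement may bind (SQLite has a compile-time limit, for example), so very long lists may still need a lookup table.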

Or...

SELECT c.*
    FROM customers AS c
    JOIN customerInclusionCities AS ci  ON ci.cityId = c.cityId
    JOIN customerEligibleAge AS ce  ON ce.age = c.age
    LEFT JOIN customerIdToBeExcluded AS ex  ON ex.customerId = c.customerId
    WHERE ex.customerId IS NULL

Suggested indexes (probably as PRIMARY KEY):

customers: (cityId)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
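The join form can be tried end to end as follows. This is a sketch under assumptions: it uses the question's table and column names, SQLite through Python's sqlite3 module so that it runs standalone, and invented sample rows. The index name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, cityId INTEGER, age INTEGER);
CREATE INDEX idx_customers_cityId ON customers (cityId);
-- Primary keys on the lookup tables double as the suggested indexes
-- and guarantee that each join matches at most one row.
CREATE TABLE customerInclusionCities (cityId INTEGER PRIMARY KEY);
CREATE TABLE customerEligibleAge (age INTEGER PRIMARY KEY);
CREATE TABLE customerIdToBeExcluded (customerId INTEGER PRIMARY KEY);

INSERT INTO customers VALUES (1, 1, 5), (2, 2, 40), (33, 1, 7), (4, 9, 72), (5, 3, 72);
INSERT INTO customerInclusionCities VALUES (1), (2), (3);
INSERT INTO customerEligibleAge VALUES (5), (7), (72);
INSERT INTO customerIdToBeExcluded VALUES (33), (2323), (34534);
""")

# Inner joins keep only eligible cities and ages; the LEFT JOIN plus
# IS NULL test drops every customer that has a row in the exclusion table.
rows = conn.execute("""
SELECT c.customerId
FROM customers AS c
JOIN customerInclusionCities AS ci ON ci.cityId = c.cityId
JOIN customerEligibleAge AS ce ON ce.age = c.age
LEFT JOIN customerIdToBeExcluded AS ex ON ex.customerId = c.customerId
WHERE ex.customerId IS NULL
ORDER BY c.customerId
""").fetchall()

selected_join = [r[0] for r in rows]
print(selected_join)
```

If a lookup table allowed duplicate values, the inner joins would return the same customer more than once, which is one reason to make those columns primary keys.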

In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
