Many-to-many select query

I'm trying to write code to pull a list of product items from a SQL Server database and display the results on a webpage.

A requirement of the project is that a list of categories is displayed at the right-hand side of the page as a list of checkboxes (all categories selected by default). A user can uncheck categories and re-query the database to view products in only the categories they want.

Here's where it starts to get a bit hairy.

Each product can be assigned to multiple categories using a product categories table, as below:

Product table
[product_id](PK),[product_name],[product_price],[isEnabled],etc...

Category table
[CategoryID](PK),[CategoryName]

ProductCategory table

[id](PK),[CategoryID](FK),[ProductID](FK)

I need to select a list of products that match a set of category IDs passed to my stored procedure, where the products have multiple assigned categories.

The category IDs are passed to the proc as a comma-delimited varchar, i.e. ( 3,5,8,12 ).

The SQL breaks this varchar value into a result set in a temp table for processing.

How would I go about writing this query?

One problem is passing the array or list of selected categories into the server. The subject was covered at length by Erland Sommarskog in the series of articles Arrays and Lists in SQL Server. Passing the list as a comma-separated string and building a temp table is one option. There are alternatives, like using XML, a table-valued parameter (in SQL Server 2008), or a table @variable instead of a #temp table. The pros and cons of each are covered in the article(s) I linked.
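As a rough sketch of two of those options (not taken from the linked articles): on SQL Server 2016 or later, with database compatibility level 130 or higher, the built-in STRING_SPLIT function can fill the temp table, and on SQL Server 2008 or later a table-valued parameter avoids string parsing altogether. The names #selectedCategories, dbo.CategoryIdList and dbo.GetProductsByCategory are assumptions made up for illustration, and the table and column names follow the queries in this answer.

```sql
-- Option A (assumes SQL Server 2016+ and compatibility level 130+):
-- split the comma-delimited varchar into a temp table.
DECLARE @categoryIds varchar(8000) = '3,5,8,12';

CREATE TABLE #selectedCategories (CategoryID int PRIMARY KEY);

INSERT INTO #selectedCategories (CategoryID)
SELECT DISTINCT CAST(LTRIM(RTRIM(value)) AS int)  -- fails on non-numeric input
FROM STRING_SPLIT(@categoryIds, ',')
WHERE LTRIM(RTRIM(value)) <> '';                  -- skip empty entries
GO

-- Option B (assumes SQL Server 2008+): a table-valued parameter,
-- so the caller sends rows instead of a string.
CREATE TYPE dbo.CategoryIdList AS TABLE (CategoryID int PRIMARY KEY);
GO

CREATE PROCEDURE dbo.GetProductsByCategory
    @categories dbo.CategoryIdList READONLY  -- TVPs must be READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Products that belong to at least one of the passed-in categories.
    SELECT p.*
    FROM Products p
    WHERE p.IsEnabled = 1
      AND EXISTS (
          SELECT 1
          FROM ProductCategories pc
          JOIN @categories c ON c.CategoryID = pc.CategoryID
          WHERE pc.ProductID = p.ProductID);
END;
GO
```

With option B the client fills a table of IDs and passes it as a single parameter, so the procedure never has to parse text.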

Now on to how to retrieve the products. First things first: if all categories are selected, use a different query that simply retrieves all products without bothering with categories at all. This saves a lot of work, and considering that all users will probably first see a page with no category unselected, the saving can be significant.
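One way to sketch that short-circuit, assuming the selected IDs are already in a temp table named #selectedCategories and the category table is called Categories (both names are assumptions), is to count the categories that were left unselected and branch on the result. The filtered branch here uses a plain EXISTS filter as a placeholder for whichever category query you settle on.

```sql
-- Count the categories that were NOT selected; zero means "everything".
DECLARE @unselected int;

SELECT @unselected = COUNT(*)
FROM Categories c
WHERE NOT EXISTS (
    SELECT 1
    FROM #selectedCategories sc
    WHERE sc.CategoryID = c.CategoryID);

IF @unselected = 0
BEGIN
    -- Every category is selected: skip the category tables completely.
    SELECT p.*
    FROM Products p
    WHERE p.IsEnabled = 1;
END
ELSE
BEGIN
    -- Some categories are unchecked: filter by the selected ones.
    SELECT p.*
    FROM Products p
    WHERE p.IsEnabled = 1
      AND EXISTS (
          SELECT 1
          FROM ProductCategories pc
          JOIN #selectedCategories sc ON sc.CategoryID = pc.CategoryID
          WHERE pc.ProductID = p.ProductID);
END;
```

Note that the shortcut branch also returns products that have no category at all, which may or may not be what the page should show.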

When only some categories are selected, building a query that joins products, categories and selected categories is fairly easy. Making it scale and perform is a different topic, and is entirely dependent on your data schema and the actual pattern of categories selected. A naive approach is like this:

select ...
from Products p
where p.IsEnabled = 1
and exists (
  select 1  
  from ProductCategories pc
  join #selectedCategories sc on sc.CategoryID = pc.CategoryID
  where pc.ProductID = p.ProductID);

The ProductCategories table must have an index on (ProductID, CategoryID) and one on (CategoryID, ProductID) (one of them is the clustered index, the other is nonclustered). This is true for every solution, by the way. This query works well if most categories are always selected and the result contains most products anyway. But if the list of selected categories is restrictive, it is better to avoid the scan on the potentially large Products table and start from the selected categories:

with distinctProducts as (
select distinct pc.ProductID
from ProductCategories pc
join #selectedCategories sc on pc.CategoryID = sc.CategoryID)
select p.*
from Products p
join distinctProducts dc on p.ProductID = dc.ProductID;
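The two indexes mentioned earlier might be declared roughly as follows. This is a sketch that assumes the surrogate [id] primary key from the question is nonclustered (or absent), so the clustered index is still available, and that each (ProductID, CategoryID) pair occurs only once, which is what allows UNIQUE. The index names are made up.

```sql
-- Clustered index: serves "which categories does this product have?"
CREATE UNIQUE CLUSTERED INDEX CIX_ProductCategories_Product_Category
    ON ProductCategories (ProductID, CategoryID);

-- Nonclustered index: serves "which products are in this category?"
CREATE UNIQUE NONCLUSTERED INDEX IX_ProductCategories_Category_Product
    ON ProductCategories (CategoryID, ProductID);
```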

Again, the best solution depends largely on the shape of your data. For example, if you have a very skewed category (one category alone covers 99% of products), then the best solution would have to account for this skew.

This gets all products that are in at least all of the desired categories (no less):

select * from product p1 join (
  select p.product_id from product p 
  join ProductCategory pc on pc.product_id = p.product_id
  where pc.category_id in (3,5,8,12)
  group by p.product_id having count(p.product_id) = 4
) p2 on p1.product_id = p2.product_id

4 is the number of categories in the set.

This gets all products that are in exactly the desired categories (no more, no less):
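The hard-coded list and the literal 4 can be replaced by the temp table described in the question. The following is a sketch only: #selectedCategories with a CategoryID column is an assumed name, and COUNT(DISTINCT ...) is used so that a duplicated row in ProductCategory cannot inflate the count.

```sql
DECLARE @wanted int;

-- Number of distinct categories the user asked for.
SELECT @wanted = COUNT(DISTINCT CategoryID)
FROM #selectedCategories;

SELECT p.*
FROM product p
JOIN (
    SELECT pc.product_id
    FROM ProductCategory pc
    WHERE pc.category_id IN (SELECT CategoryID FROM #selectedCategories)
    GROUP BY pc.product_id
    -- Keep only products that matched every requested category.
    HAVING COUNT(DISTINCT pc.category_id) = @wanted
) matched ON matched.product_id = p.product_id;
```

If the temp table is empty, @wanted is 0 and the query returns no rows, because no group can have a count of zero.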

select * from product p1 join (
  select p.product_id from product p
  -- join the link table so count(*) counts the product's categories
  join ProductCategory pc1 on pc1.product_id = p.product_id
  where not exists (
    -- the product has no category outside the desired list
    select * from ProductCategory pc
    where pc.product_id = p.product_id
    and pc.category_id not in (3,5,8,12)
  )
  group by p.product_id having count(*) = 4
) p2 on p1.product_id = p2.product_id

The double negative can be read as: get all products for which there are no categories that are not in the desired category list.

For the products in any of the desired categories, it's as simple as:

select * from product p1 where exists (
  select * from product p2 
  join ProductCategory pc on pc.product_id = p2.product_id
  where 
    p1.product_id = p2.product_id and
    pc.category_id in (3,5,8,12)
)

This should do. You don't have to break up the comma-delimited category IDs.

select distinct p.* 
from product p, productcategory pc
where p.product_id = pc.productid
and pc.categoryid in ( place your comma delimited category ids here)

This will give the products that are in any of the passed-in category IDs, i.e., as per JNK's comment, it's an OR, not an ALL. Please specify if you want an AND, i.e., the product should be selected only if it is in ALL the categories specified in the comma-separated list.

If you need anything other than product_id from products, then you can write something like this (adding the extra fields that you need):

SELECT distinct(p.product_id)
FROM product_table p
JOIN productcategory_table pc
ON p.product_id=pc.product_id
WHERE pc.category_id in (3,5,8,12);

On the other hand, if you really need just the product_id, you can simply select it from productcategory_table:

SELECT distinct(product_id)
FROM productcategory_table
WHERE category_id in (3,5,8,12);

This should be fairly close to what you are looking for:

SELECT product.*
FROM   product
JOIN   ProductCategory ON ProductCategory.ProductID = Product.product_id
JOIN   #my_temp ON #my_temp.category_id = ProductCategory.CategoryID

EDIT

As noted in the comments, this will produce duplicates for products appearing in multiple categories. To correct this, specify DISTINCT before the column list. I have included all product columns in the list (product.*) as I do not know which columns you are looking for, but you should probably change that to the specific columns that you want.
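With DISTINCT and an explicit column list, the query might look like the sketch below. The three columns are the ones named in the question's Product table; substitute whichever columns you actually need.

```sql
-- DISTINCT collapses the one-row-per-matching-category duplicates.
SELECT DISTINCT
       product.product_id,
       product.product_name,
       product.product_price
FROM   product
JOIN   ProductCategory ON ProductCategory.ProductID = product.product_id
JOIN   #my_temp ON #my_temp.category_id = ProductCategory.CategoryID;
```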
