How do I write and optimize a three table join with one to many relationships in LINQ to SQL?

Ideal Class Structure

A Game has many Players, each of which has many Statistics. In other words, each Game contains a List<Player> and each Player contains a List<Statistic>.

Game -> Player1 -> Statistic1
                   ....
                   Statistic30
        ....
        Player10 -> Statistic1
                    ....
                    Statistic30

Basic Table Schema

Game
----
GameId (int)
Region (nvarchar(4))

Player
------
GameId (int)
Region (nvarchar(4))
AccountId (int)

Statistic
---------
GameId (int)
Region (nvarchar(4))
AccountId (int)

My Attempt

var b = (from g in db.Games
         select new GameDTO()
         {
             GameId = g.GameId,
             Players = (from p in db.PlayerGames
                        where p.GameId == g.GameId && p.Region.Equals(g.Region)
                        select new PlayerGameDTO()
                        {
                            AccountId = p.AccountId,
                            GameId = p.GameId,
                            Region = p.Region,
                            Statistics = (from r in db.Statistics
                                          where r.AccountId == p.AccountId && r.GameId == p.GameId && r.Region.Equals(p.Region)
                                          select r).ToList()
                        }).ToList()
         });

This solution (obviously) does not employ Join, largely because I'm not sure how to perform the Joins in the correct order to achieve the desired result.

I should mention that each day we aggregate ~100K new games, ~1M players, and ~30M statistics. The current query can select ~1.4 games per second and uses 99% of the hyper-threaded quad-core CPU.

If anything is unclear, please feel free to ask for clarification.

Update #1

var d = (from g in db.Games
         join p in db.PlayerGames on new { g.GameId, g.Region } equals new { p.GameId, p.Region }
         join r in db.Statistics on new { p.GameId, p.Region, p.AccountId } equals new { r.GameId, r.Region, r.AccountId }
         select new StatisticsDTO()
         {
             GameId = r.GameId, 
             AccountId = r.AccountId, 
             StatType = r.StatType,
             Value = r.Value
         });

Something this simple churns out ~9K rows per second (22x faster than the original). SQL Server is clearly doing all the work, using ~90% of the CPU. HOWEVER, instead of nested objects, I'm left with a flat, one-dimensional result set.

If you have any suggestions on this update, I'd love to hear them.

It sounds like it may be more appropriate to let your database handle some of this workload, especially if you're simply running queries and not writing to the database. Consider creating a view in your database that implements the joins. Then you can query the view and avoid joining on your client machine. You can still use the entity data model and LINQ to run queries against the view. You should see a pretty good performance increase with this approach.

-- Possible SQL for creating the view
CREATE VIEW vw_GameData AS
SELECT g.GameId, g.Region, p.AccountId -- etc...
FROM Game g
JOIN Player p ON (g.GameId = p.GameId AND g.Region = p.Region)
JOIN Statistic s ON (s.GameId = p.GameId AND s.Region = p.Region AND s.AccountId = p.AccountId)
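
Rows read from a view like this are still flat (one row per statistic), so the nested Game -> Player -> Statistic shape has to be rebuilt on the client. Below is a minimal sketch of that regrouping with LINQ to Objects. The FlatRow record, the Regroup.Nest helper, and the three simplified DTO records are hypothetical stand-ins, not the asker's actual classes. The StatType and Value columns come from the question's Update #1, and their types here are assumed.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row: one per statistic, as a joined view or query would return it.
// StatType/Value types are assumptions; adjust to the real column types.
public record FlatRow(int GameId, string Region, int AccountId, string StatType, int Value);

// Simplified stand-ins for the nested DTOs in the question (assumed shapes).
public record StatisticDTO(string StatType, int Value);
public record PlayerGameDTO(int AccountId, List<StatisticDTO> Statistics);
public record GameDTO(int GameId, string Region, List<PlayerGameDTO> Players);

public static class Regroup
{
    // Rebuilds Game -> Player -> Statistic from flat rows, entirely in memory.
    public static List<GameDTO> Nest(IEnumerable<FlatRow> rows) =>
        rows.GroupBy(r => new { r.GameId, r.Region })        // one group per game
            .Select(g => new GameDTO(
                g.Key.GameId,
                g.Key.Region,
                g.GroupBy(r => r.AccountId)                  // one group per player in that game
                 .Select(p => new PlayerGameDTO(
                     p.Key,
                     p.Select(r => new StatisticDTO(r.StatType, r.Value)).ToList()))
                 .ToList()))
            .ToList();
}
```

With this split, SQL Server still performs the join and only the grouping runs on the client. Note that GroupBy in LINQ to Objects buffers its whole input before yielding groups, so at ~30M rows per day it is probably safer to feed Nest one batch at a time (for example, a range of GameId values) than the entire result set.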

First, try a simple LINQ join.

var b = (from t in db.Games
         // Join on the full composite keys; a join source cannot reference the range variable t.
         join t1 in db.PlayerGames on new { t.GameId, t.Region } equals new { t1.GameId, t1.Region }
         join t2 in db.Statistics on new { t1.GameId, t1.Region, t1.AccountId } equals new { t2.GameId, t2.Region, t2.AccountId }
         select new PlayerGameDTO
         {
             AccountId = t1.AccountId,
             GameId = t1.GameId,
             Region = t1.Region,
             //RawStats <-- what are you trying to do here?
             //RawStats = (from r in db.RawStats
             //where r.AccountId == p.AccountId && r.GameId == p.GameId && r.Region.Equals(p.Region) select r).ToList()
         }).ToList();
