Join datatables with LINQ, select multiple columns and sums with group by multiple columns

I am joining two DataTables with LINQ and trying to select data from both tables, including the sum of one column grouped by two columns. The data looks like this.

table1 (Demands)
  propertyID
  propertyGroupID
  supplierID
  demand

propertyID  |propertyGroupID    |supplierId |demand |ContractId
13          |3                  |3          |2      |1
22          |4                  |3          |1      |1
21          |5                  |3          |12     |1
15          |5                  |3          |3      |1
16          |7                  |3          |16     |1
23          |5                  |3          |5      |1


table2 (Supplies)
  supplierID
  propertyID
  supply

supplierId  |propertyID |supply
4           |23         |2764
1           |22         |3521
1           |16         |2533
11          |23         |876
4           |21         |5668

I would like to have the result as

supplierID
propertyGroupID
sum(supply)
sum(demand)

These would be grouped by supplierID and propertyGroupID. So, finally I would like to compare actual supplies to the demand by each supplier and property group.

What I have done so far is

var result = from demandItems in table1.AsEnumerable()
             join supplyItems in table2.AsEnumerable()
                 on Convert.ToInt16(demandItems["propertyID"]) equals Convert.ToInt16(supplyItems["propertyID"])
             group new { demandItems, supplyItems }
             by new
             {
                 Supplier = supplyItems.Field<string>("supplierID"),
                 PropertyGroup = demandItems.Field<int>("propertyGroupID")
             }
             into groupDt
             select new
             {
                 SupplierID = groupDt.Key.Supplier,
                 PropertyGroupId = groupDt.Key.PropertyGroup,
                 SumOfSupply = groupDt.Sum(g => g.supplyItems.Field<double>("supply")),
                 SumOfDemand = groupDt.Sum(g => g.demandItems.Field<double>("demand"))
             };

This runs, and I get the correct sum of supply grouped by supplier and property group. However, the sum of demand is not correct.

SupplierID  |PropertyGroupID    |SumOfSupply    |SumOfDemand
4           |5                  |8432           |17
1           |4                  |3521           |1
1           |7                  |2533           |16
11          |5                  |876            |5

As you see, table1 has only one supplier (ID=3). The correct result data should be

SupplierID  |PropertyGroupID    |SumOfSupply    |SumOfDemand
4           |5                  |8432           |0
1           |4                  |3521           |0
1           |7                  |2533           |0
11          |5                  |876            |0
3           |3                  |0              |2
3           |4                  |0              |1              
3           |5                  |0              |20
3           |7                  |0              |16

How can I obtain the result I want?

EDIT 2018-01-05

I'm using NetMage's solution to solve my problem. However, I get an error message from

var result = from d in table1sum.Concat(table2sum)...

Error CS1929 'EnumerableRowCollection<>' does not contain a definition for 'Concat' and the best extension method overload 'Queryable.Concat<>(IQueryable<>, IEnumerable<>)' requires a receiver of type 'IQueryable<>'

Is this probably due to the fact that my original tables are actually read from a database?

DataTable table1 = DemandDataSet.Tables[0];
DataTable table2 = SupplyDataSet.Tables[0];

For example, I had to use this notation:

var table1sum = table1.AsEnumerable().Select(d => new
{
    propertyGroupID = (int)d["propertyGroupID"],
    supplierId = (int)d["supplierId"],
    demand = (double)d["demand"],
    supply = 0
});

Edit 2 2018-01-05

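The error was due to type differences between the fields of table1sum and table2sum. Specifically, the "supply" and "demand" fields had different types in the two sequences: the placeholder literal 0 is an int, while the values read from the tables are doubles. When I changed the placeholders to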

demand = 0.0

and

supply = 0.0

the compiler found the .Concat method.
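The type mismatch can be reproduced without any DataTable. The following is a minimal sketch, not the code from the question: the variable names are hypothetical and the values are taken from the sample data only for illustration. It shows that two anonymous objects share a type only when their property names, property types, and property order all match, which is what Concat needs.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Placeholders written as the int literal 0:
        // one object is (double, int), the other is (int, double).
        var demandRow = new { demand = 2.0, supply = 0 };
        var supplyRow = new { demand = 0, supply = 2764.0 };
        Console.WriteLine(demandRow.GetType() == supplyRow.GetType()); // False

        // Placeholders written as the double literal 0.0:
        // both objects are (double, double), so they share one anonymous type.
        var demandRowFixed = new { demand = 2.0, supply = 0.0 };
        var supplyRowFixed = new { demand = 0.0, supply = 2764.0 };
        Console.WriteLine(demandRowFixed.GetType() == supplyRowFixed.GetType()); // True
    }
}
```

With mismatched element types the compiler cannot infer a single type argument for Enumerable.Concat, so it falls through to other overloads and reports the CS1929 error quoted earlier.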

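Your data tables are not normalized, which I think is causing your issue. You don't really want to join the supply and demand tables row by row: the join drops every property that appears in only one table, and grouping by the supply row's supplier credits each matched demand to that supplier instead of to supplier 3.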

First, summarize the demand table to the information needed for the result:

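// DataRow exposes no typed properties, so read columns with Field<T>; 0.0 keeps supply a double.
var table1sum = table1.AsEnumerable().Select(d => new {
    propertyGroupID = d.Field<int>("propertyGroupID"),
    supplierId = d.Field<int>("supplierId"),
    demand = d.Field<double>("demand"),
    supply = 0.0
});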

Second, summarize the supply table to the information needed for the result, using the demand table to get the propertyGroupID needed:

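// Convert.ToInt32 is used for the keys because their stored column types are not known here.
var table2sum = from supplyItem in table2.AsEnumerable()
                join demandItem in table1.AsEnumerable()
                    on Convert.ToInt32(supplyItem["propertyID"]) equals Convert.ToInt32(demandItem["propertyID"])
                select new {
                    propertyGroupID = demandItem.Field<int>("propertyGroupID"),
                    supplierId = Convert.ToInt32(supplyItem["supplierId"]),
                    demand = 0.0,
                    supply = supplyItem.Field<double>("supply")
                };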

Now you can combine the summaries and group to get the desired result:

var result = from d in table1sum.Concat(table2sum)
             group d by new { d.supplierId, d.propertyGroupID } into dg
             select new {
                 SupplierID = dg.Key.supplierId,
                 PropertyGroupId = dg.Key.propertyGroupID,
                 SumOfSupply = dg.Sum(g => g.supply),
                 SumOfDemand = dg.Sum(g => g.demand)
             };
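To check the approach end to end, here is a self-contained sketch that builds both tables in memory from the sample rows in the question and runs the three steps. The column types (int keys, double demand and supply) are assumptions based on the casts used in the question, the ContractId column is left out because the result does not use it, and on .NET Framework the project needs a reference to System.Data.DataSetExtensions for AsEnumerable and Field<T>.

```csharp
using System;
using System.Data;
using System.Linq;

class Program
{
    static void Main()
    {
        // Demand table (column types are assumed, see the note above).
        var table1 = new DataTable("Demands");
        table1.Columns.Add("propertyID", typeof(int));
        table1.Columns.Add("propertyGroupID", typeof(int));
        table1.Columns.Add("supplierId", typeof(int));
        table1.Columns.Add("demand", typeof(double));
        table1.Rows.Add(13, 3, 3, 2.0);
        table1.Rows.Add(22, 4, 3, 1.0);
        table1.Rows.Add(21, 5, 3, 12.0);
        table1.Rows.Add(15, 5, 3, 3.0);
        table1.Rows.Add(16, 7, 3, 16.0);
        table1.Rows.Add(23, 5, 3, 5.0);

        // Supply table.
        var table2 = new DataTable("Supplies");
        table2.Columns.Add("supplierId", typeof(int));
        table2.Columns.Add("propertyID", typeof(int));
        table2.Columns.Add("supply", typeof(double));
        table2.Rows.Add(4, 23, 2764.0);
        table2.Rows.Add(1, 22, 3521.0);
        table2.Rows.Add(1, 16, 2533.0);
        table2.Rows.Add(11, 23, 876.0);
        table2.Rows.Add(4, 21, 5668.0);

        // Step 1: project demand rows, with supply as a double placeholder.
        var table1sum = table1.AsEnumerable().Select(d => new
        {
            propertyGroupID = d.Field<int>("propertyGroupID"),
            supplierId = d.Field<int>("supplierId"),
            demand = d.Field<double>("demand"),
            supply = 0.0
        });

        // Step 2: project supply rows, looking up propertyGroupID via the demand table.
        var table2sum = from s in table2.AsEnumerable()
                        join d in table1.AsEnumerable()
                            on s.Field<int>("propertyID") equals d.Field<int>("propertyID")
                        select new
                        {
                            propertyGroupID = d.Field<int>("propertyGroupID"),
                            supplierId = s.Field<int>("supplierId"),
                            demand = 0.0,
                            supply = s.Field<double>("supply")
                        };

        // Step 3: concatenate both projections and group by supplier and property group.
        var result = from r in table1sum.Concat(table2sum)
                     group r by new { r.supplierId, r.propertyGroupID } into g
                     select new
                     {
                         SupplierID = g.Key.supplierId,
                         PropertyGroupId = g.Key.propertyGroupID,
                         SumOfSupply = g.Sum(x => x.supply),
                         SumOfDemand = g.Sum(x => x.demand)
                     };

        foreach (var r in result)
            Console.WriteLine($"{r.SupplierID} | {r.PropertyGroupId} | {r.SumOfSupply} | {r.SumOfDemand}");
        // Prints:
        // 3 | 3 | 0 | 2
        // 3 | 4 | 0 | 1
        // 3 | 5 | 0 | 20
        // 3 | 7 | 0 | 16
        // 4 | 5 | 8432 | 0
        // 1 | 4 | 3521 | 0
        // 1 | 7 | 2533 | 0
        // 11 | 5 | 876 | 0
    }
}
```

The rows match the desired result from the question, only in a different order, because GroupBy on in-memory sequences yields groups in the order their keys first appear.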
