Trouble with linq/GroupBy/Count
I have the following data set:
vehicleid elapsedTimeLastConnection timestampLastConnection avgConnectionDuration
20472 1054471 2021-04-01 08:45:29.000 195
20400 1048824 2021-04-01 10:19:36.777 2522
20464 1048764 2021-04-01 10:20:36.000 1065
26235 1042766 2021-04-01 12:00:34.000 1028
20472 1029448 2021-04-01 15:42:32.000 1168
20464 983912 2021-04-02 04:21:28.777 37
20417 974218 2021-04-02 07:03:02.000 15031
20422 966875 2021-04-02 09:05:25.777 3
20422 962542 2021-04-02 10:17:38.777 2922
26235 961541 2021-04-02 10:34:19.000 1137
20464 961189 2021-04-02 10:40:11.000 5362
20472 939075 2021-04-02 16:48:45.777 763
20473 931086 2021-04-02 19:01:54.777 3428
20472 885385 2021-04-03 07:43:35.000 1683
20412 878456 2021-04-03 09:39:04.777 1601
20400 875267 2021-04-03 10:32:13.000 322
20398 871287 2021-04-03 11:38:33.777 1035
26235 863747 2021-04-03 13:44:13.000 1322
20400 845021 2021-04-03 18:56:19.000 2471
20410 811539 2021-04-04 04:14:21.777 1
20410 801662 2021-04-04 06:58:58.000 1403
20424 787282 2021-04-04 10:58:38.777 220
20472 783425 2021-04-04 12:02:55.777 1010
26235 777413 2021-04-04 13:43:07.000 971
20451 776365 2021-04-04 14:00:35.777 30
20422 774753 2021-04-04 14:27:27.777 10
20451 774458 2021-04-04 14:32:22.000 0
20422 770654 2021-04-04 15:35:46.777 199
20424 768515 2021-04-04 16:11:25.777 158
20424 758100 2021-04-04 19:05:00.777 3420
20422 757804 2021-04-04 19:09:56.777 1974
26431 749178 2021-04-04 21:33:42.777 3
26431 744800 2021-04-04 22:46:40.777 1
26431 743230 2021-04-04 23:12:50.777 3
20473 725451 2021-04-05 04:09:09.000 1
26431 724816 2021-04-05 04:19:44.777 47
20473 724478 2021-04-05 04:25:22.777 2232
20472 722822 2021-04-05 04:52:58.000 1
26431 716665 2021-04-05 06:35:35.777 258
20410 714575 2021-04-05 07:10:25.777 1750
26235 705768 2021-04-05 09:37:12.000 440
20472 705134 2021-04-05 09:47:46.777 1576
26431 693675 2021-04-05 12:58:45.000 1
20398 677341 2021-04-05 17:30:59.000 3688
26431 676935 2021-04-05 17:37:45.000 1
26431 676014 2021-04-05 17:53:06.777 1075
26235 674789 2021-04-05 18:13:31.777 7
26235 673755 2021-04-05 18:30:45.000 802
20400 671561 2021-04-05 19:07:19.777 529
20464 634465 2021-04-06 05:25:35.777 1
20400 627857 2021-04-06 07:15:43.777 1274
26235 623214 2021-04-06 08:33:06.000 2679
20422 621451 2021-04-06 09:02:29.777 1
20422 620461 2021-04-06 09:18:59.777 4185
20464 611819 2021-04-06 11:43:01.777 1021
26431 611458 2021-04-06 11:49:02.000 1446
20472 609710 2021-04-06 12:18:10.777 1360
20410 600170 2021-04-06 14:57:10.777 12
20410 589821 2021-04-06 17:49:39.777 610
20473 585735 2021-04-06 18:57:45.000 1004
20451 583418 2021-04-06 19:36:22.777 2
I'm grouping by day of week with the following linq query:
var toBeReturned = dataSet
    .GroupBy(row => new DateTime(row.timestampLastConnection.Ticks).ToLocalTime().DayOfWeek);
Which gives me exactly what I want. So far so good.
Now I want to count the distinct vehicleIds per group, so I ended up with:
var toBeReturned2 = toBeReturned
    .Select(g =>
    {
        int vCount = g.Select(c => c.vehicleId).Distinct().Count();
        return new
        {
            DayOfWeek = g.Key,
            count = vCount,
            duration = g.Average(c => c.avgConnectionDuration) / vCount
        };
    });
The problem is that vCount is always 1 instead of being the count of distinct vehicleIds for the selected group.
{ DayOfWeek = Thursday, count = 1, duration = 997.90909090909088 }
{ DayOfWeek = Friday, count = 1, duration = 2124.2380952380954 }
{ DayOfWeek = Saturday, count = 1, duration = 1329.1666666666667 }
{ DayOfWeek = Sunday, count = 1, duration = 657.05882352941171 }
{ DayOfWeek = Monday, count = 1, duration = 642 }
{ DayOfWeek = Tuesday, count = 1, duration = 1132.9166666666667 }
{ DayOfWeek = Wednesday, count = 1, duration = 891.81818181818187 }
What am I doing wrong?
Try narrowing the problem down. If you remove .Distinct(), does it return the full count of each group? If so, maybe the issue is the contents of your vehicleId column.
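One way to narrow it down is to print, for each group, the row count, the distinct count, and the distinct IDs themselves. This is only a sketch: the ConnectionRow class, the GroupDiagnostics helper, and the three sample rows are made up for illustration (your real row type is not shown in the question), and it groups on DayOfWeek directly without ToLocalTime() so the result does not depend on the machine's time zone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the asker's real class.
public class ConnectionRow
{
    public int vehicleId { get; set; }
    public DateTime timestampLastConnection { get; set; }
    public long avgConnectionDuration { get; set; }
}

public static class GroupDiagnostics
{
    // Builds one line per day-of-week group showing the row count,
    // the distinct vehicleId count, and the distinct values themselves.
    public static List<string> Describe(IEnumerable<ConnectionRow> rows)
    {
        return rows
            .GroupBy(r => r.timestampLastConnection.DayOfWeek)
            .Select(g =>
            {
                var ids = g.Select(r => r.vehicleId).Distinct().ToList();
                return $"{g.Key}: rows={g.Count()}, distinct={ids.Count}, ids=[{string.Join(", ", ids)}]";
            })
            .ToList();
    }

    public static void Main()
    {
        // Three rows copied from the first day of the posted data set.
        var rows = new List<ConnectionRow>
        {
            new ConnectionRow { vehicleId = 20472, timestampLastConnection = new DateTime(2021, 4, 1, 8, 45, 29), avgConnectionDuration = 195 },
            new ConnectionRow { vehicleId = 20400, timestampLastConnection = new DateTime(2021, 4, 1, 10, 19, 36), avgConnectionDuration = 2522 },
            new ConnectionRow { vehicleId = 20472, timestampLastConnection = new DateTime(2021, 4, 1, 15, 42, 32), avgConnectionDuration = 1168 },
        };

        foreach (var line in Describe(rows))
        {
            Console.WriteLine(line);
        }
    }
}
```

If the ids list turns out to hold a single value such as 0 for every group, the query is fine and the data reaching it is the problem.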
Per comment: the field isn't being deserialized into the object (wrong name), so all the values are null or empty and therefore not distinct.
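To see how a wrong name can produce exactly this symptom, here is a small sketch. It assumes the rows arrive as JSON and are read with System.Text.Json, which the thread does not say; the MismatchedRow and MappedRow classes and the JSON string are invented for the example. System.Text.Json matches property names case-sensitively by default, so a source field called vehicleid never reaches a property called vehicleId and every row keeps the default value 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical row type: the property name differs in case from the
// source field "vehicleid", so the default options never populate it.
public class MismatchedRow
{
    public int vehicleId { get; set; }
}

// Same shape, but with the source field name mapped explicitly.
public class MappedRow
{
    [JsonPropertyName("vehicleid")]
    public int vehicleId { get; set; }
}

public static class NameMismatchDemo
{
    // Made-up payload using the lower-case column name from the data set header.
    public const string Json =
        "[{\"vehicleid\":20472},{\"vehicleid\":20400},{\"vehicleid\":20464}]";

    public static int DistinctMismatched()
    {
        var rows = JsonSerializer.Deserialize<List<MismatchedRow>>(Json);
        return rows.Select(r => r.vehicleId).Distinct().Count();
    }

    public static int DistinctMapped()
    {
        var rows = JsonSerializer.Deserialize<List<MappedRow>>(Json);
        return rows.Select(r => r.vehicleId).Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(DistinctMismatched()); // prints 1: every vehicleId is 0
        Console.WriteLine(DistinctMapped());     // prints 3
    }
}
```

Other mappers (an ORM, a different JSON library) have their own matching rules, so check which name your deserializer actually expects.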
Try the following code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Data;

namespace ConsoleApplication187
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.txt";

        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("vehicleid", typeof(int));
            dt.Columns.Add("elapsedTimeLastConnection", typeof(long));
            dt.Columns.Add("timestampLastConnection", typeof(DateTime));
            dt.Columns.Add("avgConnectionDuration", typeof(long));

            // Column widths are taken from the header labels (including trailing padding).
            // The last column runs to the end of the line, so it needs no width.
            int col1Len = "vehicleid ".Length;
            int col2Len = "elapsedTimeLastConnection ".Length;
            int col3Len = "timestampLastConnection ".Length;

            // Dispose the reader once the file has been read.
            using (StreamReader reader = new StreamReader(FILENAME))
            {
                string line;
                int rowCount = 0;
                while ((line = reader.ReadLine()) != null)
                {
                    // Skip the header row.
                    if (++rowCount > 1)
                    {
                        dt.Rows.Add(new object[] {
                            int.Parse(line.Substring(0, col1Len)),
                            long.Parse(line.Substring(col1Len, col2Len)),
                            DateTime.Parse(line.Substring(col1Len + col2Len, col3Len)),
                            long.Parse(line.Substring(col1Len + col2Len + col3Len))
                        });
                    }
                }
            }

            // Group rows by day of week, then count the rows in each group.
            var toBeReturned = dt.AsEnumerable().GroupBy(row => row.Field<DateTime>("timestampLastConnection").ToLocalTime().DayOfWeek);
            var toBeReturned2 = toBeReturned.OrderBy(x => x.Key).Select(x => new { key = x.Key, count = x.Count() }).ToList();
        }
    }
}
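Note that the query above counts rows per group, not distinct vehicles. If you want the distinct vehicle count from the DataTable, something along these lines should work. It is a sketch with assumptions: the DataTableDistinctDemo class and its four sample rows are illustrative only, and it groups on DayOfWeek without ToLocalTime() so the result does not depend on the machine's time zone.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableDistinctDemo
{
    // Builds a small in-memory table with the same column names as above.
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("vehicleid", typeof(int));
        dt.Columns.Add("timestampLastConnection", typeof(DateTime));
        dt.Columns.Add("avgConnectionDuration", typeof(long));

        // A few rows copied from the posted data set.
        dt.Rows.Add(20472, new DateTime(2021, 4, 1, 8, 45, 29), 195L);
        dt.Rows.Add(20400, new DateTime(2021, 4, 1, 10, 19, 36), 2522L);
        dt.Rows.Add(20472, new DateTime(2021, 4, 1, 15, 42, 32), 1168L);
        dt.Rows.Add(20464, new DateTime(2021, 4, 2, 4, 21, 28), 37L);
        return dt;
    }

    // Maps each day of week to the number of distinct vehicle IDs seen on it.
    public static Dictionary<DayOfWeek, int> DistinctVehiclesPerDay(DataTable dt)
    {
        return dt.AsEnumerable()
            .GroupBy(row => row.Field<DateTime>("timestampLastConnection").DayOfWeek)
            .ToDictionary(
                g => g.Key,
                g => g.Select(row => row.Field<int>("vehicleid")).Distinct().Count());
    }

    public static void Main()
    {
        foreach (var pair in DistinctVehiclesPerDay(BuildSample()).OrderBy(p => p.Key))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

With the sample rows, Thursday has three rows but only two distinct vehicles, which is the difference between Count() and Distinct().Count().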
One possible answer is that your vehicleId is not fetched and defaults to 0 or null.