
Django count unique items in field and group by parent field with field selection

I'm using Django 1.10, Python 2.7, and MySQL. I'm trying to get a count of each unique value in a particular field, summarized per parent instance. I'm fairly new to Django and database design, so please let me know if I've got my structure wrong.

I have two models in question here:

TestRun:
branch - branch the test was run on - CharField
requester - who initiated the test - CharField
commit - the commit id - IntegerField
instance - differentiates between reruns of the same commit - IntegerField

Each TestRun can have multiple test results.

TestResult:
test_run - foreign key to TestRun
testcase - CharField
commit - same as in TestRun, duplicated for performance reasons
result - CharField, e.g. pass, fail, blocked, errored, etc.

Now I would like to get a summary of results by instance, i.e.:

 [{instance: 5, commit: 28000, requester: bob, pass: 5, fail: 4, blocked: 4, errored: 6},
  {instance: 500, commit: 28100, requester: sally, pass: 2, fail: 3, blocked: 8, errored: 3}]

The method I have working so far is as follows:

  1. Get distinct list of commits
  2. For each commit get unique instances
  3. For each instance use python Counter for all results matching instance to get summary for instance. Append to a list.

The actual implementation I have is as follows. data_set is a TestResult queryset already filtered by query params, e.g. branch.

from collections import Counter

data = []
commits = data_set.values_list('commit', flat=True).distinct()
for commit in commits:
    # select_related pulls the parent TestRun in the same query as the results
    commit_data = data_set.filter(commit=commit).select_related('test_run')
    # Collect the distinct (instance, test run id) pairs for this commit
    instances = set()
    for tr in commit_data:
        instances.add((tr.test_run.instance, tr.test_run.id))
    # Count the results belonging to each instance
    for instance, tr_id in instances:
        summary = dict(Counter(obj.result for obj in commit_data if obj.test_run.instance == instance))
        data.append({'commit': commit, 'instance': instance, 'summary': summary, 'tr_id': tr_id})

The code works fine but it doesn't really scale. I have looked into the grouping that Django now has but I'm not quite able to group it this way. I have tried two ways:

  1. The first got me a summary of results across all my data:

     TestResult.objects.filter(test_run__branch=branch).values('result').annotate(count=Count('result'))

    [{'count': 400, 'result': 'blocked'}, {'count': 250, 'result': 'errored'} ...]

  2. The second got me a count for each unique result per instance, but the counts were not combined into one dictionary per instance:

     TestResult.objects.filter(test_run__branch=branch).values('result', 'test_run__commit', 'test_run__instance').annotate(count=Count('result'))

    [{'count': 3, 'result': 'blocked', 'instance': 5, 'commit': 2400}, {'count': 6, 'result': 'errored', 'instance': 5, 'commit': 2400}, ...]

These two are the closest I've gotten, but they are not quite there. I do know that when values() precedes annotate(), the values determine the grouping, but what happens to field selection? How can you get both field selection and grouping together when doing a summary? Is there a better Django way than my current implementation?

I think this will do the trick:

from django.db.models import Count, F

# Parentheses let the chained calls span several lines
(TestRun.objects
    .filter(branch=branch)
    .values('commit', 'requester', 'instance')
    .annotate(result=F('testresult__result'), count=Count('testresult__result')))
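As far as I can tell, the query above still returns one row per test run and result value rather than one dictionary per run, so a small Python step may be needed to fold the rows together. Here is a minimal sketch, assuming each row is a dict with the keys 'commit', 'requester', 'instance', 'result' and 'count'. The function name pivot_counts and the sample rows are hypothetical and only for illustration.

```python
from collections import OrderedDict


def pivot_counts(rows):
    """Fold one-row-per-(run, result) dicts into one summary dict per run."""
    runs = OrderedDict()  # keeps runs in the order they first appear
    for row in rows:
        key = (row['commit'], row['requester'], row['instance'])
        if key not in runs:
            runs[key] = OrderedDict([
                ('commit', row['commit']),
                ('requester', row['requester']),
                ('instance', row['instance']),
            ])
        # Assumption: a run with no results shows up with result=None; skip it
        if row['result'] is not None:
            runs[key][row['result']] = row['count']
    return [dict(summary) for summary in runs.values()]


# Hypothetical rows, shaped like the output of the query above
rows = [
    {'commit': 28000, 'requester': 'bob', 'instance': 5, 'result': 'pass', 'count': 5},
    {'commit': 28000, 'requester': 'bob', 'instance': 5, 'result': 'fail', 'count': 4},
    {'commit': 28100, 'requester': 'sally', 'instance': 500, 'result': 'pass', 'count': 2},
]
summaries = pivot_counts(rows)
```

This keeps the counting in the database and leaves only a single pass over the already-aggregated rows to Python. Result values that never occur for a run are simply absent from its dictionary.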

Let me know if it worked for you!
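Another option is to let the database build one row per test run with a column per result value, using conditional aggregation (SUM over a CASE expression). The sketch below shows the idea in plain SQL against an in-memory SQLite database so it can be run on its own. The table names, the commit_id column (renamed because COMMIT is an SQL keyword), the passed/failed column aliases and the sample data are all assumptions for illustration, not the real schema. Django 1.8 and later provide Case and When expressions in django.db.models that can be wrapped in Sum to express the same thing through the ORM, though I have not tested that against these models.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
    CREATE TABLE testrun (
        id INTEGER PRIMARY KEY, branch TEXT, requester TEXT,
        commit_id INTEGER, instance INTEGER);
    CREATE TABLE testresult (
        id INTEGER PRIMARY KEY, test_run_id INTEGER REFERENCES testrun(id),
        testcase TEXT, result TEXT);
    INSERT INTO testrun VALUES (1, 'master', 'bob', 28000, 5);
    INSERT INTO testrun VALUES (2, 'master', 'sally', 28100, 500);
    INSERT INTO testrun VALUES (3, 'dev', 'bob', 28200, 7);
    INSERT INTO testresult (test_run_id, testcase, result) VALUES
        (1, 't1', 'pass'), (1, 't2', 'pass'), (1, 't3', 'fail'), (1, 't4', 'blocked'),
        (2, 't1', 'pass'), (2, 't2', 'errored'), (2, 't3', 'errored'),
        (3, 't1', 'fail');
""")

# One output row per test run; each SUM counts only the rows matching its CASE
query = """
    SELECT tr.instance, tr.commit_id, tr.requester,
           SUM(CASE WHEN r.result = 'pass'    THEN 1 ELSE 0 END) AS passed,
           SUM(CASE WHEN r.result = 'fail'    THEN 1 ELSE 0 END) AS failed,
           SUM(CASE WHEN r.result = 'blocked' THEN 1 ELSE 0 END) AS blocked,
           SUM(CASE WHEN r.result = 'errored' THEN 1 ELSE 0 END) AS errored
    FROM testrun tr
    LEFT JOIN testresult r ON r.test_run_id = tr.id
    WHERE tr.branch = ?
    GROUP BY tr.id, tr.instance, tr.commit_id, tr.requester
    ORDER BY tr.id
"""
summaries = [dict(row) for row in conn.execute(query, ('master',))]
```

The trade-off is that the set of result values has to be known when the query is written, whereas grouping by result and pivoting in Python copes with any values that appear in the data.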
