Dynamically creating dictionaries in Python

Question

Given a list of strings in Python:

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95", "0001 7 80", "0001 8 80",
        "0001 10 90", "0002 10 90", "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]

where, after splitting each entry s on spaces,
s[0] = student ID,
s[1] = problem ID,
s[2] = score for the problem

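I want to find out whether each student solved the same number of problems. For example, student 0001 solved 6 problems and student 0002 solved 5, but student 0001 attempted problem #5 twice, so students 0001 and 0002 both solved 5 distinct problems. I also need to check whether the students solved the same problems and got the same scores on the problems they attempted. How can I write this as Pythonic code?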

Answer 1

To do this, iterate over the list of strings and split each string on spaces:

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95", "0001 7 80", "0001 8 80",
        "0001 10 90", "0002 10 90", "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]
for log in logs:
    s = log.split(' ')  # s[0] = student ID, s[1] = problem ID, s[2] = score
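
The loop above only splits each entry. One way to continue from there, as a minimal sketch: assuming a repeated attempt at the same problem should count only once, collect the distinct problem IDs per student in a set and compare the set sizes. The names `solved` and `counts` are illustrative and not part of the original answer.

```python
from collections import defaultdict

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95", "0001 7 80", "0001 8 80",
        "0001 10 90", "0002 10 90", "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]

# student ID -> set of distinct problem IDs (a set ignores repeated attempts)
solved = defaultdict(set)
for log in logs:
    student_id, problem_id, score = log.split()
    solved[student_id].add(problem_id)

# number of distinct problems per student
counts = {student_id: len(problem_ids) for student_id, problem_ids in solved.items()}
print(counts)  # {'0001': 5, '0002': 5, '0003': 1}

# True only if every student attempted the same number of distinct problems
all_same_count = len(set(counts.values())) == 1
print(all_same_count)  # False, because student 0003 has a single problem
```

A set is used here because the question treats two attempts at problem #5 as one solved problem.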

Answer 2

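You will need several different groupings (dictionaries) to analyze the data along these different axes.

First, organize the information along each grouping axis: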

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95",
        "0001 7 80", "0001 8 80", "0001 10 90", "0002 10 90",
        "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]

students = dict() # {studentID: {problemID: max Score}} nested dictionaries
problems = dict() # {problemID: {studentIDs}} dictionary of sets
results  = dict() # {(problemID,result): {studentIDs}} matching results
for s,p,r in map(str.split,logs):
    scores = students.setdefault(s,dict()) # track problems per student
    # scores are strings, so compare them numerically (as strings, '90' > '100')
    scores[p] = max(scores.get(p,r),r,key=int) # max score for student/problem
    problems.setdefault(p,set()).add(s)    # add student to problem's set
    results.setdefault((p,r),set()).add(s) # add student to problem/result

You can then query these data structures to get the insights you are looking for.

Raw groupings:

# problems solved by each student with their maximum result
print(students)
{'0001': {'3': '95', '5': '100', '7': '80', '8': '80', '10': '90'},
 '0002': {'3': '95', '10': '90', '7': '80', '8': '80', '5': '100'},
 '0003': {'99': '90'}}

# list of students that solved each problem
print(problems)
{'3': {'0002', '0001'},
 '5': {'0002', '0001'},
 '7': {'0002', '0001'},
 '8': {'0002', '0001'},
 '10': {'0002', '0001'},
 '99': {'0003'}}

# list of students that got a specific result on each problem
print(results)
{('3', '95'): {'0002', '0001'}, ('5', '90'): {'0001'},
 ('5', '100'): {'0002', '0001'}, ('7', '80'): {'0002', '0001'},
 ('8', '80'): {'0002', '0001'}, ('10', '90'): {'0002', '0001'},
 ('99', '90'): {'0003'}}

Information derived by aggregating/filtering:

# number of problems solved per student
print({s:len(pr) for s,pr in students.items()}) 
{'0001': 5, '0002': 5, '0003': 1}
    
# students that got the same score on the same problem (plagiarism?)
for (prob,result),sids in results.items(): # 'sids' avoids overwriting the students dict
    if len(sids)>1:
        print(f"# same result ({result}) on problem #{prob} :",sids)

# same result (95) on problem #3 : {'0001', '0002'}
# same result (100) on problem #5 : {'0001', '0002'}
# same result (80) on problem #7 : {'0001', '0002'}
# same result (80) on problem #8 : {'0001', '0002'}
# same result (90) on problem #10 : {'0001', '0002'}
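
To answer the original question more directly (same problems, same scores), one possible approach is to compare the per-student dictionaries pairwise. This is only a sketch: it assumes the best score per problem is the one that matters, and the names `best` and `comparison` are made up for illustration.

```python
from itertools import combinations

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95",
        "0001 7 80", "0001 8 80", "0001 10 90", "0002 10 90",
        "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]

# student ID -> {problem ID: best score as an int}
best = {}
for student_id, problem_id, score in map(str.split, logs):
    scores = best.setdefault(student_id, {})
    scores[problem_id] = max(scores.get(problem_id, int(score)), int(score))

# (student A, student B) -> (same problem set?, same best scores?)
comparison = {}
for (a, a_scores), (b, b_scores) in combinations(best.items(), 2):
    same_problems = a_scores.keys() == b_scores.keys()  # compares keys as sets
    same_scores = a_scores == b_scores                  # dict equality ignores order
    comparison[(a, b)] = (same_problems, same_scores)

for pair, flags in comparison.items():
    print(pair, flags)
# ('0001', '0002') (True, True)
# ('0001', '0003') (False, False)
# ('0002', '0003') (False, False)
```

Dictionary equality already checks both the keys and the values, so `same_scores` implies `same_problems`; both are kept here only to mirror the two checks asked for in the question.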

Note that a relational database is usually a better tool for this kind of analysis.
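
As a rough illustration of the relational approach, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The table name `attempts` and its column names are assumptions chosen for this example, not something from the original answer.

```python
import sqlite3

logs = ["0001 3 95", "0001 5 90", "0001 5 100", "0002 3 95",
        "0001 7 80", "0001 8 80", "0001 10 90", "0002 10 90",
        "0002 7 80", "0002 8 80", "0002 5 100", "0003 99 90"]

conn = sqlite3.connect(":memory:")
# hypothetical schema: one row per logged attempt
conn.execute("CREATE TABLE attempts (student TEXT, problem TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO attempts VALUES (?, ?, ?)",
    ((s, p, int(r)) for s, p, r in map(str.split, logs)),
)

# number of distinct problems per student
per_student = conn.execute(
    "SELECT student, COUNT(DISTINCT problem) FROM attempts "
    "GROUP BY student ORDER BY student"
).fetchall()
print(per_student)  # [('0001', 5), ('0002', 5), ('0003', 1)]

# best score per student and problem
best = conn.execute(
    "SELECT student, problem, MAX(score) FROM attempts "
    "GROUP BY student, problem ORDER BY student, problem"
).fetchall()
conn.close()
```

Storing the score as INTEGER lets MAX compare the values numerically rather than as text.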
