Iterating through a pandas DataFrame over multiple columns
parsed_df

    student_name  course_id  weight
0              A          1      10
1              B          1      10
2              C          1      10
3              A          1      40
4              B          1      40
5              C          1      40
6              A          1      50
7              B          1      50
8              C          1      50
9              A          2      40
10             C          2      40
11             A          2      60
12             C          2      60
13             A          3      90
14             B          3      90
15             C          3      90
16             A          3      10
17             B          3      10
18             C          3      10
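What is the best way to iterate over each student and course_id and sum the weights for each student in each course? The weights for each student/course pair should add up to 100; otherwise an error should be raised.
For example: calculated_df = parsed_df.groupby(['student_name','course_id'])['weight'].sum()
Why not just compare the sum with 100?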
>>> df.groupby(['student_name', 'course_id'])['weight'].sum().eq(100).reset_index()
  student_name  course_id  weight
0            A          1    True
1            A          2    True
2            A          3    True
3            B          1    True
4            B          3    True
5            C          1    True
6            C          2    True
7            C          3    True
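If the goal is to see which original rows belong to a student/course pair whose total is off, one possible variation on the comparison above is groupby(...).transform('sum'), which broadcasts each group's total back onto every row. The following is only a sketch; the small sample frame is hypothetical data made up for illustration (student B's course 1 sums to 90), not the data from the question.

```python
import pandas as pd

# Hypothetical sample data: student B's course 1 only sums to 90.
df = pd.DataFrame({
    "student_name": ["A", "A", "B", "B"],
    "course_id": [1, 1, 1, 1],
    "weight": [40, 60, 40, 50],
})

# transform("sum") returns one value per original row: the total of that row's group.
group_total = df.groupby(["student_name", "course_id"])["weight"].transform("sum")

# Keep only the rows whose group total is not 100.
invalid_rows = df[group_total.ne(100)]
print(invalid_rows)
```

Because the result keeps the original index, the offending rows can be traced back to the input directly.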
import pandas as pd
df = pd.DataFrame({
    'student_name': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'C', 'A', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'course_id': [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
    'weight': [10, 10, 10, 40, 40, 40, 50, 50, 50, 40, 40, 60, 60, 90, 90, 90, 10, 10, 9]
})
df = df.groupby(by=['student_name', 'course_id']).agg({'weight': 'sum'})
df.reset_index(inplace=True)
print(df)
#   student_name  course_id  weight
# 0            A          1     100
# 1            A          2     100
# 2            A          3     100
# 3            B          1     100
# 4            B          3     100
# 5            C          1     100
# 6            C          2     100
# 7            C          3      99  -> intentionally adjusted to raise the Exception
# Check whether any student/course total is not 100
for i, row in df.iterrows():
    if row['weight'] != 100:
        raise Exception(f"Student {row['student_name']}'s Course ID {row['course_id']}: {row['weight']}")
Exception: Student C's Course ID 3: 99
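The loop above stops at the first bad total. If it is more useful to report every failing student/course pair in a single error, the check could be wrapped in a small helper along these lines. This is a sketch under assumptions: validate_weights, its expected parameter, and the sample frame are hypothetical names and data introduced here for illustration, and ValueError is swapped in for the generic Exception.

```python
import pandas as pd


def validate_weights(df, expected=100):
    """Return per-group totals, or raise ValueError listing every group that is off."""
    totals = df.groupby(["student_name", "course_id"], as_index=False)["weight"].sum()
    bad = totals[totals["weight"] != expected]
    if not bad.empty:
        details = "; ".join(
            f"student {r.student_name}, course {r.course_id}: {r.weight}"
            for r in bad.itertuples(index=False)
        )
        raise ValueError(f"Weights do not sum to {expected}: {details}")
    return totals


# Hypothetical sample data: student C's course 3 sums to 99.
sample = pd.DataFrame({
    "student_name": ["A", "A", "C", "C"],
    "course_id": [1, 1, 3, 3],
    "weight": [40, 60, 90, 9],
})

try:
    validate_weights(sample)
except ValueError as err:
    print(err)
```

Filtering the aggregated frame with a boolean mask avoids iterrows for the comparison itself; only the (usually few) failing rows are iterated to build the message.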