
Select n elements from a pandas column whose sum equals a given number

I have a pandas DataFrame with a score column and a question column. I want to select questions based on the following criteria:

  1. Number of questions
  2. Cumulative score

For example: get 5 questions whose scores sum to exactly 10.

The data looks something like this:

index question score
1 A 1
2 B 1
3 C 1
4 D 1
5 E 2
6 F 2
7 G 2
8 H 2
9 I 2
10 J 1
11 K 4
12 L 6
13 M 7
14 N 3
15 O 2
16 P 5
17 Q 1
18 R 2
19 S 4

If the constraints are as follows:

required questions = 5

required max score = 10

then the output should be:

index question score
1 A 1
2 B 1
3 C 1
4 D 1
10 J 1
16 P 5

Can anyone please suggest a solution?

thanks.

One way to do this is a brute-force search over every combination of the required size:

import pandas as pd
from itertools import combinations

df = pd.DataFrame({'index': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10, 10: 11, 11: 12, 12: 13, 13: 14, 14: 15, 15: 16, 16: 17, 17: 18, 18: 19},
                   'question': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E', 5: 'F', 6: 'G', 7: 'H', 8: 'I', 9: 'J', 10: 'K', 11: 'L', 12: 'M', 13: 'N', 14: 'O', 15: 'P', 16: 'Q', 17: 'R', 18: 'S'},
                   'score': {0: 1, 1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2, 9: 1, 10: 4, 11: 6, 12: 7, 13: 3, 14: 2, 15: 5, 16: 1, 17: 2, 18: 4}})

required_questions = 5
required_score_sum = 10

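# One (index, question, score) tuple per row.
ind_quest_score_tups = list(zip(df["index"], df["question"], df["score"]))
# Every way of picking `required_questions` rows (no repeats, order ignored).
candidates = combinations(ind_quest_score_tups, required_questions)
# Keep only the picks whose scores add up to exactly the required sum.
candidates = [c for c in candidates if sum(s for i, q, s in c) == required_score_sum]

# Print each matching selection as its own small DataFrame.
for c in candidates: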
    print(pd.DataFrame.from_records(c, columns=["index", "question", "score"]))
    print()

Which gives you (first few results shown):

   index question  score
0      1        A      1
1      2        B      1
2      3        C      1
3      4        D      1
4     12        L      6

   index question  score
0      1        A      1
1      2        B      1
2      3        C      1
3      5        E      2
4     16        P      5

   index question  score
0      1        A      1
1      2        B      1
2      3        C      1
3      6        F      2
4     16        P      5

...
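
If only one valid selection is needed rather than all of them, the same idea can stop at the first hit instead of building the full list. The following is a minimal sketch under that assumption; the helper name `first_matching_subset` and its parameters are made up for illustration and are not part of pandas. It is still a brute-force search, so it only suits small tables like the one in the question.

```python
from itertools import combinations

import pandas as pd


def first_matching_subset(df, n_questions, score_sum, score_col="score"):
    """Return the first n_questions rows whose scores add up to score_sum, or None.

    Hypothetical helper: brute force over row positions, in the same
    order that itertools.combinations generates them.
    """
    scores = df[score_col].tolist()
    # combinations() is lazy, so the loop ends as soon as a match is found.
    for positions in combinations(range(len(scores)), n_questions):
        if sum(scores[p] for p in positions) == score_sum:
            # Select the matching rows by position, keeping all columns.
            return df.iloc[list(positions)]
    return None


# Same data as in the question.
df = pd.DataFrame({
    "index": list(range(1, 20)),
    "question": list("ABCDEFGHIJKLMNOPQRS"),
    "score": [1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 4, 6, 7, 3, 2, 5, 1, 2, 4],
})

result = first_matching_subset(df, n_questions=5, score_sum=10)
print(result)
```

With the data above this returns the rows for questions A, B, C, D and L, which is the first combination printed by the full search. When no combination matches, the function returns None, so the caller should check for that before using the result.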
