
How to find all different lists and place them in another list

I have a .csv file with 1500 lines, and each line looks like this:

CS105,A-,ENG101,A-,MATH101,A,GER,B,ENG102,B,CS230,B,MATH120,B,GER,A-,CS205,A,FREE,A-,GER,A-,CS106,B,CS215,B+,CS107,A,ENG204,A-,GER,A-,MATH220,B+,CS300,B,CS206,A,CS306,B+,GER,A-,FREE,B+,CS312,A,CS450,B,GER,B,CS321,B,FREE,A,CS325,A-,GER,B+,CS322,B+,MAJOR,A-,CS310,B,STAT205,A-,A-,CS443,B+,CS412,A,CS421,B+,GER,A-,CS444,B+,FREE,A-,FREE,B,A,B,A-

and next line like this:

CS105,A-,ENG101,A,MATH101,B,GER,A,ENG102,A-,CS230,B+,MATH120,A-,GER,B,CS205,B+,GER,B+,A,CS106,A-,CS107,B+,CS215,A-,ENG204,A,GER,B,MATH220,A-,CS206,A-,FREE,A-,CS300,B,GER,B+,A,CS312,A,CS450,A-,GER,B,CS321,A,FREE,A-,CS325,B+,CS306,B,CS310,B+,MAJOR,A,GER,A,STAT205,B,B,CS443,A,CS322,B,GER,A,FREE,B,CS444,A,CS412,A,CS421,B+,FREE,A,FREE,B,A-

and so on. CS105 is a course, and A- is the grade the student obtained in that course. I separated the courses from the grades like this:

import csv

courses = []
grades = []

def DatabaseToList():
    with open('grades.csv', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            # Even positions hold course codes, odd positions hold grades.
            # Append once per row (one student), not once per field.
            courses.append(row[::2])
            grades.append(row[1::2])

That way I get this result (showing only courses[0] and grades[0]):

['CS105', 'ENG101', 'MATH101', 'GER', 'ENG102', 'CS230', 'MATH120', 'GER', 'CS205', 'FREE', 'GER', 'CS106', 'CS215', 'CS107', 'ENG204', 'GER', 'MATH220', 'CS300', 'CS206', 'CS306', 'GER', 'FREE', 'CS312', 'CS450', 'GER', 'CS321', 'FREE', 'CS325', 'GER', 'CS322', 'MAJOR', 'CS310', 'STAT205', '', 'CS443', 'CS412', 'CS421', 'GER', 'CS444', 'FREE', 'FREE']
['A-', 'A-', 'A', 'B', 'B', 'B', 'B', 'A-', 'A', 'A-', 'A-', 'B', 'B+', 'A', 'A-', 'A-', 'B+', 'B', 'A', 'B+', 'A-', 'B+', 'A', 'B', 'B', 'B', 'A', 'A-', 'B+', 'B+', 'A-', 'B', 'A-', '', 'B+', 'A', 'B+', 'A-', 'B+', 'A-', 'B']

What I want is to find every distinct course sequence in the courses list, so that I can compute the average GPA of the students who took exactly the same courses in exactly the same order (using the grades list). More precisely, I want to put each sequence that occurs into a separate list exactly once. Afterwards I can compare each unique sequence against all the lists in courses, collect the grades of the matching students, compute each student's GPA, and then average them. I searched for how to find unique and duplicate lists, but the code I found takes a very long time to run and is not really what I am asking for. The code I used is the following:

unique_list = []
duplicate_list = []
for i in courses:
    final_list = [unique_list.append(item) if item not in unique_list else duplicate_list.append(item) for item in
                          courses]

In other words, to make it easier to understand: if courses contains

courses = [[A,B,C],[A,B,C],[A,C,B],[B,C,A],[B,A,C],[B,A,C]]

then I want a new list that contains the following:

allUnique = [[A,B,C],[A,C,B],[B,C,A],[B,A,C]]

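One simple option is to sort the lists and collapse adjacent duplicates with itertools.groupby. Sorting changes the order, so the result comes out sorted rather than in order of first appearance:

import itertools

courses = [['A','B','C'],['A','B','C'],['A','C','B'],['B','C','A'],['B','A','C'],['B','A','C']]
courses.sort()  # groupby only merges adjacent equal items
allUnique = [course for course, _ in itertools.groupby(courses)]
print(allUnique)  # [['A', 'B', 'C'], ['A', 'C', 'B'], ['B', 'A', 'C'], ['B', 'C', 'A']]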

To take a list,

a = [[A,B], [A,B], [B,C]]

into

[[A,B], [B,C]]

you can do

list(set(map(tuple,a)))

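Lists are not hashable, so a set cannot hold them directly. Converting each inner list to a tuple first makes set() work, and list() then turns the set back into a list. Note that the result is a list of tuples in arbitrary order; write [list(t) for t in set(map(tuple, a))] if you need lists inside.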
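If the order of first appearance matters (as in the allUnique example from the question), a dict can stand in for the set, since dicts keep insertion order in Python 3.7+. The following is only a minimal sketch; the helper name unique_in_order is made up for illustration and the course names are placeholder strings.

```python
def unique_in_order(rows):
    """Return each distinct row once, in order of first appearance."""
    # Lists are unhashable, so each row is converted to a tuple to serve
    # as a dict key; dict.fromkeys keeps the first occurrence of each key.
    return [list(key) for key in dict.fromkeys(map(tuple, rows))]


courses = [
    ["A", "B", "C"], ["A", "B", "C"], ["A", "C", "B"],
    ["B", "C", "A"], ["B", "A", "C"], ["B", "A", "C"],
]
all_unique = unique_in_order(courses)
print(all_unique)
# [['A', 'B', 'C'], ['A', 'C', 'B'], ['B', 'C', 'A'], ['B', 'A', 'C']]
```

This makes a single pass over the rows, so it avoids the repeated "item not in unique_list" scans that make the list-based attempt slow.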

Hope that helps!
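For the stated end goal (the average GPA of students who took the same courses in the same order), it may not be necessary to build the unique list first: grouping by the course sequence in one pass does both steps. The sketch below is an illustration under assumptions, not a definitive answer. The GRADE_POINTS mapping is an assumed 4.0 scale that the question does not give, each GPA is an unweighted mean (no credit hours), and the function name is hypothetical.

```python
from collections import defaultdict

# Assumed grade-to-points scale; the question does not specify one.
GRADE_POINTS = {"A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0}


def average_gpa_by_sequence(courses, grades):
    """Map each distinct course sequence to the mean GPA of its students."""
    gpas = defaultdict(list)
    for course_row, grade_row in zip(courses, grades):
        # Skip empty or unknown grade fields instead of failing on them.
        points = [GRADE_POINTS[g] for g in grade_row if g in GRADE_POINTS]
        if points:
            # tuple(course_row) is hashable, so it can be the dict key.
            gpas[tuple(course_row)].append(sum(points) / len(points))
    return {seq: sum(vals) / len(vals) for seq, vals in gpas.items()}


# Tiny made-up data set: two students share the sequence (A, B).
courses = [["A", "B"], ["A", "B"], ["B", "A"]]
grades = [["A", "B"], ["B", "B"], ["A", "A"]]
result = average_gpa_by_sequence(courses, grades)
print(result)  # {('A', 'B'): 3.25, ('B', 'A'): 4.0}
```

The keys of the returned dict are exactly the unique course sequences, so list(result) also gives the deduplicated list in order of first appearance.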

