
Create a data frame in Python containing a combination of all values from three lists

I have two lists, gender = ['Male', 'Female'] and subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark', 'Math9_Exam_Mark', 'ELA3_Exam_Mark', 'ELA6_Exam_Mark', 'ELA9_Exam_Mark'], plus an ndarray, birthMonthYear, containing dates extracted from a CSV file.

I'd like to create a new data frame with three columns: gender, subject, birthMonthYear. There should be rows for every combination of gender, subject and birthMonthYear.

Is there an easy way to do this, perhaps with pandas? I imagine I could write nested for loops that run through each list to build the data frame, but if there is something simpler I'd like to try it.

Thank you for your help!

Setup

import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark', 'Math9_Exam_Mark',
           'ELA3_Exam_Mark', 'ELA6_Exam_Mark', 'ELA9_Exam_Mark']
# Two sample month-end dates; pandas 2.2+ prefers freq='ME' for month end
birthMonthYear = pd.date_range('2010-01-31', periods=2, freq='M')

Option 1
itertools.product

from itertools import product

pd.DataFrame(
    list(product(gender, subject, birthMonthYear)),
    columns=['Gender', 'Subject', 'BirthMonthYear']
)

    Gender          Subject BirthMonthYear
0     Male  Math3_Exam_Mark     2010-01-31
1     Male  Math3_Exam_Mark     2010-02-28
2     Male  Math6_Exam_Mark     2010-01-31
3     Male  Math6_Exam_Mark     2010-02-28
4     Male  Math9_Exam_Mark     2010-01-31
5     Male  Math9_Exam_Mark     2010-02-28
6     Male   ELA3_Exam_Mark     2010-01-31
7     Male   ELA3_Exam_Mark     2010-02-28
8     Male   ELA6_Exam_Mark     2010-01-31
9     Male   ELA6_Exam_Mark     2010-02-28
10    Male   ELA9_Exam_Mark     2010-01-31
11    Male   ELA9_Exam_Mark     2010-02-28
12  Female  Math3_Exam_Mark     2010-01-31
13  Female  Math3_Exam_Mark     2010-02-28
14  Female  Math6_Exam_Mark     2010-01-31
15  Female  Math6_Exam_Mark     2010-02-28
16  Female  Math9_Exam_Mark     2010-01-31
17  Female  Math9_Exam_Mark     2010-02-28
18  Female   ELA3_Exam_Mark     2010-01-31
19  Female   ELA3_Exam_Mark     2010-02-28
20  Female   ELA6_Exam_Mark     2010-01-31
21  Female   ELA6_Exam_Mark     2010-02-28
22  Female   ELA9_Exam_Mark     2010-01-31
23  Female   ELA9_Exam_Mark     2010-02-28
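
A quick way to sanity-check the result is to confirm that the row count equals the product of the input lengths. The following is a minimal sketch, assuming birthMonthYear is a NumPy datetime64 array as described in the question; the two sample dates and the shortened subject list are made up for illustration and are not the asker's data.

```python
from itertools import product

import numpy as np
import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'ELA3_Exam_Mark']  # shortened for illustration
# Hypothetical stand-in for the dates read from the CSV file
birthMonthYear = np.array(['2010-01-31', '2010-02-28'], dtype='datetime64[ns]')

df = pd.DataFrame(
    list(product(gender, subject, birthMonthYear)),
    columns=['Gender', 'Subject', 'BirthMonthYear']
)

# One row per combination: 2 genders * 2 subjects * 2 dates
expected_rows = len(gender) * len(subject) * len(birthMonthYear)
print(len(df) == expected_rows)
```

If the dates in the real array contain duplicates, the product will repeat them too, so deduplicating first (for example with np.unique) may be worth considering.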

Option 2
pd.MultiIndex.from_product

idx = pd.MultiIndex.from_product(
    [gender, subject, birthMonthYear],
    names=['Gender', 'Subject', 'BirthMonthYear']
)

pd.DataFrame(index=idx).reset_index()

    Gender          Subject BirthMonthYear
0     Male  Math3_Exam_Mark     2010-01-31
1     Male  Math3_Exam_Mark     2010-02-28
2     Male  Math6_Exam_Mark     2010-01-31
3     Male  Math6_Exam_Mark     2010-02-28
4     Male  Math9_Exam_Mark     2010-01-31
5     Male  Math9_Exam_Mark     2010-02-28
6     Male   ELA3_Exam_Mark     2010-01-31
7     Male   ELA3_Exam_Mark     2010-02-28
8     Male   ELA6_Exam_Mark     2010-01-31
9     Male   ELA6_Exam_Mark     2010-02-28
10    Male   ELA9_Exam_Mark     2010-01-31
11    Male   ELA9_Exam_Mark     2010-02-28
12  Female  Math3_Exam_Mark     2010-01-31
13  Female  Math3_Exam_Mark     2010-02-28
14  Female  Math6_Exam_Mark     2010-01-31
15  Female  Math6_Exam_Mark     2010-02-28
16  Female  Math9_Exam_Mark     2010-01-31
17  Female  Math9_Exam_Mark     2010-02-28
18  Female   ELA3_Exam_Mark     2010-01-31
19  Female   ELA3_Exam_Mark     2010-02-28
20  Female   ELA6_Exam_Mark     2010-01-31
21  Female   ELA6_Exam_Mark     2010-02-28
22  Female   ELA9_Exam_Mark     2010-01-31
23  Female   ELA9_Exam_Mark     2010-02-28
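
Both options should give the same rows in the same order. The sketch below builds the frame both ways and compares the cell values; the shortened subject list and the names df_product, df_index and cols are chosen here for illustration only.

```python
from itertools import product

import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'ELA3_Exam_Mark']  # shortened for illustration
birthMonthYear = pd.to_datetime(['2010-01-31', '2010-02-28'])
cols = ['Gender', 'Subject', 'BirthMonthYear']

# Option 1: materialise the tuples with itertools.product
df_product = pd.DataFrame(
    list(product(gender, subject, birthMonthYear)), columns=cols
)

# Option 2: let pandas build the combinations as a MultiIndex
idx = pd.MultiIndex.from_product([gender, subject, birthMonthYear], names=cols)
df_index = pd.DataFrame(index=idx).reset_index()

# Compare cell by cell as Python objects so dtype details do not matter
same = df_product.to_numpy().tolist() == df_index.to_numpy().tolist()
print(same)
```

Option 1 needs only the standard library to generate the combinations, while Option 2 stays entirely within pandas, so the choice is mostly a matter of taste.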


 