Way to expand rows for range of cell value in Python?

I have a dataset that looks like this.

    import pandas as pd

    data = {'doc_ID':['Monday', 'Tuesday', 'Wednesday'], 'attachmentCount':[3,0,2], 'open':['TRUE','TRUE','FALSE']}
    df = pd.DataFrame(data)
    df

        doc_ID  attachmentCount open
    0   Monday      3           TRUE
    1   Tuesday     0           TRUE
    2   Wednesday   2           FALSE

I want to expand the dataset so that each row is repeated once for every integer from 1 up to that row's "attachmentCount". A row whose count is 0 should stay as a single row. So, it should look something like this:

        doc_ID     attachmentCount  open
    0   Monday     1                TRUE
    1   Monday     2                TRUE
    2   Monday     3                TRUE
    3   Tuesday    0                TRUE
    4   Wednesday  1                FALSE
    5   Wednesday  2                FALSE

I've tried a couple of different things, but they are so wildly incorrect that they are not worth posting here. Does anyone have any suggestions? Thank you.

I don't know pandas, but in pure Python, the following code produces the rows you need, grouped by original row.

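    a = [[0, 'Monday', 3, True], [1, 'Tuesday', 0, True], [2, 'Wednesday', 2, False]]
    # Python 3: a range cannot be concatenated with a list, so build the set with a union.
    [[[x[0], x[1], y, x[3]] for y in sorted(set(range(1, x[2] + 1)) | {x[2]})] for x in a]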

Explaining the code:

a is the dataset.

x is each row of the dataset.

In the inner list comprehension, the range from 1 to x[2] is combined with x[2] itself, which gives every integer from 1 to the attachmentCount plus the attachmentCount itself. The extra value is needed because the attachmentCount can be less than 1 (0 in your case); the range would then be empty and the row would disappear. The result is converted to a set to remove duplicates.
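To see the effect of that extra value in isolation, here is a small sketch of the value set on its own. The helper name `values_for` is hypothetical (it is not part of the code above), and Python 3 is assumed.

```python
def values_for(count):
    """Return the attachmentCount values that one row expands to."""
    # Integers 1..count, plus count itself so that a count of 0 still yields one value.
    return sorted(set(range(1, count + 1)) | {count})

print(values_for(3))  # [1, 2, 3]
print(values_for(0))  # [0]
print(values_for(2))  # [1, 2]
```

Without the union with `{count}`, the 0 case would produce an empty list and the Tuesday row would be lost.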

The innermost part merely substitutes the attachmentCount with each element of the newly created set of values.
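The comprehension returns one sub-list per original row, so the result is nested. If a flat list with a fresh 0..n-1 index is wanted, as in the desired output, something like the following sketch might do. It assumes Python 3, and `expand_rows` is a hypothetical name introduced here for illustration.

```python
def expand_rows(rows):
    """Expand each [index, doc_ID, attachmentCount, open] row into one row per count."""
    expanded = []
    for _, doc_id, count, is_open in rows:
        # Integers 1..count, plus count itself so a zero count keeps a single row.
        for value in sorted(set(range(1, count + 1)) | {count}):
            expanded.append([doc_id, value, is_open])
    # Renumber the rows from 0, as in the desired output.
    return [[i] + row for i, row in enumerate(expanded)]

a = [[0, 'Monday', 3, True], [1, 'Tuesday', 0, True], [2, 'Wednesday', 2, False]]
for row in expand_rows(a):
    print(row)
# [0, 'Monday', 1, True]
# [1, 'Monday', 2, True]
# [2, 'Monday', 3, True]
# [3, 'Tuesday', 0, True]
# [4, 'Wednesday', 1, False]
# [5, 'Wednesday', 2, False]
```

The explicit loop is used here instead of a nested comprehension only to make the flattening and renumbering steps easier to follow.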
