
Adding a list in a new column in a pandas DataFrame

I would like to create a new column that holds a set of values for each row of a specific column, breaking each entry down by one level. For example:

 Level
   1
   2
   3

 Level  Breakdown
   1
           a
           b
           c
           d
   2   
           a
           b
           c
           d
   3
           a
           b
           c
           d 

Any hints on how to code this breakdown in a pandas DataFrame?

I understand that a MultiIndex needs an array that matches the 'Breakdown' column. But the dataframe has more than 10 thousand records, so how can I build the tuples for a range that large?

In fact, I have a raw database that I need to rearrange into a schedule format. Here is a small sample: (image in the original post)

I would like to rearrange the database into this format: (image in the original post)

You can achieve this with a MultiIndex, but you will need an extra index level for a, b, c, d.

import numpy
import pandas

# Both lists must have the same length: one (level, breakdown) pair per row
arrays = [["1", "1", "1", "2", "2", "2", "3", "3", "3"],
          ["a", "b", "c", "a", "b", "c", "a", "b", "c"]]
tuples = list(zip(*arrays))

index = pandas.MultiIndex.from_tuples(tuples, names=['Levels','Breakdown'])

# Random placeholder values; replace them with your own data
s = pandas.Series(numpy.random.randn(len(index)), index=index)

With random data like that used in the documentation, the result looks like:

Levels Breakdown          
1      a         -0.985654
       b          0.782516
       c         -0.896590
2      a          0.841488
       b         -0.577790
       c         -1.130534
3      a          0.587779
       b         -0.935374
       c          1.658043
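
For a frame with thousands of levels, typing the tuples by hand is not practical. Here is a minimal sketch using pandas.MultiIndex.from_product, assuming every level gets the same breakdown labels a to d, as in the question. The names levels, breakdown and df_levels, the 10,000 count, and the zero-filled "value" column are illustrative placeholders, not taken from the original data.

```python
import pandas

levels = range(1, 10001)            # assumed: 10,000 distinct levels
breakdown = ["a", "b", "c", "d"]    # assumed: the same breakdown under every level

# from_product builds every (level, breakdown) pair, so no tuples are typed by hand
index = pandas.MultiIndex.from_product([levels, breakdown],
                                       names=["Levels", "Breakdown"])

# Placeholder values; replace with your own data (one value per index row)
df_levels = pandas.DataFrame({"value": 0}, index=index)
print(df_levels.head(8))
```

If the breakdown differs from level to level, from_product no longer fits and the pairs would have to be built from the data itself, for example with MultiIndex.from_tuples as above.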

EDIT:

Since you edited your question, I have come up with a solution. For details, check out this question and the pandas documentation on pandas.DataFrame.stack.

Since you only posted pictures instead of copying the data, I didn't use your values. My sample data looks like this:

d = {"Line": ["foo", "bar", "baz"], "CUT START": ["a", "b", "c"],
     "CUT FINISH": ["x", "y", "z"],
     "END START": [1, 2, 3], "END FINISH": [4, 5, 6]}
df = pandas.DataFrame(d)

   Line   CUT START CUT FINISH  END START  END FINISH
0  foo         a          x          1           4
1  bar         b          y          2           5
2  baz         c          z          3           6

I transformed it like this:

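# Set Line as the index
df = df.set_index("Line")

activitys = ["CUT", "END"]  # Add the rest of your activities here
status = ["START", "FINISH"]

# The product order must match the existing column order:
# CUT START, CUT FINISH, END START, END FINISH
df.columns = pandas.MultiIndex.from_product([activitys, status])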

This returns:

           CUT          END       
     START FINISH START FINISH
Line                          
foo      a      x     1      4
bar      b      y     2      5
baz      c      z     3      6
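
With many activities, listing them by hand can get out of step with the real column order. As an alternative sketch, the two column levels can be derived from the column names themselves. This assumes every column name has the form "ACTIVITY STATUS" with exactly one space; the variable name wide is illustrative.

```python
import pandas

d = {"Line": ["foo", "bar", "baz"], "CUT START": ["a", "b", "c"],
     "CUT FINISH": ["x", "y", "z"],
     "END START": [1, 2, 3], "END FINISH": [4, 5, 6]}
wide = pandas.DataFrame(d).set_index("Line")

# Split "CUT START" into ("CUT", "START"); expand=True returns a MultiIndex
wide.columns = wide.columns.str.split(" ", expand=True)
print(wide)
```

Because the levels come from the existing names, the labels stay attached to the right data regardless of column order.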

Then you can stack.

df = df.stack(0)

         FINISH START
Line                 
foo  CUT      x     a
     END      4     1
bar  CUT      y     b
     END      5     2
baz  CUT      z     c
     END      6     3

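Now you only need to reorder the columns. Select them by label; assigning a list to df.columns would only rename the columns, leaving the FINISH values under the START label.

df = df[["START", "FINISH"]]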
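
Put together, the whole reshape can be sketched as below, using the same sample data. The variable name long and the index name "Activity" are illustrative assumptions. Recent pandas versions may emit a FutureWarning for stack, since its implementation is changing; the label-based lookups at the end do not depend on row or column order.

```python
import pandas

d = {"Line": ["foo", "bar", "baz"], "CUT START": ["a", "b", "c"],
     "CUT FINISH": ["x", "y", "z"],
     "END START": [1, 2, 3], "END FINISH": [4, 5, 6]}
df = pandas.DataFrame(d).set_index("Line")
df.columns = pandas.MultiIndex.from_product([["CUT", "END"], ["START", "FINISH"]])

# Move the activity level (level 0) from the columns into the row index
long = df.stack(0)

# Select the columns by label to put START first; values stay under their own label
long = long[["START", "FINISH"]]
long.index.names = ["Line", "Activity"]  # "Activity" is an assumed name
print(long)

# Spot-check one row per activity
print(long.loc[("foo", "CUT"), "START"], long.loc[("baz", "END"), "FINISH"])
```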
