I would like to create a new column that holds a set of values for each row of a specific column, breaking every first-level entry down into sub-entries. For example, starting from:

Level
1
2
3

I would like to get:

Level  Breakdown
1      a
       b
       c
       d
2      a
       b
       c
       d
3      a
       b
       c
       d
Any hints on how to code this breakdown on a pandas DataFrame?
I understand that a MultiIndex needs an array that matches the 'Breakdown' level. But the DataFrame has more than 10 thousand records, so how can I build the tuples for that many rows?
In fact, I have a raw dataset that I need to rearrange into a schedule-like format. Here is a small sample: [image]
I would like to rearrange the data into this format: [image]
You can achieve this with a MultiIndex.
But you will need an extra index level for a, b, c, d. Both arrays must have the same length, because zip silently drops the unmatched tail of the longer one.
import numpy
import pandas

arrays = [["1", "1", "1", "2", "2", "2", "3", "3", "3"],
          ["a", "b", "c", "a", "b", "c", "a", "b", "c"]]
tuples = list(zip(*arrays))
index = pandas.MultiIndex.from_tuples(tuples, names=['Levels', 'Breakdown'])
# Random placeholder values; replace them with your own data
s = pandas.Series(numpy.random.randn(len(index)), index=index)
With random values, as in the pandas documentation, the result would look like this (the numbers differ on every run):
Levels  Breakdown
1       a           -0.985654
        b            0.782516
        c           -0.896590
2       a            0.841488
        b           -0.577790
        c           -1.130534
3       a            0.587779
        b           -0.935374
        c            1.658043
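For more than 10 thousand rows, writing the tuples by hand is not practical. A minimal sketch, assuming every level gets the same breakdown labels: `pandas.MultiIndex.from_product` builds all combinations for you. The `df` and `breakdown` values below are made-up stand-ins for the real data.

```python
import pandas as pd

# Hypothetical input: one row per level (the real data has 10,000+ rows).
df = pd.DataFrame({"Level": [1, 2, 3]})
# Assumed: the same breakdown labels apply to every level.
breakdown = ["a", "b", "c", "d"]

# Cartesian product of the levels and the breakdown labels,
# so no tuples have to be written out by hand.
index = pd.MultiIndex.from_product(
    [df["Level"].unique(), breakdown], names=["Level", "Breakdown"]
)

# Turn the index into an ordinary two-column frame; data columns
# can then be added or merged in as needed.
result = index.to_frame(index=False)
print(result)
```

If the breakdown differs from level to level, this shortcut does not apply and the tuples have to be built per level instead.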
EDIT:
Since you edited your question, I have come up with a solution. For details, check out this question and the pandas documentation on pandas.DataFrame.stack.
Since you posted pictures instead of copying the data, I didn't use your values. My sample data looks like this:
import pandas

d = {"Line": ["foo", "bar", "baz"], "CUT START": ["a", "b", "c"],
     "CUT FINISH": ["x", "y", "z"],
     "END START": [1, 2, 3], "END FINISH": [4, 5, 6]}
df = pandas.DataFrame(d)
  Line CUT START CUT FINISH  END START  END FINISH
0  foo         a          x          1           4
1  bar         b          y          2           5
2  baz         c          z          3           6
I transformed it like this:
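# Set "Line" as the index
df = df.set_index("Line")
activities = ["CUT", "END"]  # Add the rest of your activities here
status = ["START", "FINISH"]
# The product order (CUT START, CUT FINISH, END START, END FINISH)
# must match the existing column order
df.columns = pandas.MultiIndex.from_product([activities, status])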
This returns:
       CUT          END
     START FINISH START FINISH
Line
foo      a      x     1      4
bar      b      y     2      5
baz      c      z     3      6
Then you can stack the first column level (the activities) into the row index:
df = df.stack(0)
         FINISH START
Line
foo  CUT      x     a
     END      4     1
bar  CUT      y     b
     END      5     2
baz  CUT      z     c
     END      6     3
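Now you only need to reorder the columns. Select them in the order you want; assigning a new list to df.columns would only rename them and leave the values under the wrong labels.
df = df[["START", "FINISH"]]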
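With many activities, listing them by hand for `from_product` gets tedious and depends on the column order. As a sketch, assuming every column name has the form "ACTIVITY STATUS" separated by a single space, the two-level columns can be derived from the names themselves. The index name "Activity" is only an illustrative choice.

```python
import pandas as pd

d = {"Line": ["foo", "bar", "baz"],
     "CUT START": ["a", "b", "c"], "CUT FINISH": ["x", "y", "z"],
     "END START": [1, 2, 3], "END FINISH": [4, 5, 6]}
df = pd.DataFrame(d).set_index("Line")

# Split "CUT START" into ("CUT", "START"); on an Index,
# expand=True returns a MultiIndex.
df.columns = df.columns.str.split(" ", n=1, expand=True)

# Move the activity level (level 0) from the columns into the row index,
# then select the status columns in the desired order.
out = df.stack(0)[["START", "FINISH"]]
out.index.names = ["Line", "Activity"]  # "Activity" is an assumed label
print(out)
```

Selecting the columns by name keeps the result the same regardless of the order in which `stack` returns them. Some pandas 2.x releases emit a FutureWarning about the stack implementation here.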