I have the following dataframe:
ID,SomeValue,FooA1,FooA2,FooA3,FooB1,FooB2,FooB3,BarA1,BarA2,BarA3,BarB1,BarB2,BarB3
1 ,val1 ,4 ,7 ,2 ,8 ,1 ,3 ,2 ,9 ,2 ,0 ,9 ,2
2 ,val2 ,2 ,3 ,8 , , , ,1 ,5 ,3 , , ,
.
.
And I would like to merge the columns matching "(Foo|Bar)(A|B)\d+" into the following long format, i.e. each combination of Foo/Bar, A/B and number becomes its own row, with new columns holding those three variables:
ID,SomeValue,FooBar ,AB ,Num ,Val
1 ,val1 ,Foo ,A ,1 ,4
1 ,val1 ,Foo ,A ,2 ,7
1 ,val1 ,Foo ,A ,3 ,2
1 ,val1 ,Foo ,B ,1 ,8
1 ,val1 ,Foo ,B ,2 ,1
1 ,val1 ,Foo ,B ,3 ,3
1 ,val1 ,Bar ,A ,1 ,2
1 ,val1 ,Bar ,A ,2 ,9
1 ,val1 ,Bar ,A ,3 ,2
1 ,val1 ,Bar ,B ,1 ,0
1 ,val1 ,Bar ,B ,2 ,9
1 ,val1 ,Bar ,B ,3 ,2
2 ,val2 ,Foo ,A ,1 ,2
2 ,val2 ,Foo ,A ,2 ,3
2 ,val2 ,Foo ,A ,3 ,8
2 ,val2 ,Bar ,A ,1 ,1
2 ,val2 ,Bar ,A ,2 ,5
2 ,val2 ,Bar ,A ,3 ,3
Note that there can be empty values, as in row 2 above, and those should not be included in the final result.
This must be fairly simple to do, but I'm new to pandas and am struggling to find the right commands to use.
Thanks in advance for your help.
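You can use:
DataFrame.set_index with stack to reshape, then reset_index to turn the last index level into a column
DataFrame.pop to take that column out, with str.extract to parse it by regex
DataFrame.reindex to change the column order (the older reindex_axis is deprecated and no longer available in recent pandas versions)
# Move the Foo*/Bar* columns into rows; with the default stack() behaviour used here, empty cells are dropped.
# If your pandas version keeps them, append .dropna() before reset_index.
df = df.set_index(['ID','SomeValue']).stack().reset_index(name='Val')
# Split names such as 'FooA1' into three new columns; the raw string keeps \d intact.
df[['FooBar','AB','Num']] = df.pop('level_2').str.extract(r'(Foo|Bar)(A|B)(\d+)', expand=True)
# Put the columns in the requested order.
cols = ['ID', 'SomeValue', 'FooBar', 'AB', 'Num','Val']
df = df.reindex(columns=cols)
print(df)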
ID SomeValue FooBar AB Num Val
0 1 val1 Foo A 1 4.0
1 1 val1 Foo A 2 7.0
2 1 val1 Foo A 3 2.0
3 1 val1 Foo B 1 8.0
4 1 val1 Foo B 2 1.0
5 1 val1 Foo B 3 3.0
6 1 val1 Bar A 1 2.0
7 1 val1 Bar A 2 9.0
8 1 val1 Bar A 3 2.0
9 1 val1 Bar B 1 0.0
10 1 val1 Bar B 2 9.0
11 1 val1 Bar B 3 2.0
12 2 val2 Foo A 1 2.0
13 2 val2 Foo A 2 3.0
14 2 val2 Foo A 3 8.0
15 2 val2 Bar A 1 1.0
16 2 val2 Bar A 2 5.0
17 2 val2 Bar A 3 3.0
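For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same reshape. Note that it is an alternative, not the method shown above: it uses DataFrame.melt with an explicit dropna instead of stack, so the empty cells are removed regardless of how stack treats missing values. The helper name to_long and the temporary column name key are made up for illustration, and the sample data is simply the two rows from the question.

```python
import io

import pandas as pd

# Sample data copied from the question (two rows; empty cells become NaN).
CSV = """ID,SomeValue,FooA1,FooA2,FooA3,FooB1,FooB2,FooB3,BarA1,BarA2,BarA3,BarB1,BarB2,BarB3
1,val1,4,7,2,8,1,3,2,9,2,0,9,2
2,val2,2,3,8,,,,1,5,3,,,
"""


def to_long(wide: pd.DataFrame) -> pd.DataFrame:
    """Reshape the Foo/Bar columns into one row per value (hypothetical helper)."""
    # One row per (ID, SomeValue, original column name) combination.
    long = wide.melt(id_vars=["ID", "SomeValue"], var_name="key", value_name="Val")
    # Drop the empty cells explicitly.
    long = long.dropna(subset=["Val"])
    # Split names such as "FooA1" into their three parts.
    parts = long["key"].str.extract(r"(Foo|Bar)(A|B)(\d+)", expand=True)
    parts.columns = ["FooBar", "AB", "Num"]
    long = pd.concat([long.drop(columns="key"), parts], axis=1)
    # melt() goes column by column, so use a stable sort to group rows by ID
    # again while keeping the original column order within each ID.
    long = long.sort_values("ID", kind="mergesort").reset_index(drop=True)
    return long[["ID", "SomeValue", "FooBar", "AB", "Num", "Val"]]


wide = pd.read_csv(io.StringIO(CSV))
result = to_long(wide)
print(result)
```

Num comes out of str.extract as text; if you need numbers, something like result["Num"].astype(int) should do it.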