Merge multiple columns in pandas but move old column headers to align with the data

Question

I am using Pandas on csv files and am trying to find code to get a large data set where we are describing the temperatures on each floor and trying to distinguish our locations and floor numbers. What I have now is

Building	Floor 1 Temp	Floor 3 Temp	Floor 5 Temp
G	60.370	70.869	56.784
H	43.57	54	65.5

to:

Building	Floor	Temp
G	1	60.370
G	3	70.869
G	5	56.784
H	1	43.57
H	3	54.0
H	5	65.5

I am good on making my csv into a df but can't figure out how to merge the temperature column while keeping the data organized. Thank you in advance for your help.

Answer 1

You can use melt , and then only keep numerical values for the Floor column by using .str.replace('\D+', '', regex=True) :

df = df.melt(id_vars='Building', var_name='Floor', value_name='Temp').sort_values('Building')
df['Floor'] = df['Floor'].str.replace('\D+', '', regex=True)
df
Out[1]: 
  Building Floor    Temp
0        G     1  60.370
2        G     3  70.869
4        G     5  56.784
1        H     1  43.570
3        H     3  54.000
5        H     5  65.500

Answer 2

You can use .stack() to maintain the order of Building without sorting, as follows:

df1 = df.set_index('Building').stack().rename_axis(['Building', 'Floor']).reset_index(name='Temp')
df1['Floor'] = df1['Floor'].str.split(' ').str[1]

print(df1)



  Building Floor    Temp
0        G     1   60.37
1        G     3  70.869
2        G     5  56.784
3        H     1   43.57
4        H     3      54
5        H     5    65.5

Answer 3

You can use the pivot_longer function from pyjanitor to reshape the data:

df.pivot_longer(index='Building', 
                names_to = ('Floor', '.value'), 
                names_pattern = r".+(\d)\s(.+)",
                sort_by_appearance = True)
 
  Building Floor    Temp
0        G     1  60.370
1        G     3  70.869
2        G     5  56.784
3        H     1  43.570
4        H     3  54.000
5        H     5  65.500

The .value pairs with Temp and keeps that as a column name, while any data not associated with .value is passed into the Floor column. The data association is done via the names_pattern argument, which is a regular expression.

You could achieve the same thing with pd.wide_to_long , with a bit of massaging:

step1 = [ent.split() for ent in df]
step2 = [ent[::-1][:-1] if 'Floor' in ent else ent for ent in step1]
step3 = map("_".join, step2)
df.columns = [*step3]

df
 
  Building  Temp_1  Temp_3  Temp_5
0        G   60.37  70.869  56.784
1        H   43.57  54.000  65.500

pd.wide_to_long(df, 
                stubnames='Temp', 
                i='Building', 
                j='Floor', 
                sep="_").reset_index()
 
  Building  Floor    Temp
0        G      1  60.370
1        H      1  43.570
2        G      3  70.869
3        H      3  54.000
4        G      5  56.784
5        H      5  65.500

Merge multiple columns in pandas but move old column headers to align with the data

Question

3 answers

solution1
1 ACCPTED 2021-04-09 18:47:08

solution2
1 2021-04-09 18:51:10

solution3
1 2021-04-09 21:20:52

Merge multiple columns in pandas but move old column headers to align with the data

Question

3 answers

solution1 1 ACCPTED 2021-04-09 18:47:08

solution2 1 2021-04-09 18:51:10

solution3 1 2021-04-09 21:20:52

solution1
1 ACCPTED 2021-04-09 18:47:08

solution2
1 2021-04-09 18:51:10

solution3
1 2021-04-09 21:20:52