简体   繁体   中英

Rearrange Pandas Dataframe using Pandas.Melt to take multiple columns at once?

I have a pandas Dataframe set out as follows. There are in fact 192 X,Y,Z triplet columns, this is just the first three.

    shot    V0e         V0n        V0d   S0_Pe     S0_Pn      S0_Pd  S0_Se     S0_Sn      S0_Sd
0   1001    457950.4    7331695.2   2.5  458004.5  7331794.1  2.2    457950.4  7331695.2  2.1
1   1002    457948.0    7331689.4   2.3  457999.5  7331782.5  2.3    457993.6  7331792.8  2.3
2   1003    457945.6    7331683.5   2.4  457999.5  7331782.5  2.4    457945.6  7331683.5  2.6
3   1004    457943.3    7331677.8   2.3  457995.4  7331770.8  2.3    457988.8  7331781.2  2.5
4   1005    457940.9    7331672.1   2.2  457995.4  7331770.8  2.6    457948.0  7331689.4  2.4

What I want to do is rearrange them to look as follows so I can animate them in a Plotly animation. To do this in need the data in Long format.

Shot Easting    Northing    Depth
1001 457950.4   7331695.2   2.5     #V0e,V0n,V0d
1001 458004.5   7331794.1   2.2     #S0_Pe,S0_Pn,S0_Pd
1001 457950.4   7331695.2   2.1     #S0_Se,S0_Sn,S0_Sd
1002 457948.0   7331689.4   2.3     #V0e,V0n,V0d
1002 457999.5   7331782.5   2.3     #S0_Pe,S0_Pn,S0_Pd
1002 457993.6   7331792.8   2.3     #S0_Se,S0_Sn,S0_Sd
1003 457945.6   7331683.5   2.4     #V0e,V0n,V0d
1003 457999.5   7331782.5   2.4     #S0_Pe,S0_Pn,S0_Pd
1003 457945.6   7331683.5   2.6     #S0_Se,S0_Sn,S0_Sd
1004 457943.3   7331677.8   2.3     #V0e,V0n,V0d
1004 457995.4   7331770.8   2.3     #S0_Pe,S0_Pn,S0_Pd
1004 457995.4   7331770.8   2.3     #S0_Se,S0_Sn,S0_Sd
1005 457943.3   7331677.8   2.3     #V0e,V0n,V0d
1005 457995.4   7331770.8   2.3     #S0_Pe,S0_Pn,S0_Pd
1005 457948.0   7331689.4   2.4     #S0_Se,S0_Sn,S0_Sd

This is also acceptable.

Shot Easting    Northing    Depth
1001 457950.4   7331695.2   2.5     #V0e,V0n,V0d
1002 457948.0   7331689.4   2.3     #V0e,V0n,V0d
1003 457945.6   7331683.5   2.4     #V0e,V0n,V0d
1004 457943.3   7331677.8   2.3     #V0e,V0n,V0d
1005 457943.3   7331677.8   2.3     #V0e,V0n,V0d
1001 458004.5   7331794.1   2.2     #S0_Pe,S0_Pn,S0_Pd
1002 457999.5   7331782.5   2.3     #S0_Pe,S0_Pn,S0_Pd 
1003 457999.5   7331782.5   2.4     #S0_Pe,S0_Pn,S0_Pd
1004 457995.4   7331770.8   2.3     #S0_Pe,S0_Pn,S0_Pd
1005 457995.4   7331770.8   2.3     #S0_Pe,S0_Pn,S0_Pd
1001 457950.4   7331695.2   2.1     #S0_Se,S0_Sn,S0_Sd 
1002 457993.6   7331792.8   2.3     #S0_Se,S0_Sn,S0_Sd
1003 457945.6   7331683.5   2.6     #S0_Se,S0_Sn,S0_Sd
1004 457995.4   7331770.8   2.3     #S0_Se,S0_Sn,S0_Sd
1005 457948.0   7331689.4   2.4     #S0_Se,S0_Sn,S0_Sd

I have looked at pandas.melt.

dfMELT=pd.melt(df,id_vars=['shot'],value_vars=["V0e","V0n","V0d","S0_Pe","S0_Pn","S0_Pd","S0_Se","S0_Sn","S0_Sd"]) 

but it doesn't rearrange as above, it only works with 3 columns at a time so my X is dealt with, then my Y then my Z. This clearly won't work for plotting coordinates in a Scatter plot.

    shot    variable    value
0   1001    V0e         457950.4
1   1002    V0e         457948.0
2   1003    V0e         457945.6
3   1004    V0e         457943.3
....
    shot    variable    value
779 1780    V0e         456009.1
780 1001    V0n         7331695.2
781 1002    V0n         7331689.4

You can use pivot_longer from pyjanitor to abstract the process. Your columns have a pattern (some end with e , some with d , some with n ). Let's pass a list of regular expressions that capture this pattern:

# pip install janitor
import janitor 
import pandas as pd
df.pivot_longer(index='shot', 
                names_to = ("Easting", "Northing", "Depth"), 
                names_pattern = (r".+e$", r".+n$", r".+d$"))

    shot   Easting   Northing  Depth
0   1001  457950.4  7331695.2    2.5
1   1002  457948.0  7331689.4    2.3
2   1003  457945.6  7331683.5    2.4
3   1004  457943.3  7331677.8    2.3
4   1005  457940.9  7331672.1    2.2
5   1001  458004.5  7331794.1    2.2
6   1002  457999.5  7331782.5    2.3
7   1003  457999.5  7331782.5    2.4
8   1004  457995.4  7331770.8    2.3
9   1005  457995.4  7331770.8    2.6
10  1001  457950.4  7331695.2    2.1
11  1002  457993.6  7331792.8    2.3
12  1003  457945.6  7331683.5    2.6
13  1004  457988.8  7331781.2    2.5
14  1005  457948.0  7331689.4    2.4

And if you want it in the order of appearance:

df.pivot_longer(index='shot', 
                names_to = ("Easting", "Northing", "Depth"), 
               names_pattern = (r".+e$", r".+n$", r".+d$"), 
               sort_by_appearance = True)
 
    shot   Easting   Northing  Depth
0   1001  457950.4  7331695.2    2.5
1   1001  458004.5  7331794.1    2.2
2   1001  457950.4  7331695.2    2.1
3   1002  457948.0  7331689.4    2.3
4   1002  457999.5  7331782.5    2.3
5   1002  457993.6  7331792.8    2.3
6   1003  457945.6  7331683.5    2.4
7   1003  457999.5  7331782.5    2.4
8   1003  457945.6  7331683.5    2.6
9   1004  457943.3  7331677.8    2.3
10  1004  457995.4  7331770.8    2.3
11  1004  457988.8  7331781.2    2.5
12  1005  457940.9  7331672.1    2.2
13  1005  457995.4  7331770.8    2.6
14  1005  457948.0  7331689.4    2.4

You could also use the .value approach - any values associated with .value stays as a header:

(df.pivot_longer(index = 'shot', 
                 names_to = ".value", 
                 names_pattern = r".+(.)$")
   .rename(columns = {"e" : "Easting", 
                      "n" : "Northing", 
                      "d" : "Depth"}
          )
  )
 
    shot   Easting   Northing  Depth
0   1001  457950.4  7331695.2    2.5
1   1002  457948.0  7331689.4    2.3
2   1003  457945.6  7331683.5    2.4
3   1004  457943.3  7331677.8    2.3
4   1005  457940.9  7331672.1    2.2
5   1001  458004.5  7331794.1    2.2
6   1002  457999.5  7331782.5    2.3
7   1003  457999.5  7331782.5    2.4
8   1004  457995.4  7331770.8    2.3
9   1005  457995.4  7331770.8    2.6
10  1001  457950.4  7331695.2    2.1
11  1002  457993.6  7331792.8    2.3
12  1003  457945.6  7331683.5    2.6
13  1004  457988.8  7331781.2    2.5
14  1005  457948.0  7331689.4    2.4

pivot_longer aims to make the reshaping process easier; however, you may not be interested in importing/installing another library - let's see one way we can resolve this using pandas' wide_to_long :

First, lets rename the columns:

res = df.copy()
res = res.set_index('shot')
res = res.rename(columns = lambda col:  f"Easting_{col[:-1]}" 
                                        if col.endswith("e") 
                                        else f"Northing_{col[:-1]}" 
                                        if col.endswith("n") 
                                        else f"Depth_{col[:-1]}")

Now we can reshape:

(pd.wide_to_long(res.reset_index(), 
                 i = 'shot', 
                 stubnames = ['Easting', 'Northing', 'Depth'], 
                 j = 'wateva', 
                 sep = "_", 
                 suffix = ".+")
  .droplevel('wateva')
  .reset_index()
 )

    shot   Easting   Northing  Depth
0   1001  457950.4  7331695.2    2.5
1   1002  457948.0  7331689.4    2.3
2   1003  457945.6  7331683.5    2.4
3   1004  457943.3  7331677.8    2.3
4   1005  457940.9  7331672.1    2.2
5   1001  458004.5  7331794.1    2.2
6   1002  457999.5  7331782.5    2.3
7   1003  457999.5  7331782.5    2.4
8   1004  457995.4  7331770.8    2.3
9   1005  457995.4  7331770.8    2.6
10  1001  457950.4  7331695.2    2.1
11  1002  457993.6  7331792.8    2.3
12  1003  457945.6  7331683.5    2.6
13  1004  457988.8  7331781.2    2.5
14  1005  457948.0  7331689.4    2.4

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM