How to sort a subset of rows in a Pandas data frame

I have the following data frame:

import pandas as pd
df = pd.DataFrame({'FavCol': ['Fixy', 'Macky', 'querk', 'alber'],
                   'sample1': [20.3, 25.3, 3.1, 3],
                   'sample2': [130, 150, 173, 4],
                   'sample3': [1.0, 2.0, 12.0, 4],
                   })

Which looks like this:

In [12]: df
Out[12]:
  FavCol  sample1  sample2  sample3
0   Fixy     20.3      130        1
1  Macky     25.3      150        2
2  querk      3.1      173       12
3  alber      3.0        4        4

What I want to do is sort the data frame by FavCol (case-insensitively) while keeping the first row, Fixy, in place, resulting in this:

  FavCol  sample1  sample2  sample3
    Fixy     20.3      130        1
   alber      3.0        4        4
   Macky     25.3      150        2
   querk      3.1      173       12

How can I achieve that?

Update

I have a problem reproducing the answer from [user:John Galt] with this data:

Group No.  Abbr. of test substance  Route  Time (hrs)  Dose (/body)  Conc.       Volume of dosage (/body)  # of mouse
1          PBS DMSO5%               i.d.   6           0 mg          0 mg/ mL    0.1 mL                    3
2          MPLA                     i.d.   6           0.01 mg       0.1 mg/ mL  0.1 mL                    3
3          MALP2s                   i.d.   6           0.01 mg       0.1 mg/ mL  0.1 mL                    3
4          R848                     i.d.   6           0.1 mg        1 mg/ mL    0.1 mL                    3
5          DMXAA                    i.d.   6           0.1 mg        1 mg/ mL    0.1 mL                    3

And this code:

import pandas as pd
df = pd.read_table("http://dpaste.com/0JPC984.txt")
colnames = df.columns.values.tolist()
print colnames
fixed_rown = colnames[1]
df['lower'] = df[fixed_rown].str.lower()
df.loc[1:] = df[1:].sort('lower')
df

It produces this:

Out[35]:
   Group No. Abbr. of test substance Route  Time (hrs) Dose (/body)  \
0          1              PBS DMSO5%  i.d.           6         0 mg
1          2                    MPLA  i.d.           6      0.01 mg
2          3                  MALP2s  i.d.           6      0.01 mg
3          4                    R848  i.d.           6       0.1 mg
4          5                   DMXAA  i.d.           6       0.1 mg

        Conc. Volume of dosage (/body)  # of mouse       lower
0    0 mg/ mL                   0.1 mL           3  pbs dmso5%
1  0.1 mg/ mL                   0.1 mL           3        mpla
2  0.1 mg/ mL                   0.1 mL           3      malp2s
3    1 mg/ mL                   0.1 mL           3        r848
4    1 mg/ mL                   0.1 mL           3       dmxaa

In [45]: pd.__version__
Out[45]: '0.16.1'

The rows were not reordered: dmxaa did not come out right after the fixed pbs dmso5% row.

Case-insensitive sorting is a bit tricky, so you could first create a new lower column from FavCol:

In [83]: df['lower'] = df['FavCol'].str.lower()

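Then use .loc to assign the sorted rows back to every row after the first. The .values call matters: assigning a DataFrame through .loc aligns on the index, so without .values the sorted rows are put straight back into their original positions and nothing appears to change.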

In [84]: df.loc[1:] = df[1:].sort('lower').values

In [85]: df
Out[85]:
  FavCol  sample1  sample2  sample3  lower
0   Fixy     20.3      130        1   fixy
1  alber      3.0        4        4  alber
2  Macky     25.3      150        2  macky
3  querk      3.1      173       12  querk
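
A note on why .values appears in the assignment above. The following is a minimal sketch rather than part of the original answer; it assumes a recent pandas where sort_values replaces the older sort method, and the names sorted_tail and aligned are made up for illustration. It shows that the sorted rows keep their original index labels, so assigning them back as a DataFrame realigns them to where they started:

```python
import pandas as pd

df = pd.DataFrame({'FavCol': ['Fixy', 'Macky', 'querk', 'alber'],
                   'sample1': [20.3, 25.3, 3.1, 3.0]})
df['lower'] = df['FavCol'].str.lower()

# Sort every row after the first; the rows keep their original index labels.
sorted_tail = df[1:].sort_values('lower')
print(sorted_tail.index.tolist())

# Assigning a DataFrame through .loc aligns on those labels,
# so each row lands back in its original position.
aligned = df.copy()
aligned.loc[1:] = sorted_tail
print(aligned['FavCol'].tolist())
```

Passing a plain array (as .values does) drops the labels, which is what lets the rows be written back positionally in sorted order.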

You can drop the helper lower column afterwards if you don't need it, for example with df.drop('lower', axis=1).
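
On newer pandas versions the same result can be reached without a helper column. This is only a sketch under assumptions, not the answerer's method: it relies on the key argument of sort_values (available from pandas 1.1) and on pd.concat, and the names head, rest and result are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'FavCol': ['Fixy', 'Macky', 'querk', 'alber'],
                   'sample1': [20.3, 25.3, 3.1, 3],
                   'sample2': [130, 150, 173, 4],
                   'sample3': [1.0, 2.0, 12.0, 4]})

# Keep the first row as it is.
head = df.iloc[:1]

# Sort the remaining rows by FavCol, comparing lowercased values.
rest = df.iloc[1:].sort_values('FavCol', key=lambda s: s.str.lower())

# Stack the fixed row on top of the sorted rows and renumber the index.
result = pd.concat([head, rest], ignore_index=True)
print(result)
```

Because the pieces are concatenated rather than assigned back through .loc, index alignment never comes into play and the column dtypes are left untouched.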
