How to sort strings with numbers in Pandas?

I have a pandas DataFrame in which a column named status contains three kinds of values: ok, must read x more books, and does not read any book yet, where x is an integer greater than 0.

I want to sort status values according to the order above.

Example:

  name    status
0 Paul    ok
1 Jean    must read 1 more books
2 Robert  must read 2 more books
3 John    does not read any book yet

I've found some interesting hints involving pandas Categorical and map, but I don't know how to handle strings that contain a variable number.

How can I achieve that?

Use:

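# Extract the number from each status; rows without digits become NaN
a = df['status'].str.extract(r'(\d+)', expand=False).astype(float)

# Rank 'ok' above every number and the "no book yet" status below them
d = {'ok': a.max() + 1, 'does not read any book yet': -1}

# Sort descending by rank: negate, take argsort positions, select with iloc
df1 = df.iloc[(-df['status'].map(d).fillna(a)).argsort()]
print(df1)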
     name                      status
0    Paul                          ok
2  Robert      must read 2 more books
1    Jean      must read 1 more books
3    John  does not read any book yet

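Explanation:

  1. First extract the integers with the regex \d+
  2. Then dynamically build a dictionary that maps the non-numeric values to numbers
  3. Replace the remaining NaNs with the extracted numbers using fillna
  4. Get the positions with argsort on the negated values, so larger values come first
  5. Select the rows in sorted order with iloc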
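The steps above can be sketched with each intermediate value broken out. This is only an illustration: the frame is rebuilt from the example in the question, and the names rank and order are introduced here and are not part of the original answer.

```python
import pandas as pd

# Example data taken from the question
df = pd.DataFrame({
    'name': ['Paul', 'Jean', 'Robert', 'John'],
    'status': ['ok', 'must read 1 more books',
               'must read 2 more books', 'does not read any book yet'],
})

# Step 1: extract the digits; statuses without digits become NaN
a = df['status'].str.extract(r'(\d+)', expand=False).astype(float)

# Step 2: place 'ok' above the largest number, the "no book yet" status below 0
d = {'ok': a.max() + 1, 'does not read any book yet': -1}

# Step 3: map the fixed strings, then fill the unmapped rows from the numbers
rank = df['status'].map(d).fillna(a)

# Steps 4 and 5: positions that sort the negated rank, then select by position
order = (-rank).argsort().to_numpy()
df1 = df.iloc[order]
print(df1)
```

Because the rank is negated before argsort, the "must read" rows come out with the largest x first.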

You can use sorted with a custom key function to compute the indices that would sort the array (much like numpy.argsort), then pass them to pd.DataFrame.iloc:

import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Paul', 'Jean', 'Robert', 'John'],
                   'status': ['ok', 'must read 20 more books',
                              'must read 3 more books', 'does not read any book yet']})

def sort_key(x):
    # x is an (index, status) pair produced by enumerate
    if x[1] == 'ok':
        return -1
    elif x[1] == 'does not read any book yet':
        return np.inf
    else:
        # 'must read N more books' -> N
        return int(x[1].split()[2])

# Positions that would sort the statuses
idx = [i for i, _ in sorted(enumerate(df['status']), key=sort_key)]

df = df.iloc[idx, :]

print(df)

     name                      status
0    Paul                          ok
2  Robert      must read 3 more books
1    Jean     must read 20 more books
3    John  does not read any book yet
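A possible alternative, assuming pandas 1.1 or later, is to pass a vectorized key function to sort_values instead of building the index list by hand. The sketch below is not from the answers above; the helper name status_rank is hypothetical, and it ranks the statuses the same way as sort_key (ok first, then ascending x, then the "no book yet" status).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Paul', 'Jean', 'Robert', 'John'],
                   'status': ['ok', 'must read 20 more books',
                              'must read 3 more books', 'does not read any book yet']})

def status_rank(status):
    # Hypothetical helper: turn a Series of status strings into numeric ranks
    nums = status.str.extract(r'(\d+)', expand=False).astype(float)
    fixed = status.map({'ok': -1.0, 'does not read any book yet': np.inf})
    # Fixed strings keep their mapped rank; the rest use the extracted number
    return fixed.fillna(nums)

# The key function receives the whole column and must return a same-length Series
result = df.sort_values('status', key=status_rank)
print(result)
```

The key is applied only for ordering, so the status column itself is left unchanged in the result.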

