How to split a column using the value of the first row?

Question

My dataset df looks like this:

time            Open
2017-01-01      2.2475
2017-01-02      3.2180
2017-01-03      5.2128
2017-01-04      1.2128
2017-01-05      2.2128
...., ....
2017-12-31      6.7388

I want to sort the Open column, but relative to the value in the first row, in increasing order.

The first row's value always stays at the top (row 1). Starting from the second row, the remaining values are sorted in increasing order relative to the first row's value. All values lower than the first row are kept at the bottom, e.g. 1.2128.

[OP seeks a method where values greater than the first row of a selected column appear in ascending order from row 2 to row n, and values less than the first row come after row n (i.e. after all of the preceding values).]

For example, the new df would be:

time            Open
2017-01-01      2.2475
2017-01-05      2.2128
2017-01-02      3.2180
2017-01-03      5.2128
...., ....
2017-12-31      6.7388
2017-01-04      1.2128

What did I do?

I can sort by column like this:

df.sort_values(by='Open', ascending=False)

but that sorts by column. How do I sort relative to the first row's value, which is 2.2475?

Answer 1

IIUC, given a df:

         time    Open
0  2017-01-01  2.2475
1  2017-01-02  3.2180
2  2017-01-03  5.2128
3  2017-01-04  1.2128
4  2017-01-05  2.2128
5  2017-12-31  6.7388

OP wants the order row_0, (rows greater than row_0), (rows smaller than row_0). This can be achieved using the difference between each row and row_0:

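# Difference from the first row's Open value, keyed by index label (assumes the default 0..n-1 index)
s = df['Open'].sub(df['Open'][0]).to_dict()
# Stable sort: non-negative differences (key False) first, negative ones (key True) last
df.iloc[sorted(s, key=lambda x: s.get(x) < 0)]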

Output:

         time    Open
0  2017-01-01  2.2475
1  2017-01-02  3.2180
2  2017-01-03  5.2128
5  2017-12-31  6.7388
3  2017-01-04  1.2128
4  2017-01-05  2.2128
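Note that the stable sort above only partitions the rows: within each group the original order is kept, which happens to be ascending for the "greater" group in this sample. If ascending order inside each group is also required, one possible sketch is to sort the rows after the first on a temporary boolean flag plus the value itself. The helper column name `below` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'time': ['2017-01-01', '2017-01-02', '2017-01-03',
             '2017-01-04', '2017-01-05', '2017-12-31'],
    'Open': [2.2475, 3.2180, 5.2128, 1.2128, 2.2128, 6.7388],
})

first = df['Open'].iloc[0]   # baseline value taken from the first row
rest = df.iloc[1:]           # every row except the pinned first row

# 'below' (hypothetical helper column) is False for values >= baseline and
# True for values < baseline; False sorts before True, so the "greater"
# group comes first, and each group is ascending by Open.
ordered_rest = (
    rest.assign(below=rest['Open'] < first)
        .sort_values(by=['below', 'Open'])
        .drop(columns='below')
)

# Put the pinned first row back on top.
result = pd.concat([df.iloc[[0]], ordered_rest])
print(result)
```

On this sample the row order is the same as in the output above (index 0, 1, 2, 5, 3, 4); the difference would only show up when the rows are not already ascending within a group.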

Answer 2

OP is after a method where the first row of a DataFrame column is used as a baseline for a split sort of that column: values greater than the first row should appear in ascending order from row 2 to row n, and values less than the first row should come after row n (i.e. after all of the preceding values).

This can be achieved with the following function:

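import pandas as pd

df = pd.DataFrame({'time': ['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06'], 
              'Open': [2.24, 1.21, 1.51, 3.21, 5.21, 6.21]})

def pin_row_and_sort(f):
    # Rows at or above the first row's Open value, ascending; the first row
    # itself is the smallest of these, so it ends up on top
    values_above = f.loc[f['Open'] >= f['Open'].iloc[0]].sort_values(by='Open')
    # Rows below the first row's Open value, ascending
    values_below = f.loc[f['Open'] < f['Open'].iloc[0]].sort_values(by='Open')
    # "Above" group first, then the "below" group
    return pd.concat([values_above, values_below])

new_frame = pin_row_and_sort(df)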

I'd be keen to see any improvements/suggestions on this method. Or just down-vote without explaining why :)
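As one possible refinement of the function above, the first row can be set aside explicitly before sorting, so that it stays on top even if another row has the same Open value, and the column name can be passed as a parameter. This is only a sketch: the name `pin_first_and_split_sort` and its `column` parameter are illustrative and do not come from the original answer.

```python
import pandas as pd

def pin_first_and_split_sort(frame, column):
    """Keep the first row on top, then rows >= its value (ascending),
    then rows < its value (ascending)."""
    first_row = frame.iloc[[0]]          # pinned row, kept as a DataFrame
    rest = frame.iloc[1:]                # rows to be reordered
    pivot = first_row[column].iloc[0]    # baseline value from the first row

    above = rest.loc[rest[column] >= pivot].sort_values(by=column)
    below = rest.loc[rest[column] < pivot].sort_values(by=column)
    return pd.concat([first_row, above, below])

df = pd.DataFrame({
    'time': ['2017-01-01', '2017-01-02', '2017-01-03',
             '2017-01-04', '2017-01-05', '2017-01-06'],
    'Open': [2.24, 1.21, 1.51, 3.21, 5.21, 6.21],
})

new_frame = pin_first_and_split_sort(df, 'Open')
print(new_frame)
```

With this sample data the Open column comes out as 2.24, 3.21, 5.21, 6.21, 1.21, 1.51, and the original index labels are preserved, so the reordering can be traced back to the input rows.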

2 solutions

Solution 1
1 2019-07-24 06:13:00

Solution 2
0 2019-07-25 00:02:59
