
Order values horizontally within a group. Pandas

A couple of weeks ago I asked a question about masking NaNs in a row and leaving only the existing values: "Get the column index as a value when value exists".

I got an excellent solution from @jezrael (you can find it at the link above).

In addition to the original task, I need to order the values (ascending) based on their weights, which are stored in a separate table.

I will reformulate the task from the beginning.

I have 2 tables:

[image: enter image description here]

and

[image: enter image description here]

I need to get a final table like the one from the previous solution, but with the values ordered by their weights, like this:

[image: enter image description here]

Is it possible to incorporate the reordering into the existing code, or should I reorder afterwards? How can I do it based on the separate weights table?

Thank you for any help!

Please provide your data in a form that can be copied (as below), never as pictures.

import pandas as pd

# Indicator table: after the transpose, rows are the labels and columns are 1..16.
df = pd.DataFrame({
    '1a': [1] * 4 + [None] * 12,
    '3f': [None] * 5 + [1] * 2 + [None] * 9,
    '5y': [None] * 11 + [1] * 3 + [None] * 2,
    't6': [None] * 7 + [1, 1] + [None] * 7,
    '7j': [None] * 14 + [1, 1]},
    index=range(1, 17)).T
# One weight per column label (1..16).
weights = pd.Series([.5, .4, .34, .54, .12, .45, .18, .45, .34, .19, .2, .18, .12, .56, .78, .98],
                    index=range(1, 17))

This solution multiplies the dataframe (boolean indicators) by the weights, then uses a list comprehension to sort each row of the result (after first dropping nulls) and take the index. A DataFrame is created from the result.

df2 = pd.DataFrame(
        [row.dropna().sort_values().index.tolist() 
         for _, row in df.mul(weights).iterrows()], 
        index=df.index)
df2.columns = ['c{}'.format(n + 1) for n in range(df2.shape[1])]
>>> df2
    c1  c2  c3  c4
1a   3   2   1   4
3f   7   6 NaN NaN
5y  13  12  14 NaN
7j  15  16 NaN NaN
t6   9   8 NaN NaN
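Because the rows hold different numbers of values, the shorter ones are padded with NaN, and recent pandas versions will typically show such columns as floats (for example 1.0 instead of 1). If integer column labels are preferred, one option is pandas' nullable integer dtype. The following is only a sketch: it rebuilds a small part of the result by hand, and the name df2_int is illustrative rather than part of the original answer.

```python
import pandas as pd

# Hand-built subset of the result above; ragged rows are padded with NaN.
df2 = pd.DataFrame([[3, 2, 1, 4], [7, 6], [13, 12, 14]],
                   index=['1a', '3f', '5y'])
df2.columns = ['c{}'.format(n + 1) for n in range(df2.shape[1])]

# Convert to the nullable integer dtype so missing cells become <NA>
# and the present values stay integers.
df2_int = df2.astype('Int64')
print(df2_int)
```

This does not change the ordering logic; it only affects how the padded cells are stored and displayed.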

df.mul(weights) results in the following dataframe:

    1    2     3     4   5     6     7     8     9   10  11    12    13    14    15    16 
1a  0.5  0.4  0.34  0.54 NaN   NaN   NaN   NaN   NaN NaN NaN   NaN   NaN   NaN   NaN   NaN 
3f  NaN  NaN   NaN   NaN NaN  0.45  0.18   NaN   NaN NaN NaN   NaN   NaN   NaN   NaN   NaN
5y  NaN  NaN   NaN   NaN NaN   NaN   NaN   NaN   NaN NaN NaN  0.18  0.12  0.56   NaN   NaN
7j  NaN  NaN   NaN   NaN NaN   NaN   NaN   NaN   NaN NaN NaN   NaN   NaN   NaN  0.78  0.98
t6  NaN  NaN   NaN   NaN NaN   NaN   NaN  0.45  0.34 NaN NaN   NaN   NaN   NaN   NaN   NaN

I then iterate over the rows with iterrows, drop the NaNs from each row, sort the remaining values, and take the sorted index.
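To make the per-row step concrete, here is a small sketch for a single row. It uses the values of row 3f restricted to columns 5 to 7 from the example data; the variable names row, weighted and ordered are illustrative only.

```python
import pandas as pd

# Row '3f' restricted to columns 5..7: column 5 is missing, 6 and 7 are set.
row = pd.Series([None, 1, 1], index=[5, 6, 7], dtype='float64')
# Matching weights for columns 5..7, taken from the weights series above.
weights = pd.Series([.12, .45, .18], index=[5, 6, 7])

weighted = row.mul(weights)             # NaN stays NaN, 1 becomes the weight
ordered = (weighted.dropna()            # keep only the columns that exist in this row
                   .sort_values()       # ascending by weight
                   .index.tolist())     # column labels in weight order
print(ordered)  # [7, 6]
```

Column 7 comes first because its weight (0.18) is smaller than that of column 6 (0.45), which matches the 3f row in the result table.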
