I have a data frame of values varying over time. For example, the number of cars I observe on a street:
df = pd.DataFrame(
[{'Orange': 0, 'Green': 2, 'Blue': 1},
{'Orange': 2, 'Green': 4, 'Blue': 4},
{'Orange': 1, 'Green': 3, 'Blue': 10}
])
I want to create graphs that highlight the car colors with the highest values, so I sort the columns by their maximum value.
df = df.loc[:, df.max().sort_values(ascending=False).index]
Blue Green Orange
0 1 2 0
1 4 4 2
2 10 3 1
I'm using seaborn to create these graphs. From what I understand I need to melt this representation to a tidy format.
tidy = pd.melt(df.reset_index(), id_vars=['index'], var_name='color', value_name='number')
index color number
0 0 Blue 1
1 1 Blue 4
2 2 Blue 10
3 0 Green 2
4 1 Green 4
5 2 Green 3
6 0 Orange 0
7 1 Orange 2
8 2 Orange 1
How can I add a column that represents the column order before the data frame was melted?
index color number importance
0 0 Blue 1 0
1 1 Blue 4 0
2 2 Blue 10 0
3 0 Green 2 1
4 1 Green 4 1
5 2 Green 3 1
6 0 Orange 0 2
7 1 Orange 2 2
8 2 Orange 1 2
I see that I can still find the maximum columns after melting, but I'm not sure how to add that as a new column to the data frame:
tidy.groupby('color').number.max().sort_values(ascending=False).index
Index(['Blue', 'Green', 'Orange'], dtype='object', name='color')
EDIT: To clarify, I'm plotting this as a line graph.
g = sns.relplot(data=tidy, x='index', y='number', hue='color', kind="line")  # relplot returns a FacetGrid, not an Axes
This is what the graph currently looks like:
I want to use the importance data either to color or bold the lines, or to split the graph into multiple graphs, so that it looks something like these:
You can make a MultiIndex on the columns, then stack both levels.
# Map color to importance
d = (df.max().rank(method='dense', ascending=False)-1).astype(int)
df.columns = pd.MultiIndex.from_arrays([df.columns, df.columns.map(d)],
names=['color', 'importance'])
#color Orange Green Blue
#importance 2 1 0
#0 0 2 1
#1 2 4 4
#2 1 3 10
df = df.rename_axis(index='index').stack([0,1]).to_frame('value').reset_index()
index color importance value
0 0 Blue 0 1.0
1 0 Green 1 2.0
2 0 Orange 2 0.0
3 1 Blue 0 4.0
4 1 Green 1 4.0
5 1 Orange 2 2.0
6 2 Blue 0 10.0
7 2 Green 1 3.0
8 2 Orange 2 1.0
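To see what the color-to-importance mapping `d` contains on its own, here is a minimal sketch using the sample data from the question. The extra `Red` column is a hypothetical addition, included only to show that `method='dense'` gives tied maxima the same importance.

```python
import pandas as pd

df = pd.DataFrame([
    {'Orange': 0, 'Green': 2, 'Blue': 1},
    {'Orange': 2, 'Green': 4, 'Blue': 4},
    {'Orange': 1, 'Green': 3, 'Blue': 10},
])

# Dense rank of each column's maximum, largest first, shifted to start at 0.
d = (df.max().rank(method='dense', ascending=False) - 1).astype(int)
print(d.to_dict())  # {'Orange': 2, 'Green': 1, 'Blue': 0}

# Hypothetical extra column whose maximum (4) ties with Green's maximum.
tied = df.assign(Red=[4, 0, 0])
d_tied = (tied.max().rank(method='dense', ascending=False) - 1).astype(int)
print(d_tied.to_dict())  # {'Orange': 2, 'Green': 1, 'Blue': 0, 'Red': 1}
```

If tied colors should get distinct importance values instead, a different `method` (for example `'first'`) would be the thing to try.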
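Another option builds on the melt you already have and derives the importance column afterwards from each color's position in the column list (this assumes df has been reassigned to the sorted frame and still has its plain, single-level columns):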
tidy["importance"] = tidy["color"].map(df.columns.to_list().index)
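Put together, the melt-then-map approach might look like the sketch below. It assumes the sorted frame is assigned back to `df` before melting. The second mapping, which works from the melted frame alone, is an illustrative variant rather than part of the original answer, and the column name `importance_from_tidy` is made up for this example.

```python
import pandas as pd

df = pd.DataFrame([
    {'Orange': 0, 'Green': 2, 'Blue': 1},
    {'Orange': 2, 'Green': 4, 'Blue': 4},
    {'Orange': 1, 'Green': 3, 'Blue': 10},
])

# Sort columns by their maximum, largest first, and keep the result.
df = df.loc[:, df.max().sort_values(ascending=False).index]

tidy = pd.melt(df.reset_index(), id_vars=['index'],
               var_name='color', value_name='number')

# Each color's position in the sorted column list is its importance.
order = df.columns.to_list()
tidy['importance'] = tidy['color'].map(order.index)

# Variant that needs only the melted frame: rank the per-color maxima.
per_color_max = tidy.groupby('color')['number'].max()
rank = (per_color_max.rank(method='dense', ascending=False) - 1).astype(int)
tidy['importance_from_tidy'] = tidy['color'].map(rank)

print(tidy)
```

For this sample data both columns agree; they would differ only when two colors share the same maximum, since the dense rank gives ties the same value while the list position does not.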